Knowledge-Based Systems 81 (2015) 56–64


Computer-aided diagnosis of diabetic subjects by heart rate variability
signals using discrete wavelet transform method
U. Rajendra Acharya a,b, Vidya K. Sudarshan a,⇑, Dhanjoo N. Ghista c, Wei Jie Eugene Lim a, Filippo Molinari d,
Meena Sankaranarayanan e
a Department of Electronics and Computer Engineering, Ngee Ann Polytechnic, Singapore 599489, Singapore
b Department of Biomedical Engineering, Faculty of Engineering, University of Malaya, Malaysia
c University 2020 Foundation, MA, USA
d Biolab, Department of Electronics and Telecommunications, Politecnico di Torino, Torino, Italy
e Department of Mathematics, Anand Institute of Higher Technology, Kazhipattur, Chennai 603 103, India

Article info

Article history:
Received 9 June 2014
Received in revised form 3 February 2015
Accepted 5 February 2015
Available online 12 February 2015
Keywords:
Diabetes
HRV
Classifier
DWT
Feature extraction
Feature ranking

Abstract
Diabetes Mellitus (DM), a chronic lifelong condition, is characterized by increased blood sugar levels. As
there is no cure for DM, the major focus lies on controlling the disease. Therefore, DM diagnosis and
treatment are of great importance. The most common complications of DM include retinopathy, neuropathy,
nephropathy and cardiomyopathy. Diabetes causes cardiovascular autonomic neuropathy that affects
the Heart Rate Variability (HRV). Hence, in the absence of other causes, HRV analysis can be used
to diagnose diabetes. The present work aims at developing an automated system for classifying normal
and diabetic classes by using the heart rate (HR) information extracted from Electrocardiogram
(ECG) signals. Spectral analysis of HRV identifies patients with autonomic diabetic neuropathy, and
gives an earlier diagnosis of impairment of the Autonomic Nervous System (ANS). The HRV spectral indices
obtained by using the Discrete Wavelet Transform (DWT) method correlate significantly with the impaired
ANS. Herein, in order to diagnose and detect DM automatically, we have performed
DWT decomposition up to 5 levels, and extracted the energy, sample entropy, approximate entropy,
kurtosis and skewness features at various detailed coefficient levels of the DWT. We have also extracted
relative wavelet energy and entropy features up to the 5th level of DWT coefficients obtained from
HR signals. These features are ranked by using various ranking methods, namely, Bhattacharyya space
algorithm, t-test, Wilcoxon test, Receiver Operating Curve (ROC) and entropy.
The ranked features are then fed into different classifiers, which include Decision Tree (DT), K-Nearest
Neighbor (KNN), Naïve Bayes (NBC) and Support Vector Machine (SVM). Our results show maximum
diagnostic differentiation performance by using a minimum number of features. With our system,
we have obtained an average accuracy of 92.02%, sensitivity of 92.59% and specificity of 91.46%, by using
the DT classifier with ten-fold cross validation.
© 2015 Elsevier B.V. All rights reserved.

1. Introduction
According to the International Diabetes Federation (IDF), it is
estimated that in 2013 a total of 381 million people were diagnosed
with diabetes across the globe, of whom 23 million were
from Southeast Asian countries [26]. Due to lack of finance or access
to healthcare, most people around the world are unaware that they may be suffering from diabetes [26].
Statistics show that around 1.9 million people are diagnosed with diabetes in the USA
every year and 79 million have pre-diabetic conditions [7]. By 2030,
the number of diabetic subjects is estimated to almost double
(prevalence of 2.8% in 2000 and 4.4% in 2030), as its incidence is increasing rapidly every year
(Sarah et al. [47]). Diabetes and its complications have
a notable impact on individuals, families, health systems and countries' economies. The USA alone
spends around $245 billion annually on diagnosed diabetes patients. It is predicted that by 2050,
1 in 3 American adults may have diabetes if
the current trend continues [7,10].

⇑ Corresponding author. Tel.: +65 64608393.
E-mail address: [email protected] (K.S. Vidya).
http://dx.doi.org/10.1016/j.knosys.2015.02.005
0950-7051/© 2015 Elsevier B.V. All rights reserved.

Diabetes mellitus (DM) is a condition defined by a hyperglycemic state (elevated blood glucose level),
which in turn leads to microvascular and macrovascular damage [60]. Since
finding a cure for DM is difficult, emphasis is laid
on its early diagnosis. In this regard, it is well known that a
person with diabetes exhibits autonomic neuropathy (AN), i.e., damage
to the nervous system, or cardiovascular autonomic neuropathy
(CAN), a well-known complication of DM that affects the central
and peripheral vascular systems and causes abnormalities in the
heart rate signal [1]. Thus, diabetes can also be diagnosed by
studying the heart rate variability.
Concerning heart rate variability, the heart rate (HR), a
non-stationary/nonlinear signal, is obtained by calculating the time
elapsed between two ventricular contractions, i.e., the time between
two consecutive R-waves (R–R interval) on the ECG signal [27].
HR Variability (HRV) analysis is one of the reliable methods for
quantifying physiological dysfunction in terms of the condition of the
sympathetic and parasympathetic nervous systems [6,25]. The analysis of HRV enables us to evaluate
overall cardiac health in terms
of the heart rate regulation, based on the status of the autonomic
nervous system responsible for regulating cardiac activity [37].
Spectral analysis of the short-term HRV enables quantitative
evaluation of the neurologic oscillations, and delivers values for
neural regulation of heart rate [9,51,34]. The spectral analysis of
HRV (spectral parameters like the power spectrum of HRV signal)
recognizes patients with autonomic diabetic neuropathy, and gives
an earlier diagnosis of impairment of the autonomic nervous
system (ANS) [15,21]. Significant correlations are observed
between impaired ANS and the HRV indices obtained by spectral
analyses using nonparametric and parametric methods, namely the
fast Fourier Transform (FFT) and the autoregressive (AR) method,
respectively [12]. Time–frequency domain analysis of HRV makes
it easier to quantify the ANS activity in DM subjects [48]. Even
though autonomic functions are better assessed by using frequency domain features, the accuracy of
spectral power estimates is limited
by a low signal-to-noise ratio [6].
Nonlinear dynamic techniques are used in HRV signal analysis
to circumvent the limitations of time and frequency domain analysis [4]. Nonlinear methods are needed for the analysis of nonlinear
signals and systems [35]. The non-linear methods have been
applied in HRV analysis [50,30] to predict diabetes [14,5,20] and
cardiovascular disease (CVD) [23]. Nonlinear techniques can be
coupled to frequency analysis techniques. Among all of these
techniques, the DWT has the advantage of providing multiple
resolutions. This method provides discrimination between two
different signals with the same spectrum magnitude, thus distinguishing the subtle changes in the signals [17,2,56].
In normal and diabetic subjects, the HRV signal has been used to
study and measure the activity and symptoms of the cardiac
parasympathetic nervous system [41]. That study reported that
diabetic subjects exhibit diminished cardiac parasympathetic
activity before the appearance of autonomic neuropathy symptoms. Several studies (Table 3) have
reported that diabetic patients are characterized by reduced HRV, but they provide less information
about HRV across the spectrum of blood glucose levels. In
2000, Singh et al. [52] studied the correlation between hyperglycemia (increased blood glucose level)
and reduced HRV. They
reported reduced HRV variables in DM subjects and in subjects
with impaired blood (plasma) glucose levels by using time domain
features.
Awdah et al. [11] studied diabetic subjects with and without
autonomic neuropathy by using the time domain analysis of
HRV. Their results showed a significant decrease in all the time
domain measures for diabetic subjects with and without diabetic
neuropathy compared to the control class. In 2005, Flynn et al.
[22] used detrended fluctuation analysis (DFA) to study the HRV
changes over short-term ECG recordings of 20 min. Their study
reported reduced values of HRV for diabetic subjects. Chemla
et al. [16] used FFT and autoregressive (AR) methods to study the HRV
spectral components in diabetic patients. They found that diabetic
subjects exhibit decreased spectral values, and that the FFT method is
more suitable for evaluation of short-term HRV spectral components in diabetic subjects.
Analysis of the RR interval in the time and frequency domains has
been carried out by Ahamed Seyd et al. [8] to quantify the activity of the autonomic nervous system
(ANS) in DM patients. Significant differences in high frequency (HF) power, very low frequency (VLF)
power and low frequency (LF) power were noted between DM patients
and normal classes in the frequency domain analysis of the extracted
data (NN interval, i.e., normal-to-normal interval). This study also
observed significant differences in the time domain between the DM
and control groups, both in the root
mean square of successive NN interval differences (RMSSD) and in
the standard deviation of NN intervals (SDNN).
Multiscale entropy (MSE) analysis has also been used
to diagnose autonomic dysregulation in DM patients by
Trunkvalterova et al. [58]. Their study analyzed the
heart rate (HR) signal and the systolic and diastolic blood pressure (SBP
and DBP) signals in both normal and diabetic subjects, to evaluate
SampEn and linear measures. They reported that in young
patients with DM, changes in cardiovascular control were
detected by the MSE analysis of SBP and DBP oscillations and HR
signals. The sex-specific relationship between HRV and duration of type 2 diabetes was studied by
Nolan et al. [38]. That study reported an inverse relationship between the
duration of Type 1 and Type 2 diabetes and HRV measures among
male subjects only. An inverse association of HRV with increasing
age at diabetes diagnosis, as well as with increasing severity of coronary
heart disease risk and obesity, was observed in female subjects.
Then in 2012, Faust et al. [20] used time domain, frequency domain
and nonlinear methods to study the HRV signals of both diabetic
and normal subjects; they proposed unique ranges for various
features of the two classes. HRV parameters in diabetic and
non-diabetic patients with renal transplantation have been investigated in the time and frequency
domains by Kirvela et al. [31]; their
results highlighted that, in patients with end-stage diabetic neuropathy,
severe impairment of HRV is caused mainly by autonomic neuropathy
and partly by co-existing heart disease.
Recently, a novel Diabetic Integrated Index (DII) has been developed by Acharya et al. [3], by using
nonlinear parameters extracted from the HRV signal. The DII is a single number that can distinguish
the two classes. They also
reported that the AdaBoost classifier yielded a classification
accuracy of 86% for the two classes (normal and diabetic). In the same
research group, Swapna et al. [54] used Higher Order Spectra (HOS) features to classify diabetic
patients from normal subjects; their
method reported the highest accuracy, sensitivity and specificity
of 90.5%, 85.7% and 95.2% respectively, by using a Gaussian mixture
model classifier. The magnitude plots of the HOS bispectrum
obtained from HRV signals have been subjected to principal
component analysis for feature reduction [28]. These principal
components with an SVM classifier yielded an accuracy of 79.93%.
In contrast, Acharya et al. [5] reported 90% accuracy, 92.5% sensitivity and 88.7% specificity with
an AdaBoost classifier coupled
with four nonlinear features. Pachori et al. [42] (in press) proposed
a new nonlinear method based on Empirical Mode Decomposition
(EMD) to discriminate between normal and diabetic RR-interval
signals. In their method, EMD decomposes the RR-interval
signal into intrinsic mode functions (IMFs), from which five features (derived from the Fourier–Bessel
series expansion, amplitude modulation bandwidth, frequency modulation bandwidth, analytic signal
representation and second order
difference plot) are extracted. Their results show that the
extracted features exhibit statistically significant differences
between normal and diabetic classes.
In our present work, in order to automatically diagnose and
detect DM, we have performed DWT decomposition up to 5 levels
and have extracted the energy, sample entropy, approximate
entropy, kurtosis and skewness features at various detailed coefficient levels of the DWT. Fig. 1 shows
an overview of our proposed
methodology for diabetic HR signal classification. In the off-line
system, normal and diabetic RR signal data are analyzed by
DWT, performed up to 5 levels of decomposition. Energy, sample
entropy, approximate entropy, kurtosis and skewness features
are extracted from each level of the detailed coefficients of the
DWT. Then, these features are ranked by using the Bhattacharyya
space algorithm, t-test, Wilcoxon test, Receiver Operating Curve
(ROC), and entropy method. The ranked features are fed to the DT,
KNN, NBC and SVM classifiers to obtain the highest classification
performance using the minimum number of features. In the on-line
system, decomposition up to five levels is performed by using
the DWT method and the features (energy, ApEn, SampEn, kurtosis,
and skewness) are extracted. These features are ranked and fed
to the selected classifiers for automated classification as normal
or DM.
The flow of the paper is as follows. Section 2 delineates (i) the
data acquisition process and pre-processing, (ii) feature extraction
method and feature ranking methods, and (iii) classification. The
results of this novel diagnostic system are presented in Section 3.
The discussion of the results is carried out in Section 4, and conclusion is provided in Section 5.
2. Methods for HRV analysis
2.1. Data acquisition/pre-processing
The electrocardiogram (ECG) signals were acquired from 30
subjects (15 subjects with DM and 15 healthy subjects) in a relaxed
supine position for 60 min. The ECG recordings were performed by
using BIOPAC™ (Aero Camino Goleta, CA, USA) equipment, and the
AcqKnowledge software built into the equipment was used to convert the
recordings into heart rate time series. Fig. 2 shows the RR signals
of normal and DM patients. The ECG sampling rate was kept
at 500 Hz. A total of 81 datasets from 15 diabetic subjects (10 male
and 5 female) and 82 datasets from 15 normal subjects (8 male and
7 female) were used in this study, with each dataset having 1000
samples. All the subjects were informed about the aim of the
study and signed an informed consent form before being examined.
The study was approved by the Kasturba Medical
Hospital, Manipal, India. A band-reject filter with a center frequency of 50 Hz was used to remove the
power-line interference noise.
R peaks were detected using the Pan and Tompkins algorithm [40].
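
As a minimal illustration of the last step above, the sketch below converts R-peak positions into RR intervals and instantaneous heart rate at the 500 Hz sampling rate stated in this section. The function names and the peak positions are hypothetical and are not taken from the authors' software; the R-peak detection itself is assumed to have been done already.

```python
def rr_intervals(r_peak_samples, fs=500.0):
    """RR intervals in seconds from R-peak positions given as sample indices."""
    return [(b - a) / fs for a, b in zip(r_peak_samples, r_peak_samples[1:])]


def heart_rate_bpm(rr_seconds):
    """Instantaneous heart rate (beats per minute) for each RR interval."""
    return [60.0 / rr for rr in rr_seconds]


# Hypothetical R-peak positions (sample indices at 500 Hz), for illustration only.
r_peaks = [0, 400, 810, 1200]
rr = rr_intervals(r_peaks)   # 0.80 s, 0.82 s and 0.78 s
hr = heart_rate_bpm(rr)      # roughly 75, 73 and 77 beats per minute
```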

2.2. Feature extraction
Feature extraction is a crucial step in biomedical signal analysis and interpretation. We have
performed DWT on the
HR signals up to five levels, and extracted the features Energy (E),
Approximate Entropy (ApEn), Sample Entropy (SampEn), Kurtosis
(Kur) and Skewness (Skw) from the different levels of DWT
coefficients. The DWT method and the extracted features are
described briefly in the following sections.

2.2.1. Discrete Wavelet Transform (DWT)
The DWT transforms the signal from the time domain to the wavelet
domain and delivers different coefficient values. In the DWT, the
given heart rate signal is passed through a high pass and a low pass
filter. Once filtering is done, half of the samples are eliminated, as the
output is sub-sampled by 2. This is the first level of decomposition. Then
the low pass filter coefficients are again subjected to low pass and high
pass filtering, and this procedure is repeated for the further levels
of decomposition. At each level, the number of samples and the
frequency band are halved [55]. This converts a signal into low
pass (approximate) coefficients and high pass (detailed) coefficients.
In this work, we have used the db8 mother wavelet function [17]. We
performed DWT on HR signals up to five levels, and then extracted
the features energy, ApEn, SampEn, kurtosis, and skewness.
In this work, A5 denotes the fifth level approximate coefficients
and D1–D5 correspond to the first to fifth level detailed coefficients.
Fig. 3 shows the DWT performed on RR interval signals of normal
and DM patients.
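
The filter-and-downsample cascade described above can be sketched in a few lines. Note that this sketch swaps in the Haar wavelet for the db8 wavelet used in this work, purely because its two-tap filters keep the code short; the function names and the synthetic input signal are assumptions made for illustration.

```python
import math


def haar_step(signal):
    """One DWT level: low/high pass filtering followed by sub-sampling by 2.

    With the Haar wavelet both filters have two taps, so each output sample
    depends on one non-overlapping pair of input samples. A trailing sample
    of an odd-length input is dropped (a simplification of this sketch).
    """
    s = 1.0 / math.sqrt(2.0)
    approx, detail = [], []
    for a, b in zip(signal[0::2], signal[1::2]):
        approx.append((a + b) * s)   # low pass  -> approximate coefficients
        detail.append((a - b) * s)   # high pass -> detailed coefficients
    return approx, detail


def dwt(signal, levels=5):
    """Return (A_levels, [D1, ..., D_levels]) by re-filtering the low pass output."""
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, d = haar_step(approx)
        details.append(d)
    return approx, details


# Synthetic stand-in for an RR series (1024 samples so every level halves cleanly).
signal = [0.8 + 0.05 * math.sin(2.0 * math.pi * i / 64.0) for i in range(1024)]
a5, details = dwt(signal, levels=5)
```

Each level halves the number of coefficients, so D1 to D5 here hold 512, 256, 128, 64 and 32 values and A5 holds 32. A longer wavelet such as db8 produces slightly different lengths because of boundary handling.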

Fig. 1. Proposed system.

Fig. 2. Typical RR interval signals: (a) normal subject and (b) diabetic subject.

2.2.2. Energy (E)
It is the sum of the squared DWT coefficients of the heart rate signal in a given sub-band.
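
A minimal sketch of the sub-band energy, together with the relative wavelet energy mentioned in the abstract, is given below. It assumes that energy is summed over all coefficients of a sub-band and that relative energy is the share of each sub-band in the total; the coefficient values and names are hypothetical.

```python
def band_energy(coeffs):
    """Energy of one sub-band: sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def relative_energies(bands):
    """Share of the total energy carried by each sub-band (values sum to 1)."""
    energies = {name: band_energy(c) for name, c in bands.items()}
    total = sum(energies.values())
    return {name: e / total for name, e in energies.items()}


# Hypothetical coefficients for a two-level decomposition, for illustration only.
bands = {"D1": [0.1, -0.2], "D2": [0.3], "A2": [1.0]}
rel = relative_energies(bands)
```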
2.2.3. Approximate Entropy (ApEn)
It is a method used to quantify the amount of regularity and
unpredictability of signal variations [43]. This regularity statistic
has potential application in ECG and heart rate time-series
analysis [44]. A signal varying rhythmically has a small ApEn and vice
versa. Herein, we have used the ApEn formula proposed by Pincus
et al. [45].
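
The sketch below follows the usual textbook form of the Pincus definition: templates of length m are compared with a tolerance r, self-matches are counted, and ApEn is the difference of the averaged log-frequencies for lengths m and m + 1. The embedding dimension m = 2 and tolerance r = 0.2 times the standard deviation are common defaults assumed here; this section does not state the values used by the authors.

```python
import math
import random
from statistics import pstdev


def _phi(x, m, r):
    """Average log-frequency of templates of length m that match within r."""
    n = len(x) - m + 1
    templates = [x[i:i + m] for i in range(n)]
    total = 0.0
    for t1 in templates:
        # Chebyshev distance; the self-match keeps the count at least 1.
        count = sum(1 for t2 in templates
                    if max(abs(a - b) for a, b in zip(t1, t2)) <= r)
        total += math.log(count / n)
    return total / n


def approximate_entropy(x, m=2, r=None):
    """ApEn(m, r) = phi(m) - phi(m + 1); r defaults to 0.2 * SD (assumed)."""
    if r is None:
        r = 0.2 * pstdev(x)
    return _phi(x, m, r) - _phi(x, m + 1, r)


rng = random.Random(0)
regular = [1.0, 2.0] * 50                        # strictly rhythmic signal
irregular = [rng.random() for _ in range(100)]   # pseudo-random signal
apen_regular = approximate_entropy(regular)
apen_irregular = approximate_entropy(irregular)
```

On these two synthetic signals the rhythmic one yields an ApEn close to zero and the pseudo-random one a clearly larger value, in line with the statement above.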
2.2.4. Sample Entropy (SampEn)
It is a modification of approximate entropy used for the assessment of complexity and regularity of
physiological time-series
[57]. Unlike ApEn, SampEn is largely independent of data length and
performs consistently well. A signal with more repeating patterns
will have a small SampEn and vice versa.
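
A compact sketch of SampEn in its commonly used form is shown below: self-matches are excluded and the result is the negative logarithm of the ratio between the number of matching template pairs of length m + 1 and of length m. As before, m = 2 and r = 0.2 times the standard deviation are assumed defaults, not values taken from this paper.

```python
import math
import random
from statistics import pstdev


def sample_entropy(x, m=2, r=None):
    """SampEn(m, r) = -ln(A / B), with self-matches excluded."""
    if r is None:
        r = 0.2 * pstdev(x)
    n = len(x)

    def matching_pairs(length):
        # The same number of templates (n - m) is used for both lengths.
        templates = [x[i:i + length] for i in range(n - m)]
        pairs = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                if max(abs(a - b) for a, b in zip(templates[i], templates[j])) <= r:
                    pairs += 1
        return pairs

    b = matching_pairs(m)        # matches of length m
    a = matching_pairs(m + 1)    # matches of length m + 1
    if a == 0 or b == 0:
        return float("inf")      # undefined when no matches are found
    return -math.log(a / b)


rng = random.Random(0)
regular = [1.0, 2.0] * 50
irregular = [rng.random() for _ in range(100)]
sampen_regular = sample_entropy(regular)
sampen_irregular = sample_entropy(irregular)
```

For the strictly alternating signal every match of length m extends to length m + 1, so A equals B and SampEn is zero; the pseudo-random signal gives a positive value.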
2.2.5. Kurtosis and skewness
These two values are used to assess the probability distribution of the signal series [46]. Kurtosis
indicates whether the data
are peaked or flat relative to the normal distribution. Skewness
measures the asymmetry of the distribution. The kurtosis
(Kur) and skewness (Skw) are defined as

\[ \mathrm{Kur} = \frac{E\left[(X - \mu)^4\right]}{\sigma^4} \qquad (1) \]

\[ \mathrm{Skw} = \frac{E\left[(X - \mu)^3\right]}{\sigma^3} \qquad (2) \]

where X denotes the signal values (treated as a random variable), μ is the mean
value of the data set, and σ represents the standard deviation of
the data set.

Fig. 3. Typical DWT plots of RR interval signals: (a) normal and (b) diabetic subject.

2.3. Feature ranking
Ranking methods are among the fastest approaches to the feature
selection problem. Feature ranking is used to select a subset of
features, which reduces the classifier's complexity without
degrading its performance. In our work, different feature ranking methods, namely the Bhattacharyya
space algorithm, t-test, Wilcoxon test, Receiver Operating Curve (ROC),
and entropy, are used to rank the significant features. These feature
ranking methods are briefly explained below.
2.3.1. Bhattacharyya method
In this method, the features are ranked according to their ability
to discriminate the training data. The Bhattacharyya ranking method
yields a single evaluation route, thereby reducing the number of
classifications needed as features are added one at a time [29].
2.3.2. t-test
The Student's t-test is used to determine whether the
means of two sets differ [13]. The test gives the
p-value and t-value of each extracted feature for the two groups
of data. Statistically, a low p-value is preferred (p < 0.05), and the higher
the t-value, the better the ranking. Hence in this work, the low p-value
features are selected and the t-values are used to rank them.
2.3.3. Wilcoxon test
It assesses the difference between two related samples. This is
a paired test that is suitable for comparing two different measurement sets made on the same data [61].
2.3.4. Receiver Operating Curve (ROC) method
In this method, the sensitivity and specificity of a diagnostic test
are evaluated at different threshold values to obtain the ROC curve,
which is plotted as sensitivity versus 1-specificity. A test that perfectly discriminates between the
two groups would yield a curve that passes through the upper left corner of the plot;
by determining the area under the curve, the soundness of a test
can be assessed. In practice, the area varies between 0.5 and 1; the closer
the area is to 1, the better the test, and the closer it is to 0.5, the worse the test [39].
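
The area under the ROC curve for a single feature can be computed without drawing the curve, because it equals the fraction of (diabetic, normal) value pairs that the feature orders correctly, with ties counted as one half. The sketch below uses that pairwise identity; the function names and sample values are hypothetical, and folding the area into the range 0.5 to 1 is an assumption about how a ranking score would be derived.

```python
def auc(group_a, group_b):
    """Area under the ROC curve of a feature that should be larger in group_a."""
    wins = 0.0
    for a in group_a:
        for b in group_b:
            if a > b:
                wins += 1.0
            elif a == b:
                wins += 0.5   # ties count as half a correct ordering
    return wins / (len(group_a) * len(group_b))


def separability(group_a, group_b):
    """Direction-free score in [0.5, 1]: a feature may be lower or higher in disease."""
    area = auc(group_a, group_b)
    return max(area, 1.0 - area)


# Hypothetical feature values for two groups, for illustration only.
area = auc([3.0, 4.0, 5.0], [1.0, 2.0, 3.0])
score = separability([1.0, 2.0], [3.0, 4.0])
```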
2.3.5. Entropy based test
This method is based on the fact that entropy is lower for orderly layout and higher for disorderly layout. In this method, the features are ranked in descending order of relevance, by finding the
descending order of the entropies after removing each feature
one at a time [18].
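
Of the ranking criteria in this section, the t-value is the easiest to reproduce from summary statistics. The sketch below assumes the pooled-variance (Student) two-sample form and ranks features by the absolute t-value; the function names are hypothetical. Applied to the ApEn_D1 row of Table 1 with 82 normal and 81 diabetic datasets, it gives a t-value of about 4.107, which agrees with the tabulated value to the precision of the rounded inputs. Obtaining p-values additionally requires the t-distribution, for example from a statistics library.

```python
import math
from statistics import mean, stdev


def t_from_summary(mean_a, sd_a, n_a, mean_b, sd_b, n_b):
    """Pooled-variance two-sample t-value from means, SDs and group sizes."""
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / (n_a + n_b - 2)
    return (mean_a - mean_b) / math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))


def t_value(a, b):
    """Same statistic computed from the raw samples of the two groups."""
    return t_from_summary(mean(a), stdev(a), len(a), mean(b), stdev(b), len(b))


def rank_features(normal, diabetic):
    """Feature names sorted by decreasing |t| (dicts map name -> list of values)."""
    scores = {name: abs(t_value(normal[name], diabetic[name])) for name in normal}
    return sorted(scores, key=scores.get, reverse=True)


# ApEn_D1 row of Table 1: normal (mean, SD), diabetic (mean, SD), 82 and 81 datasets.
t_apen_d1 = t_from_summary(0.878684, 0.09963, 82, 0.766539, 0.226049, 81)
```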

2.4. Classification
In our work, we have used the ten-fold cross validation method to
evaluate the classifiers [2]. Our main objective is to obtain the best
classification accuracy by using the minimum number of ranked
features, and to identify the best classifier. In this method, the whole
dataset is first divided into 10 approximately equal parts, with
nine parts (147 data files) being used for training the classifier,
followed by using the trained classifier on the one remaining part
(16 data files) to evaluate its performance. This whole process is
repeated 10 times by taking different parts for the training and testing
datasets. The classifier performance is measured by using the
average value of the ten folds. The different classifiers used in
our study are explained below.
2.4.1. Decision Tree (DT)
This classifier uses the significant features from the training
data to construct a tree [33]. The two classes are defined by using
the rules extracted from the constructed tree. Then the class of the
test data is determined using these rules. The main advantage of
this classifier is its ability to break down a complex decision-making process into a collection of
simpler decisions, thereby
providing a solution which is often easier to interpret. There may
be difficulties involved in designing an optimal DT classifier. The
performance of a DT classifier strongly depends on how well the
tree is designed.

2.4.2. K-Nearest Neighbor (KNN)
It is a simple classifier that finds the k nearest neighbors of a test sample
by using the minimum distance between the testing and training data
[32]. The test sample is assigned the class that is most common among its k nearest neighbors.
This classifier has poor run-time performance when the training set is large. In this work, we have used
k = 3.
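
The following sketch combines a 3-nearest-neighbor classifier with the ten-fold evaluation protocol of Section 2.4. It assumes Euclidean distance and an interleaved assignment of samples to folds, neither of which is specified in this paper, and it runs on two synthetic, well separated clusters rather than on the HRV features.

```python
import math


def knn_predict(train_x, train_y, query, k=3):
    """Majority class among the k training points closest to the query."""
    nearest = sorted((math.dist(x, query), y) for x, y in zip(train_x, train_y))
    votes = [label for _, label in nearest[:k]]
    return max(set(votes), key=votes.count)


def k_fold_indices(n, n_folds=10):
    """Split indices 0..n-1 into folds whose sizes differ by at most one."""
    return [list(range(n))[i::n_folds] for i in range(n_folds)]


def cross_validate(xs, ys, n_folds=10, k=3):
    """Average test accuracy over the folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(xs), n_folds):
        held_out = set(test_idx)
        train_x = [x for i, x in enumerate(xs) if i not in held_out]
        train_y = [y for i, y in enumerate(ys) if i not in held_out]
        correct = sum(knn_predict(train_x, train_y, xs[i], k) == ys[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Two synthetic clusters standing in for the normal (0) and diabetic (1) classes.
xs = [(0.01 * i, 0.0) for i in range(20)] + [(5.0 + 0.01 * i, 5.0) for i in range(20)]
ys = [0] * 20 + [1] * 20
mean_accuracy = cross_validate(xs, ys)
```

With the 163 datasets of this study, such a split gives folds of 16 or 17 files, which is consistent with the 147 and 16 files quoted in Section 2.4.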
2.4.3. Naive Bayes Classifier (NBC)
It is a probabilistic classifier which works on the principle of
Bayes' theorem, and on the assumption that the features are
independent random variables [24]. The main advantage of this
classifier is that it requires a small amount of training data to estimate the parameters (means and
variances of the variables)
required for classification. The most important downside of this
classifier is its strong feature independence assumption.
2.4.4. Support Vector Machine (SVM)
It is one of the most widely used classifiers, which constructs a
separating hyper-plane in a feature space that separates the
training data into two classes [19]. If the data are not linearly separable,
kernel functions are used to map the original input
data to a higher dimensional feature space where the features
might become linearly separable. In this work, we have used polynomial
kernel functions of order 1, 2 and 3 and radial basis function
(RBF) kernels. We have used the Least Squares SVM (LS-SVM) in this
work [53]. The biggest advantages of SVM are that it mitigates the curse
of dimensionality and avoids the local minima that affect traditional machine learning
methods. It also generalizes well on small sample size problems. The biggest limitation
of the SVM lies in the choice of the kernel, and the most serious one
from a practical point of view is the high algorithmic complexity
and extensive memory requirements.
3. Results
In our work, we have extracted a total of twenty-six features
from HRV signals by using the DWT method. Table 1 shows the
results of statistical analysis. The results of automated detection
and classification of HRV signals of DM subjects are tabulated in
Table 2. Ten-fold cross validation has been performed on the
features ranked by the different ranking methods; the resulting
average accuracy of 92.02%, sensitivity of 92.59% and
specificity of 91.46% are shown in Table 2.

Table 1
Range (Mean ± Standard Deviation) of features extracted from normal and diabetic RR interval signals.

Feature    | Normal mean | Normal SD | Diabetes mean | Diabetes SD | p-value  | t-value
ApEn_D1    | 0.878684    | 0.09963   | 0.766539      | 0.226049    | 6.37E-05 | 4.106849
Kur_D3     | 0.050443    | 0.049304  | 0.14026       | 0.205666    | 0.000174 | 3.8445
SampEn_A5  | 0.414618    | 0.141522  | 0.348454      | 0.106587    | 0.000946 | 3.368416
Kur_D2     | 0.027187    | 0.042692  | 0.098319      | 0.19369     | 0.00142  | 3.246782
Kur_D1     | 0.021703    | 0.052228  | 0.079683      | 0.157082    | 0.001826 | 3.169823
ApEn_D3    | 0.850749    | 0.084797  | 0.785439      | 0.168723    | 0.002089 | 3.128092
ApEn_D2    | 0.859135    | 0.108898  | 0.786305      | 0.214654    | 0.006906 | 2.736556
Kur_D4     | 0.097679    | 0.090308  | 0.150161      | 0.166174    | 0.013084 | 2.509336
ApEn_A5    | 0.665725    | 0.172749  | 0.609154      | 0.147757    | 0.026095 | 2.245531
SampEn_D1  | 0.712377    | 0.113272  | 0.646319      | 0.251198    | 0.031581 | 2.168592
Skw_D2     | 0.397354    | 0.039024  | 0.428037      | 0.138865    | 0.055938 | 1.92543
ApEn_D4    | 0.731534    | 0.147349  | 0.688021      | 0.191352    | 0.105526 | 1.627786
E_D1       | 0.002803    | 0.003401  | 0.019003      | 0.113083    | 0.196578 | 1.296734
Skw_A5     | 0.503106    | 0.121693  | 0.528192      | 0.143809    | 0.23085  | 1.202721
ApEn_D5    | 0.667284    | 0.158245  | 0.637826      | 0.183521    | 0.273877 | 1.097925
E_A5       | 7.58E-05    | 2.93E-05  | 0.012382      | 0.111107    | 0.317341 | 1.003052
E_D5       | 0.000253    | 0.000313  | 0.01238       | 0.111107    | 0.324434 | 0.988411
Skw_D1     | 0.636675    | 0.052334  | 0.620989      | 0.133937    | 0.325119 | 0.987009
E_D4       | 0.000343    | 0.00035   | 0.012399      | 0.111105    | 0.327228 | 0.982703
E_D2       | 0.002461    | 0.002364  | 0.014506      | 0.111103    | 0.32778  | 0.981578
Skw_D3     | 0.424324    | 0.042139  | 0.438184      | 0.121263    | 0.330019 | 0.977032
E_D3       | 0.001619    | 0.001354  | 0.01256       | 0.111088    | 0.37381  | 0.891839
Skw_D4     | 0.391949    | 0.066628  | 0.380989      | 0.12243     | 0.478117 | 0.710994
Skw_D5     | 0.436783    | 0.188226  | 0.455444      | 0.163141    | 0.499988 | 0.676036
Kur_D5     | 0.193536    | 0.179643  | 0.204018      | 0.163548    | 0.697491 | 0.389405
Kur_A5     | 0.145777    | 0.130321  | 0.151545      | 0.177971    | 0.813513 | 0.236283

Fig. 4 shows the plot of accuracy versus number of features for
various ranking methods. It clearly shows that the t-test method
yields the highest classification accuracy for 21 ranked features,
beyond which there is a drop in the accuracy level. Fig. 5 shows
the plot of average accuracy (%), sensitivity (%) and specificity (%)
versus different folds of ten-fold cross-validation for DT classifier.
It can be noted from Table 1 that all the entropies in the different levels of detailed coefficients
have decreased for the diabetic
class due to the decrease in variability. Also, the kurtosis and energy of the
detailed coefficients, and most of the skewness values, are higher for the
diabetic than for the normal class.
4. Discussion
In our work, we have developed an automated DM diagnostic
system by extracting the energy and entropy features of the first
five levels of detailed coefficients of DWT. Table 3 provides a summary of these works to discriminate DM automatically by using
HRV analysis to detect diabetes.
Our results show that the entropy features (namely the variables ApEn and SampEn of Table 1) are always statistically lower

Table 2
Results of classification by using various classifiers (features ranked using t-test method).
Classifiers

Features

TP

TN

FP

FN

Sensitivity (%)

Specificity (%)

Accuracy (%)

DT
KNN
NBC
SVM Polynomial 1
SVM Polynomial 2
SVM Polynomial 3

8
5
13
4
6
6

75
74
24
57
67
75

76
76
78
78
72
67

6
6
4
4
10
15

6
7
57
24
14
6

92.59
91.36
29.63
70.37
82.72
92.59

92.68
92.68
95.12
95.12
87.80
81.71

92.64
92.02
62.58
82.82
85.28
87.12

62

U. Rajendra Acharya et al. / Knowledge-Based Systems 81 (2015) 56–64

Table 3
Studies conducted to discriminate normal and diabetic subjects using HRV signals.
Authors

Methods

Features

Classifier,
number of
features

Performance

Pfeifer et al. [41]

Time domain

RR variations

Nil, one

Singh et al. [52]

Time and frequency
domain
Time domain
Time, Frequency
domain and nonlinear

SDNN, high and Low Frequency (LF) power, LF/HF

Nil, four

SDRR, NN50, RMSSD, pNN50%, etc
All features

Nil, Time

Supine HRV during a beta-adrenergic
blockade and deep respiratory rate can
effectively estimate parasympathetic
nervous activity in diabetic and control
subjects
LF power and LF/HF ratio were lower in
DM
Decreased with diabetes
Decreased with diabetes

Awdah et al. [11]
Faust et al. [20]

Nil, eight
domain:
seven
Freq domain:
three

Kirvela et al. [31]

Time and frequency
domain

All time domain and frequency domain features

Flynn et al. [22]

Detrended fluctuation
analysis
FFT and Autoregressive
spectral analysis
Time domain

Short range correlation (a1)

Nonlinear:
twenty-two
Nil, All time
and
frequency
domain
features
Nil, One

LF/HF ratio, LF(nu), and HF(nu)

Nil, Three

Short range correlation (a1) decreases for
diabetes subjects
Decreased value for diabetes subjects

SD, root mean square of successive differences in
normal-to-normal R-R intervals
All time domain and frequency domain features

Nil, Three

Decreased value for diabetes subjects

Nil, Time

All parameters reduced with diabetes

Chemla et al. [16]
Schroeder et al. [49]
Ahamed Seyd et al. [8]

Time and frequency
domain

domain: nine
Freq domain:
eleven
Nil, MSE

Trunkvalterova et al.
[58] | Nonlinear | Multiscale entropy (MSE) | | MSE was significantly reduced on scales 2 and 3 in DM
Nolan et al. [38] | Time and frequency domain | High frequency (HF) power, root mean square of successive differences between R–R intervals, total R–R variability | Nil, three | HRV measures are inversely related to the duration of Type 1 and Type 2 diabetes
Acharya et al. [3] | Nonlinear | RQA features, correlation dimension, long term variability | Perceptron AdaBoost, five | Diabetes Index, Accuracy: 86%, Sensitivity: 87.5%, Specificity: 84.6%
Swapna et al. [54] | HOS | Bispectrum moments, entropies and weighted centers | GMM, eight | Accuracy: 90.5%, Sensitivity: 85.7%, Specificity: 95.2%
Jian et al. [28] | HOS | PCA features | SVM | Accuracy: 79.93%
Acharya et al. [5] | Nonlinear | RQA features, ApEn | Least Squares AdaBoost, four | Accuracy: 90.0%, Sensitivity: 92.5%, Specificity: 88.7%
Pachori et al. [42] (in press) | Nonlinear | Mean frequency using Fourier–Bessel series expansion, two bandwidth parameters (amplitude modulation and frequency modulation bandwidths), Analytic Signal Representation (ASR) and Second Order Difference Plot (SODP) | Kruskal–Wallis statistical test, five | Features provide statistically significant difference between diabetic and normal classes
This work | DWT | Entropies, energy, skewness and kurtosis | DT, eight | Accuracy: 92.02%, Sensitivity: 92.59%, Specificity: 91.46%

Diminished HRV has been observed in diabetic autonomic neuropathy

for DM as compared to controls. This is in accordance with some very
recent studies that showed how diabetes reduced the entropy of
the EMG signals [59] and of the near-infrared signals measuring
muscle metabolism [36]. This decreased signal entropy, found
both in the electrical activation of the muscles and in muscle
metabolism, suggests that diabetes might alter the muscle fiber
conduction velocity and membrane functioning. We believe the entropy
of the signal is also a very important parameter when analyzing
the HRV signal, because it might directly reflect a neuromuscular
effect of DM.
This newly developed system has the following advantages:

(a) The developed software is repeatable and not prone to
inter- or intra-observer variability.
(b) This diagnostic tool will eliminate the need for repeated tests
to confirm DM, and thereby provide more reliable and
faster diagnosis.
(c) This method is highly effective in situations where a large
amount of data must be collected over long durations to understand
and identify the abnormality.
(d) Our method performed better than the rest of the techniques
reported in the above table.
(e) The proposed system is robust (evaluated with ten-fold stratified
cross-validation) and reduces the burden on clinicians.
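Item (e) refers to ten-fold stratified cross-validation, and the results are reported as accuracy, sensitivity and specificity. The following is a minimal sketch of both ideas in plain Python, not the implementation used in this work. The function names, the round-robin fold assignment, the label coding (1 = diabetic, 0 = normal) and the sample labels are illustrative assumptions.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign each sample index to one of k folds with similar class ratios."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    folds = [[] for _ in range(k)]
    offset = 0
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        # Deal indices round-robin, continuing where the previous class stopped
        for j, idx in enumerate(indices):
            folds[(offset + j) % k].append(idx)
        offset += len(indices)
    return folds


def diagnostic_metrics(y_true, y_pred, positive=1):
    """Return (accuracy, sensitivity, specificity) as percentages."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = 100.0 * (tp + tn) / len(pairs)
    sensitivity = 100.0 * tp / (tp + fn) if tp + fn else float("nan")
    specificity = 100.0 * tn / (tn + fp) if tn + fp else float("nan")
    return accuracy, sensitivity, specificity


# Hypothetical labels: 1 = diabetic, 0 = normal
labels = [1] * 30 + [0] * 20
folds = stratified_folds(labels, k=10)
print([len(fold) for fold in folds])
```

In each round, one fold would serve as the test set and the remaining nine as the training set; the three measures are then averaged over the ten rounds.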


Fig. 4. Plot of accuracy versus number of features for the various ranking methods.

Fig. 5. Plot of average accuracy (%), sensitivity (%) and specificity (%) versus different folds of ten-fold cross-validation for DT classifier.

5. Conclusion

Diabetes is identified as one of the most rapidly growing health concerns in rural and urban areas of developed and developing countries. Early intervention and continued treatment help to keep
diabetes under control. In this work, we have provided a tutorial on how diabetes is associated with cardiovascular autonomic
neuropathy, which affects HRV. Hence, we can detect diabetes by
carrying out HRV spectral analysis. We have presented an automated DM detection system that uses DWT features (energy
and entropy) extracted from the HRV signals. Using the presented
method, we obtained an accuracy, sensitivity and specificity
of 92.02%, 92.59% and 91.46%, respectively, with the DT classifier.
The proposed method can be further extended to develop a CAD
system that can assist clinicians in screening diabetic
subjects.
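To illustrate the kind of DWT energy and entropy features referred to above, the following minimal sketch decomposes an R–R interval series and computes one energy and one entropy value per sub-band. It is not the implementation used in this work: the Haar wavelet, the three decomposition levels, the Shannon entropy of the normalized coefficient energies, the function names and the sample values are all illustrative assumptions.

```python
import math


def haar_dwt(signal):
    """One level of the orthonormal Haar DWT: returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd-length input by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def energy(coeffs):
    """Sum of squared coefficients."""
    return sum(c * c for c in coeffs)


def shannon_entropy(coeffs):
    """Shannon entropy (bits) of the normalized coefficient energies."""
    total = energy(coeffs)
    if total == 0.0:
        return 0.0
    probs = [c * c / total for c in coeffs]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


def dwt_features(rr_intervals, levels=3):
    """Return {sub-band: (energy, entropy)} for D1..Dn and the final approximation An."""
    features = {}
    approx = list(rr_intervals)
    for level in range(1, levels + 1):
        approx, detail = haar_dwt(approx)
        features[f"D{level}"] = (energy(detail), shannon_entropy(detail))
    features[f"A{levels}"] = (energy(approx), shannon_entropy(approx))
    return features


# Hypothetical R-R intervals in seconds, for illustration only
rr = [0.80, 0.82, 0.79, 0.81, 0.80, 0.78, 0.83, 0.80]
for band, (e, h) in dwt_features(rr).items():
    print(band, round(e, 6), round(h, 4))
```

The resulting per-band values could then be ranked and passed to a classifier such as a decision tree. Because the Haar transform here is orthonormal, the band energies sum to the energy of the input when no padding is needed.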

References
[1] I.V. Aaron, E.M. Raelene, D.M. Braxton, R. Roy, Diabetic autonomic neuropathy,
Diabetes Care 26 (2003) 1553–1579.
[2] U.R. Acharya, S. Vinitha Sree, C.A. Ang Peng, S.S. Jasjit, Use of principal
component analysis for automatic classification of epileptic EEG activities in
wavelet framework, Expert Syst. Appl. 39 (2012) 9072–9078.
[3] U.R. Acharya, O. Faust, S. Vinitha Sree, D.N. Ghista, S. Dua, P. Joseph, A.V.I.
Thajudin, N. Janarthanan, T. Tamura, An integrated diabetic index using heart
rate variability signal features for diagnosis of diabetes, Comput. Method
Biomech. Biomed. Eng. 16 (2013) 222–234.
[4] U.R. Acharya, N. Kannathal, S.M. Krishna, Comprehensive analysis of cardiac
health using heart rate signals, Physiol. Meas. 25 (2004) 1139–1151.
[5] U.R. Acharya, O. Faust, N.A. Kadri, J.S. Suri, W. Yu, Automated identification of
normal and diabetes heart rate signals using nonlinear measures, Comput. Biol.
Med. 43 (10) (2013) 1523–1529.
[6] U.R. Acharya, K.P. Joseph, N. Kannathal, M.L. Choo, J.S. Suri, Heart rate
variability: a review, Med. Biol. Eng. Comput. 44 (2006) 1031–1051.
[7] American Diabetes Association (ADA) Fast Facts Data and statistics about
diabetes, 2013.
[8] P.T. Ahamed Seyd, T.V.I. Ahamed, J. Jeevamma, P.K. Joseph, Time and frequency
domain analysis of heart rate variability and their correlations in diabetes
mellitus, World Acad. Sci., Eng. Technol. 2 (2008) 583–586.
[9] S. Akselrod, D. Gordon, J.B. Madwed, D.C. Snidman, R.J. Cohen, Hemodynamic
regulation: investigation by spectral analysis, Am. J. Physiol. 249 (1985) 867–875.
[10] American Diabetes Association (ADA), Diagnosis and classification of diabetes
mellitus. Diabetes Care, vol. 27, 2004.
[11] A. Awdah, A. Nabil, S. Ahmad, Q. Reem, A. Khidir, Time-domain analysis of
heart rate variability in diabetic patients with and without autonomic
neuropathy, Ann. Saudi Med. 22 (2002) 5–6.
[12] P. Aurelien, R. Manuel, A.J. Sophie, B. Claire De, Anre Denjean, Spectral analysis
of heart rate variability interchangeability between autoregressive analysis
and fast Fourier transform, J. Electrocardiol. 39 (2006) 31–37.
[13] J.F. Box, Guinness, Gosset, Fisher, and small samples, Statist. Sci. 2 (1987) 45–52.
[14] Roy Bhaskar, G. Sobhendu, Nonlinear methods to assess changes in heart rate
variability in type 2 diabetic patients, Arq. Bras. Cardiol. (2013).
[15] S. Cerutti, A. Bianchi, B. Bontempi, G. Comi, Power spectrum analysis of heart
rate variability signal in the diagnosis of diabetic neuropathy, Proceedings of
the annual international conference of the IEEE engineering in medicine and
biology society 1 (1989) 12–13.


[16] D. Chemla, J. Young, F. Badilini, P. Maison-Blanche, H. Affres, Y. Lecarpentier, P.
Chanson, Comparison of fast Fourier transform and auto-regressive spectral
analysis for the study of heart rate variability in diabetic patients, Int. J.
Cardiol. 104 (3) (2005) 307–313.
[17] G. Donna, U.R. Acharya, J.M. Roshan, S. VinithaSree, T.C. Lim, A.V.I. Thajudin, J.S.
Suri, Automated diagnosis of coronary artery disease affected patients using
LDA, PCA, ICA and discrete wavelet transform, Knowledge Based Syst. 37
(2012) 274–282.
[18] M. Dash, H. Liu, Handling large unsupervised data via dimensionality
reduction. ACM SIGMOD Workshop on Research Issues in Data Mining and
Knowledge Discovery, 1999.
[19] E. Osuna Edgar, F. Robert, G. Federico, Support vector machines: training and
applications, technical report. MIT AI Lab. Centre for Biological and
Computational Learning, March 1997.
[20] O. Faust, U.R. Acharya, F. Molinari, S. Chattopadhyay, T. Tamura, Linear and
non-linear analysis of cardiac health in diabetic subjects, Biomed. Signal
Process. Control 7 (3) (2012) 295–302.
[21] Federico Bellavere, Italo Balzani, Giovanni De Masi, Maurizio Carraro, Pasquale
Carenza, Claudio Cobelli, Karl Thomaseth, Power spectral analysis of heart rate
variations improves assessment of diabetic cardiac autonomic neuropathy,
Diabetes 41 (1992) 633–640.
[22] A.C. Flynn, H.F. Jelinek, M. Smith, Heart rate variability analysis: a useful
assessment tool for diabetes associated cardiac dysfunction in rural and
remote areas, Aust. J. Rural Health 3 (2) (2005) 77–82.
[23] F.J. Herbert, M.I. Hasan, A.A. Hayder, H.K. Ahsan, Association of cardiovascular
risk using nonlinear heart rate variability measures with the Framingham risk
score in a rural population, Front. Physiol., Comput. Physiol. Med. (2013) 4.
[24] J. Han, M. Kamber, J. Pei, Data mining: Concepts and Techniques, Morgan
Kaufmann, Waltham, MA, 2005.
[25] Chu Duc Hoang Chu, Phan Kien Nguyen, Viet Dung Nguyen, A review of heart
rate variability and its applications, APCBEE Proc. 7 (2013) 80–85.
[26] International Diabetes Federation Diabetes Atlas, sixth ed., 2013.
[27] Constant Isabelle, Dominique Laude, Isabelle Murat, Jean-Luc Elghozi, Pulse
rate variability is not a surrogate for heart rate variability, Clin. Sci. 97 (1999)
391–397.
[28] Wei Jian Lee, Cheng Lim Teik, Automated detection of diabetes by means of
higher order spectral features obtained from heart rate signals, J. Med. Imaging
Health Inform. 3 (2013) 440–447.
[29] T. Kailath, The divergence and Bhattacharyya distance measures in signal
selection, IEEE Trans. Commun. Technol. 15 (1) (1967) 52–60.
[30] G. Kheder, A. Kachouri, R. Taleb, M.M. Ben, M. Samet, Feature extraction by
wavelet transforms to analyse the heart rate variability during two meditation
techniques, 6th WSEAS International Conference on Circuits, Systems,
Electronics, Control and Signal Processing, 2007.
[31] M. Kirvela, K. Salmela, L. Toivonen, A.M. Koivusalo, L. Lindgren, Heart rate
variability in diabetic and non-diabetic renal transplant patients, Acta
Anaesthesiol. Scand. 40 (7) (1996) 804–808.
[32] D.T. Larose, Discovering Knowledge in Data: An Introduction to Data Mining,
KNN, Wiley Interscience, New Jersey, USA, 2004, pp. 90–106 (Chapter 5).
[33] D.T. Larose, Decision trees, Chapter 6 in Discovering Knowledge in Data: An
Introduction to Data Mining, Wiley Interscience, Hoboken, NJ, 2004, pp. 108–126.
[34] A. Malliani, F. Lombardi, M. Pagani, Power spectrum analysis of heart rate
variability: a tool to explore neural regulatory mechanisms, Brit. Heart J. 71
(1994) 1–2.
[35] A. Metin, Nonlinear Biomedical Signal Processing. Fuzzy Logic, Neural
Networks, and New Algorithms, vol. 1, IEEE Press, Fuzzy logic, 2000.
[36] F. Molinari, U.R. Acharya, R.J. Martis, R. De Luca, G. Petraroli, W. Liboni, Entropy
analysis of muscular near-infrared spectroscopy (NIRS) signals during exercise
programme of type 2 diabetic patients: quantitative assessment of muscle
metabolic pattern, Comp. Method Prog. Biomed. 112 (2013) 518–528.
[37] Karim Nasim, Hasan Jahan Ara, Ali Syed Sanowar, Heart rate variability – a
review, J. Basic Appl. Sci. 7 (2011) 71–77.
[38] R.P. Nolan, S.M. Barry-Bianchi, A.E. Mechetiuc, M.H. Chen, Sex-based
differences in the association between duration of type 2 diabetes, Diab.
Vasc. Dis. Res. 6 (2009) 276–282.

[39] N.A. Obuchowski, Receiver operating characteristic curves and their use in
radiology, Radiology 229 (2003) 3–8.
[40] J. Pan, W.J. Tompkins, A real time QRS detection algorithm, IEEE Trans. Biomed.
Eng. 32 (3) (1985) 230–236.
[41] M.A. Pfeifer, D. Cook, J. Brodsky, D. Tice, A. Reenan, S. Swedine, J.B. Halter, D.
Porte, Quantitative evaluation of cardiac parasympathetic activity in normal
and diabetic man, Diabetes 31 (4) (1982) 339–345.
[42] R.B. Pachori, P. Avinash, K. Shashank, R. Sharma, U.R. Acharya, Application of
empirical mode decomposition for analysis of normal and diabetic RR-interval
signals. Expert Systems with Applications, 2015 (in press).
[43] S.M. Pincus, Approximate entropy as a measure of system complexity, Proc.
Nat. Acad. Sci. 88 (1991) 2297–2301.
[44] S.M. Pincus, I.M. Gladstone, A.E. Richard, A regularity statistic for medical data
analysis, J. Clin. Monit. (1991) 7.
[45] S.M. Pincus, D.L. Keefe, Quantification of hormone pulsatility via an
approximate entropy algorithm, Am. J. Physiol. 262 (1992) E741–E754.
[46] Shi Ping, Hu Sijung, Z. Yisheng, A preliminary attempt to understand
compatibility of photoplethysmographic pulse rate variability with
electrocardiogramic heart rate variability, J. Med. Biol. Eng. 28 (2008) 173–
180.
[47] W. Sarah, S. Richard, R. Gojka, G. Anders, K. Hilary, Global prevalence of
diabetes estimates for the year 2000 and projections for 2030, Diabetes Care
27 (2004) 1047–1053.
[48] Sarika Tale, T.R. Sontakke, Time-frequency analysis of heart rate variability
signal in prognosis of type 2 diabetic autonomic neuropathy, 2011
International Conference on Biomedical Engineering and Technology, vol. 11,
2011.
[49] E.B. Schroeder, L.E. Chambless, D. Liao, R.J. Prineas, J.W. Evans, W.D. Rosamond,
G. Heiss, Diabetes, glucose, insulin, and heart rate variability: the
Atherosclerosis Risk in Communities (ARIC) study, Diabetes Care 28 (3) (2005)
668–674.
[50] A. Schumacher, Linear and nonlinear approaches to the analysis of RR interval
variability, Biol. Res. Nursing 5 (2004) 211–221.
[51] B. Pomeranz, R.J.B. Macaulay, M.A. Caudill, I. Kutz, D. Adam, K.M. Kilborn, A.C.
Barger, D.C. Shannon, R.J. Cohen, H. Benson, Assessment of autonomic
function in humans by heart rate spectral analysis, Am. J. Physiol. 248
(1985) 151–153.
[52] J.P. Singh, M.G. Larson, C.J. O’Donnell, P.F. Wilson, H. Tsuji, D.M. Lloyd-Jones, D.
Levy, Association of hyperglycemia with reduced heart rate variability (the
Framingham heart study), Am. J. Cardiol. 86 (3) (2000) 309–312.
[53] J.A.K. Suykens, J. Vandewalle, Least square support vector machine classifiers,
Neural Process. Lett. 9 (1999) 293–300.
[54] G. Swapna, U.R. Acharya, V.S. Sree, J.S. Suri, Automated diagnosis of diabetes
using higher order spectra features extracted from heart rate signals, Intell.
Data Anal. 17 (2) (2013) 309–326.
[55] M. Ratnakar, K.S. Sunil, J. Nitisha, Signal filtering using discrete wavelet
transform, Int. J. Recent Trends Eng. (2009) 2.
[56] J.M. Roshan, U.R. Acharya, C.M. Lim, ECG beat classification using PCA, LDA,
ICA, and discrete wavelet transform, Biomed. Signal Process. Control 8 (2013)
437–448.
[57] J.S. Richman, J.R. Moorman, Physiological time-series analysis using approximate
entropy and sample entropy, Am. J. Physiol. Heart Circ. Physiol. 278 (2000)
2039–2049.
[58] Z. Trunkvalterova, M. Javorka, I. Tonhajzerova, J. Javorkova, Z. Lazarova, K.
Javorka, M. Baumert, Reduced short-term complexity of heart rate and blood
pressure dynamics in patients with diabetes mellitus type 1: multiscale
entropy analysis, Physiol. Meas. 29 (7) (2008) 817–828.
[59] K. Watanabe, T. Miyamoto, Y. Tanaka, K. Fukuda, T. Moritani, Type 2 diabetes
mellitus patients manifest characteristic spatial EMG potential distribution
pattern during sustained isometric contraction, Diab. Res. Clin. Pract. 97 (3)
(2012) 468–473.
[60] WHO Consultation: definition and diagnosis of diabetes mellitus and
intermediate hyperglycemia, 2006.
[61] F. Wilcoxon, Individual comparisons by ranking methods, Biometric Bull. 1
(1945) 80–83.
