Development of Computer Aided Prediction Technology for Paroxysmal Atrial Fibrillation in Mobile Healthcare

Introduction
Atrial fibrillation (AF) is a major risk factor for stroke and has been found to increase the risk of ischemic stroke fivefold (Wolf et al., 1987). Strokes with AF are more severe, cause greater disability, and have worse outcomes than those without (Jørgensen et al., 1996; Lin et al., 1996). Approximately 15-20% of all strokes are due to AF, but the percentage increases up to 40% for patients over the age of 90 (Marini et al., 2005). As the risk of initial strokes in these patients can be reduced by up to 64% by long-term treatment with anticoagulants (Hart et al., 2007), accurate and timely detection of AF is essential in high-risk elderly populations.
Paroxysmal AF (PAF) is defined as an AF episode that lasts longer than 2 min but less than 7 days. Most PAF episodes terminate spontaneously within 24 h. Although PAF is short-lived, it shares similar stroke risks with long-lasting persistent AF (Friberg et al., 2010). Furthermore, the rare and asymptomatic manifestation of AF episodes makes a definitive diagnosis of PAF even harder to provide. Holter or event recorder devices are used conventionally to capture actual episodes of AF. Using 24-h Holter monitoring, only 23% of cases were detected among PAF patients whose diagnosis was confirmed by continuous bedside electrocardiograph (ECG) monitoring (Rizos et al., 2010), which indicates that PAF is underdetected, leading to inadequate treatment. Using 7-day event-loop recording, 5.7% of cases that were missed by 24-h Holter monitoring were revealed to be PAF (Jabaudon et al., 2004). Thus, prolonged ECG monitoring by event recorders improves the detection rates of PAF cases. However, event recorders may not be appropriate for all patients, and they depend on patient compliance. The extended use of Holter monitors is not commonly practiced in clinics, and few technical guidelines are currently available for improving PAF detection through short-term use of Holter monitors.
Prediction methods have been studied in order to suppress PAF episodes through atrial pacing techniques and to provide the clinical benefit of maintaining normal sinus rhythm (NSR) in, for example, drug-refractory AF patients (Delfaut et al., 1998). Most of the methods developed were inspired by earlier physiological findings that were associated with the triggering mechanisms of PAF, namely, activated ectopic nodes in abnormal atrial tissue and the imbalance of the autonomic nervous system (ANS) via increased sympathetic or parasympathetic tone. A hypothetical transition period has been proposed during which the triggering mechanisms are activated, forcing NSR to change to PAF. During the transition, the number of premature atrial complexes (PAC), runs of atrial bigeminy and trigeminy, and length of paroxysmal atrial tachycardia were increased before PAF onset (Thong et al., 2004). Spectral analysis revealed a significant increase in ANS activity in both the sympathetic and parasympathetic branches, which were represented by increased low-frequency (LF) and high-frequency (HF) components in the power spectrum density of the heartbeat intervals (RR intervals) (Chesnokov et al., 2008; Kim et al., 2008a). Heart rate variability (HRV) features in time domain analysis were also significantly changed during the transition (Kim et al., 2008a), again indicating a change in ANS activity. The complexity of RR interval dynamics decreased before the onset of PAF episodes (Chesnokov, 2008; Vikman et al., 1999). Poincaré plots depicting the correlation between two successive RR intervals in a two-dimensional diagram were also found to be closely related to ANS activity and to exhibit highly heterogeneous patterns during the transition (Duong et al., 2009). Recently, recurrence plots representing the frequency of two RR intervals in close proximity were investigated to discover dynamic behaviors during the transition that were not uncovered by earlier methods (Mohebbi & Ghassemian, 2011). In these studies, prediction models were derived by running pattern classification algorithms on learning sets that consisted of two types of ECG data: one distant from and the other immediately before the onsets, representing NSR and the transition period, respectively (available from the PhysioBank's MIT-BIH AF Prediction Database as 30-min ECG data sets). These prediction models were able to detect the transition to PAF events at rates of 70-97%. Some PAF prediction reports are summarized in Table 1. Recent reviews of these techniques and their performance are also available (Sahoo et al., 2011; Mohebbi & Ghassemian, 2011).
To apply these methods to atrial pacing techniques, ECG data need to be sampled continuously. Due to the low detection rates of PAF events provided by Holter devices, alternative approaches have been motivated by the physiological findings of different dynamics of heart rate controls in PAF subjects, even when AF is not actually occurring. Thus, instead of relying on the capture of PAF episodes, prediction methods have been developed to provide a diagnostic assessment of PAF cases that display either no episodes of AF or a small number of episodes (Hickey et al., 2004; Kikillus et al., 2008; Kim et al., 2008a). Although these prediction models were motivated by findings and methods similar to those explained above, they were evaluated against public databases that provide long-term ECG data from PAF and NSR subjects (available from the PhysioBank's MIT-BIH AF and NSR Databases). The AF Database provides 25 ECG data sets of 10-h recordings obtained from patients with PAF. The NSR Database provides 18 ECG data sets of 24-h recordings obtained from healthy persons with no arrhythmias. Numerous approaches have been reported because multiple representations of heart rate dynamics were feasible at different time-points of the day or night. Each short segment was classified by a first-stage classifier that relied on the detection of abnormal HRV features and PAC. To incorporate prediction results made at different time-points into one assessment, a second-stage classifier checked whether an empirical rule was satisfied. For example, a subject was classified as PAF if at least 10% of the segments were classified as PAF segments and the average probability of PAF was at least 0.35 (Hickey et al., 2004). For this approach, finding the empirical rule was critical to evaluate the performance of the classifier. The performance of this prediction model may be directly related to not only how often a PAF patient experiences a physiological condition resembling the transition period, but also how often actual PAF episodes occur immediately after the presumed transition period. However, no information is available on the frequency of a transition period being followed by an actual PAF event. Furthermore, the performance of these prediction models over a long-term monitoring period has not yet been investigated.
In contrast, if the sampling of ECG data as learning sets is not confined to the transition period, then a prediction model may be less sensitive to whether a subject experiences the transition period or not and still perform as anticipated. Instead of confining learning set samples to the transition period, 1-h ECG segments were sampled from all available time-points and evaluated by a classifier based on a risk assessment of HRV analysis and Poincaré plot analysis performed on all samples (Kikillus et al., 2008). Because the test sets included the AF episodes, the classification performance was sensitive to the length of AF episodes included in the data. Alternatively, multiple ultra-short 3-min segments were sampled from different time periods of the day and classified using two different formulas designed for day or evening time (Kim et al., 2008a, 2008b). This method did not contain AF episodes in the test data, and thus the prediction accuracy relates exclusively to detecting the transition period. In addition, transition periods were treated differently depending on the time of their occurrence (Kim et al., 2008b), which resulted in two time-dependent classifiers performing better than one classifier disregarding the occurrence time of transitions.
These methods require ECG data to be continuously analyzed, which may demand some extraordinary functional capabilities from wireless devices and data networks in mobile healthcare systems. For instance, a limited battery time is common in portable wireless gateway devices such as personal digital assistants (PDAs) or cellular phones. Network congestion and loss should be avoided or reliably handled in a medical sensor network (MSN) (Hu & Xiao, 2009) that may monitor a large number of mobile subjects. As a strategy that alleviates these requirements of future mobile healthcare devices and networks, intermittent data sampling and its relevance to clinical decision making could be considered. Thus, we propose that intermittently sampled ECG data (but devoid of any arrhythmic episodes) may be used as learning sets for calculating a PAF prediction model. Our driving hypothesis remains the same as that stated in previous reports: the non-episodic state of PAF subjects is different from that of NSR subjects (Kikillus et al., 2008; Kim et al., 2008a).
The recurrence rate of silent AF is high even when the patient has undergone apparently successful ablation or drug therapy. Patients who had been treated with atrial ablation and circumferential pulmonary vein ablation were reported to have a 26.7% recurrence rate after 13 months and a rate of 31% after 19 months, respectively (Berruezo et al., 2007; Grubitzsch et al., 2008). Patients who underwent chemical or electrical cardioversion also showed high recurrence rates of 30-43% after 0.5-42 months (Aytemir et al., 1999; Lombardi et al., 2001). The recurrence of silent AF in stroke victims can also be as high as the percentage reported in these earlier studies. If clinicians and patients can be alerted by the early diagnosis of AF to take appropriate measures and avoid the recurrence of ischemic strokes, tens of thousands of recurrent stroke victims could be saved each year.
In this study, we first sought to develop a detection method for PAF subjects using HRV patterns obtained from intermittently sampled ECG. Based on this model, we then sought to predict recurrent PAF subjects among patients who had been successfully treated for AF previously and were currently under anti-arrhythmic medication. In addition, we have implemented our algorithms in a remote real-time heartbeat analysis system that consists of a portable ECG sensor with a three-axis accelerometer, a smart phone, and a data analysis server. This internet-based system should provide a developmental platform to investigate the detection of PAF further through the use of long-term monitoring.

Subjects and data acquisition
The 24-h ECG data of 50 cases were obtained from the archive storage in Chungnam National University Hospital, Department of Cardiology. Subjects visited the clinic for a variety of complaints with symptoms that might have been related to underlying cardiac diseases. Data were included if the subject was older than 35 years, had visited the clinic during the past 2 years, and was free of any cardiovascular disease. All ECG data were obtained by using the same type of Holter monitors for 24 h. The use of patient ECG data was approved by the internal review board of the hospital.
Thirty-nine patients with previously diagnosed AF were recruited for 48-h Holter monitoring in the same department. Patients completed the consent form for the use of experimental data and a stress questionnaire (Koh et al., 2001). Six patients were excluded due to the failure of 48-h Holter monitoring (n = 3) and the loss of ECG data during data handling (n = 3). A total of 33 patients had been under AF management for the past 1.5 years on average (15 women: age range, 39-82 years, median age, 66 years; and 18 men: age range, 43-78 years, median age, 66 years). This experiment was approved by the internal review board of the hospital.

Data processing

Preparation of ECG data
ECG signals were sampled at 125 samples per second by Holter recorders (Marquette MARS PC Holter monitor, GE Healthcare) and screened for arrhythmic events first by the computer software and then by two cardiologists. A 24-h binary record (in .bin format) was transformed into four 6-h text-based records using a software tool called "rdsamp.exe" available at PhysioNet. Each record represented one of four time periods of the day (namely morning from 6 am to 12 noon, afternoon from noon to 6 pm, evening from 6 pm to 12 midnight, and night from midnight to 6 am). ECG segments that contained arrhythmic episodes were also removed based on medical records. Noisy parts of ECG data were automatically removed by a cut-off value that represented the maximum value of local heterogeneity of ECG signals (data not shown). The percentage of ECG signals detected as noise was also recorded.
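The four time periods described above can be expressed as a simple lookup on the hour of the day. The following is a minimal sketch; the function name `time_period` and the half-open boundary convention (each period includes its start hour and excludes its end hour) are assumptions made for illustration and are not taken from the original software.

```python
def time_period(hour):
    """Map an hour of the day (0 <= hour < 24) to one of the four 6-h periods.

    Assumed convention: each period includes its start and excludes its end.
    """
    if not 0 <= hour < 24:
        raise ValueError("hour must be in [0, 24)")
    if 6 <= hour < 12:
        return "morning"
    if 12 <= hour < 18:
        return "afternoon"
    if 18 <= hour < 24:
        return "evening"
    return "night"  # midnight to 6 am
```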

RR interval detection
Time intervals between two successive QRS complexes (RR intervals) were obtained using ECG analysis software previously developed for a mobile application (Salahuddin & Kim, 2006). Java-based analysis software was developed to detect RR intervals and calculate the HRV features. The analysis algorithm was mainly adapted from a derivative method (Pan & Tompkins, 1985), but several modifications had to be added to ensure the detection of true RR intervals in noisy ECG signals. Briefly, a tophat operation was performed on the band-pass waveforms to suppress noisy peaks and remove the baseline shift. A set of rules was applied to avoid detecting R peaks from unusually high or low amplitude parts and noisy parts of ECG signals. After detecting R peaks, another set of rules was applied to avoid extracting RR intervals that were too short or too long compared with average RR intervals calculated from accumulated RR intervals.
The analysis algorithm was divided into three phases: pre-processing, R peak decision, and post-processing. During the pre-processing phase, the ECG signal (Figure 1a) was passed through a band-pass filter (a 60th order finite impulse response digital filter using a Hamming window) with cut-off frequencies of 8 and 12 Hz (Figure 1b). The band-passed signal was then filtered through a tophat operation that consisted of a series of minimum and maximum operations and a subtraction operation (Figure 1c). Since the size of the tophat filter was set to equal the average width at half maximum of the R peaks (seven time-points), tophat filtering preferentially enhanced RR intervals and suppressed noisy small peaks and large T peaks. The subtraction of the minimum-maximum filtered waveforms from the original waveforms eliminated baseline drifts completely. The tophat-processed ECG signal was differentiated so that RR interval signals resulted in two wave peaks with relatively larger amplitudes (Figure 1d). The resultant signal was squared to amplify the part of the signal with larger amplitude to a greater extent (Figure 1e). An integral waveform was generated by a moving window of 30 time-points in width (Figure 1f). The integral waveform was compared with the adaptive thresholds that were continuously updated estimates of the peak signal level (Figure 1g) and the peak noise level (data not shown) (Pan & Tompkins, 1985). In our algorithm, the search back (or dual thresholds) technique (Pan & Tompkins, 1985) was not used since it did not seem to contribute significantly to the detection of the RR intervals of our ECG signals. The adaptive threshold produced a series of time-point ranges (rectangular wave tops), representing possible locations of R peak candidates (Figure 1h). The local maximum of the band-passed waveform within each range was taken as an R peak candidate (shown as arrowheads in Figure 1i).
During the R peak decision phase, a true R peak was found if the following two conditions were satisfied at the time-point of the R peak candidate: 1) the amplitude of the raw signal was less than 1000; and 2) the amplitude of the tophat-filtered signal was greater than 30. During the post-processing phase, some RR intervals were often found to be unreasonably short or long due to errors such as detecting spurious peaks or missing true peaks. An RR interval was considered valid if it satisfied the following two conditions: 1) the RR interval was longer than 0.75 times the accumulated RR interval average; and 2) the RR interval was shorter than 1.5 times the accumulated RR interval average. These conditions excluded unusually short or long RR intervals caused by noisy signals or missing R peaks, respectively.
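Two of the steps above, the tophat operation and the post-processing rule for RR intervals, can be sketched in a few lines. This is an illustrative sketch and not the authors' Java implementation. It assumes that the tophat operation is the standard white top-hat transform (the signal minus its morphological opening, i.e. a sliding minimum followed by a sliding maximum), that the "accumulated RR interval average" is the running mean of the intervals accepted so far, and that the first interval is accepted unconditionally. The function names `tophat` and `valid_rr` are hypothetical.

```python
def tophat(signal, width=7):
    """White top-hat: signal minus its opening (sliding min, then sliding max).

    width=7 follows the seven time-point filter size given in the text;
    windows are truncated at the signal borders (an assumption).
    """
    half = width // 2
    n = len(signal)

    def sliding(values, op):
        return [op(values[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]

    opened = sliding(sliding(signal, min), max)
    return [s - o for s, o in zip(signal, opened)]


def valid_rr(rr_intervals, low=0.75, high=1.5):
    """Keep RR intervals lying between low and high times the running mean
    of the intervals accepted so far (assumed meaning of the accumulated average)."""
    accepted = []
    for rr in rr_intervals:
        if accepted:
            mean = sum(accepted) / len(accepted)
            if not (low * mean < rr < high * mean):
                continue  # too short (spurious peak) or too long (missed peak)
        accepted.append(rr)
    return accepted
```

With a flat baseline and a single narrow spike, `tophat` returns zero everywhere except at the spike, which illustrates how peaks narrower than the filter survive while the baseline is removed.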

HRV calculation
From each 6-h RR interval record, four 30-min RR records were randomly selected. The baseline trend in heart rates introduced by postural change or movement was removed using a linear curve-fitting method. Detrended time series were cubically interpolated and re-sampled at 4 Hz, and the power spectrum was estimated by the fast Fourier transform using 256-sample Hamming windows with 50% overlap. All HRV features were calculated from the detrended RR interval series.
In each RR interval set, the following HRV features were calculated using the ECG analysis software (also available at http://mhealth.kaist.ac.kr/afdectection): mean heart rate (mean HR), mean heartbeat intervals (mean RR), standard deviation of NN interval (SDNN), coefficient of variation (CV), root mean square of successive differences (RMSSD), and percentage of successive heartbeat intervals differing by more than 50 ms (PNN50) as time domain features; HRV index (bin width of 8.0 ms), triangular interpolation of heartbeat interval histogram (TINN), and stress index (SI) (Lednev et al., 2008) as geometrical analysis features; and LF (0.04-0.09 Hz), HF (>0.1 Hz), the ratio of LF to HF (LF/HF), normalized LF (LFnu) and normalized HF (HFnu) components as frequency domain features (Task Force, 1996). Among the HRV features, the RMSSD is worthy of further mention because it has been used to indicate the levels of mental or physical stress of a subject (Task Force, 1996). In preparing the learning sets, RMSSD was used as the basis to estimate the activity level of a subject during daily activity.
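The time domain features listed above follow standard definitions and can be sketched directly from a list of RR intervals in seconds. This is a minimal sketch rather than the authors' software. It assumes that mean HR is approximated as 60 divided by mean RR, that CV is SDNN divided by mean RR (as a fraction, not a percentage), and that SDNN uses the sample (n - 1) standard deviation. The function name `time_domain_hrv` is hypothetical.

```python
import math


def time_domain_hrv(rr_s):
    """Time domain HRV features from RR intervals given in seconds."""
    n = len(rr_s)
    mean_rr = sum(rr_s) / n
    # Sample standard deviation of the intervals (assumed n - 1 denominator).
    sdnn = math.sqrt(sum((x - mean_rr) ** 2 for x in rr_s) / (n - 1))
    diffs = [b - a for a, b in zip(rr_s, rr_s[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # Percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(abs(d) > 0.050 for d in diffs) / len(diffs)
    return {
        "mean_rr": mean_rr,
        "mean_hr": 60.0 / mean_rr,  # beats per minute (approximation)
        "sdnn": sdnn,
        "cv": sdnn / mean_rr,       # assumed definition of CV
        "rmssd": rmssd,
        "pnn50": pnn50,
    }
```

The `rmssd` value returned here is in seconds, so it can be compared directly with a cut-off expressed in seconds.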

Poincaré plot pattern analysis
The Poincaré plot is a two-dimensional scatter plot of each heartbeat interval plotted against the subsequent interval, and thus depicts the correlation between successive heartbeat intervals (Woo et al., 1992). Recently, Poincaré plots of arrhythmic ECG data were systematically investigated to discover 10 distinctive prototypical patterns that represent different kinds of arrhythmias from 24-h Holter ECG data (Esperer et al., 2008). For example, fan-shaped Poincaré plots were typical in subjects with AF; multiple side lobe patterns specified the presence of atrial premature beats or ventricular premature beats; while an island pattern was highly correlated with atrial flutter or atrial tachycardia (Esperer et al., 2008). Poincaré plots were generated to calculate Poincaré plot features and classify them into different patterns such as torpedo, island, multiple side lobes, and fan pattern (Duong et al., 2009). Finally, standard deviations of the minor axis (SD1) and major axis (SD2), the ratio of SD1 to SD2 (SD1/SD2), standard deviation of the RR intervals (SDRR), standard deviation of the successive differences of the RR intervals (SDSD), and autocorrelation function of the RR intervals (rRR) were calculated from the Poincaré plot (Brennan et al., 2001).
In addition to the previously reported conventional descriptors, new cluster descriptors were calculated by analyzing the Poincaré plots as an image (Duong et al., 2009). In brief, Poincaré plot images (Figure 2A) were first thresholded at the gray level of 0 (Figure 2B). Second, binary plot images were eroded once with the 3×3 structuring element, reconstructed with respect to the binary plot, and seed-filled (Figure 2C) (Serra, 1984). A cluster was defined as an isolated group of connected points containing more than 12 points that effectively represented highly correlated heartbeat interval events. Then, the cluster was described by using shape attributes, such as form factor and minor to major axis ratio; texture attributes, such as entropy and contrast; and location attributes, such as the number of clusters on the diagonal line. Torpedo, island, and multi-sided lobe patterns were classified at the accuracy of 99% using the combined set of conventional and new Poincaré plot features.
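The conventional descriptors SD1 and SD2 have a direct geometric reading: they are the spreads of the plotted points across and along the line of identity. The sketch below computes them by rotating each point (RR_n, RR_n+1) by 45 degrees. It is an illustration under stated assumptions (sample standard deviation, a hypothetical function name `poincare_descriptors`) and does not cover the image-based cluster descriptors.

```python
import math


def poincare_descriptors(rr):
    """Return (SD1, SD2, SD1/SD2) for a sequence of RR intervals."""
    x, y = rr[:-1], rr[1:]

    def sd(values):
        m = sum(values) / len(values)
        return math.sqrt(sum((v - m) ** 2 for v in values) / (len(values) - 1))

    # Coordinates across (minor axis) and along (major axis) the line of identity.
    minor = [(b - a) / math.sqrt(2) for a, b in zip(x, y)]
    major = [(b + a) / math.sqrt(2) for a, b in zip(x, y)]
    sd1, sd2 = sd(minor), sd(major)
    ratio = sd1 / sd2 if sd2 > 0 else float("inf")
    return sd1, sd2, ratio
```

By construction, SD1 computed this way equals the standard deviation of the successive differences divided by the square root of 2, which is why SD1 tracks short-term variability.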

Circadian rhythm analysis
Circadian rhythm (CR) represents physiological phenomena repeatedly occurring during a time period of approximately 24 h. In humans, almost every physiological function displays CR to some degree, and the mechanism can be endogenous, exogenous, or a combination of both. The cardiovascular system also exhibits a pronounced CR, which is influenced by both external stimuli and endogenous homoeostatic control mechanisms, with the latter playing a more important role than the former (Guo & Stein, 2002). Circadian variations are found in a number of electrophysiological parameters such as heart rate, QT interval, sinus node recovery time, and atrial refractory periods (Guo & Stein, 2002). Although detailed genetic or epigenetic mechanisms are not fully understood, numerous cardiovascular and cerebrovascular diseases show circadian variations. For example, paroxysmal and persistent atrial arrhythmia occurs more frequently during the evening time (Mitchell et al., 2002), whereas sudden cardiac death (Savopoulos et al., 2006) and myocardial ischemia (Li, 2003) occur predominantly in the morning. In previous studies, we reported circadian variations of HRV features in NSR and PAF subjects (Kim et al., 2008a) and CR parameters such as amplitudes, phase, and shift obtained from a least square fitting of sinusoidal functions to various HRV features. On the day of onset of PAF, the CR of PAF subjects was affected and significantly different from that of NSR subjects. The CR parameters obtained from the non-episodic data for that day were used to detect PAF patients with an accuracy of 84.6% (Olemann & Kim, 2011).
To obtain CR parameter patterns of HRV features, a non-linear curve-fitting method known as the Levenberg-Marquardt algorithm (Levenberg, 1944) was applied (Statgraphics Plus V4.1 Professional System®, Manugistics, Inc., Rockville, MD, USA) to each HRV feature set that consisted of at least six time-points, representing a time span of at least 6 h. The initial values for the regression were set by trial and error and were kept constant for both the NSR and PAF groups. The CR parameters obtained were the amplitude, the shift, and the phase of the sinusoidal equation used for the curve fitting, in which H(t) represents the HRV feature, b represents amplitude, p represents phase (in degrees), a represents shift, t represents time of the day (in hours), and w represents 15°/h. The final amplitude was obtained by taking the absolute value. The final phase was modified to restrict its value to between 0 and 180. While performing the sinusoidal curve fitting, time-points that showed unusual residuals (|Studentized residual| > 3.0) were removed from the analysis.
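To illustrate how the shift, amplitude, and phase can be estimated, the sketch below assumes that the fitted curve has the form H(t) = a + b·sin(w·t + p). This form is an assumption that is consistent with the symbol definitions given here; it is not a formula confirmed by the text. Because w is fixed at 15 degrees per hour, the fit reduces to ordinary linear least squares on sin(w·t) and cos(w·t), which is used here in place of the Levenberg-Marquardt procedure. The function names `solve3` and `fit_circadian` are hypothetical, and the restriction of the phase to the 0-180 range is not reproduced.

```python
import math

W_DEG_PER_HOUR = 15.0  # 360 degrees per 24 h


def solve3(m, v):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, 3):
            f = a[r][col] / a[col][col]
            for c in range(col, 4):
                a[r][c] -= f * a[col][c]
    x = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        s = a[r][3] - sum(a[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / a[r][r]
    return x


def fit_circadian(times_h, values):
    """Fit the assumed model H(t) = a + b*sin(w*t + p).

    Returns (shift a, amplitude b >= 0, phase p in degrees within [0, 360)).
    """
    rows = []
    for t in times_h:
        ang = math.radians(W_DEG_PER_HOUR * t)
        rows.append((1.0, math.sin(ang), math.cos(ang)))
    # Normal equations of the linear model a + s*sin(w*t) + c*cos(w*t).
    m = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    v = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(3)]
    a, s, c = solve3(m, v)
    b = math.hypot(s, c)                          # s = b*cos(p), c = b*sin(p)
    p = math.degrees(math.atan2(c, s)) % 360.0
    return a, b, p
```

The amplitude returned by this linear formulation is non-negative by construction, so the absolute-value step applied after a non-linear fit is not needed in the sketch.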

Data analysis

Preparation of learning sets
Before preparing the learning sets, feature sets showing noisy Poincaré plots, in which most events fell outside the clusters, were removed using the Poincaré plot pattern recognition algorithm, since such plots were caused by poor electrode contact (Figure 3). Feature sets showing island, multi-sided lobe, and fan patterns were removed from the learning sets of the NSR group since they represent arrhythmic events (Esperer et al., 2006). However, only records showing fan shapes were removed from the learning sets of the PAF group. To divide the learning sets according to the level of physical or mental activity, an arbitrary cut-off value of RMSSD (in sec) was selected to assign each 30-min RR interval record to one of two types of ANS conditions: one representing the activated state of the vagal nervous system (RMSSD > 0.040 sec) and the other representing the suppressed state of the vagal nervous system due to mental or physical activity (RMSSD <0.040 sec) experienced by the subject during daily activity. HRV feature sets from different subject groups were compared by the Mann-Whitney test (Statgraphics Plus V 4.1). Test results were considered significant if the p-value was less than 0.05 (95% confidence level).
From the prepared learning sets, the CR curve fittings were performed according to the procedure described in the section above.

Derivation of prediction models
Significant HRV and Poincaré plot features were selected from the learning sets by evaluating the worth of a feature based on the value of the chi-squared statistic with respect to the class (Weka 6.1, The University of Waikato, Hamilton, New Zealand). Naïve Bayesian, logistic regression, and support vector machine (SVM) analyses were performed to derive classification models that detect 30-min RR intervals of PAF subjects (Weka 6.1). Because this first classification was based on 30-min segments of RR interval data, it was called the segment-based classification. Once each 30-min segment was annotated, a second phase of classification was derived from a heuristic rule. Because annotation results for each subject were used, it was called the subject-based classification. A similar approach of two-phase classification has been reported previously (Hickey et al., 2006). According to the clinical definition of PAF, a subject would be diagnosed with PAF if at least one 30-min segment contained AF episodes. However, because our learning sets did not contain any segments with AF episodes (fan-shaped plots were removed), the empirical rule in this study was modified to classify subjects as PAF if any of four time periods had two or more segments annotated as PAF within the same time period.
Significant CR features were selected from the learning sets by the chi-squared test (Weka 6.1). Naïve Bayesian, logistic regression, and SVM analyses were performed to derive classification models that detect PAF subjects using CR features (Weka 6.1).
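The subject-based rule stated above (a subject is classified as PAF if any of the four time periods has two or more segments annotated as PAF) can be sketched as follows. The input format, a list of (period, label) pairs produced by the segment-based classifier, and the function name `classify_subject` are assumptions made for illustration.

```python
from collections import Counter

PERIODS = ("morning", "afternoon", "evening", "night")


def classify_subject(segment_labels, min_paf_segments=2):
    """Apply the subject-based rule to segment annotations.

    segment_labels: iterable of (period, label) pairs, label being "PAF" or "NSR".
    Returns "PAF" if any single period holds at least min_paf_segments PAF
    segments, otherwise "NSR".
    """
    counts = Counter(period for period, label in segment_labels if label == "PAF")
    if any(counts[p] >= min_paf_segments for p in PERIODS):
        return "PAF"
    return "NSR"
```

Note that two PAF-annotated segments falling in different time periods do not trigger the rule; both must occur within the same period.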

Preparation of test sets
Before preparing the test sets, the records showing noisy Poincaré plots were removed from further analysis. Testing sets were divided according to the RMSSD cut-off. The segment-based classification was performed using the classification algorithm that produced the best accuracy. These outputs were further tested using the subject-based classification rule with different segment sampling methods: two, three, or four 30-min samples per time period to determine the dependence of classification accuracy on the number of sample segments. Cases classified as NSR were tested using the CR classification (Figure 3).

Establishment of PAF prediction models using data from the 24-h study

Segment-based model and subject-based model
A total of 299 and 319 items of RR interval data were obtained from 20 NSR and 24 PAF patients, respectively, and were processed for HRV calculation. HRV features that differed significantly between the two types of data were initially screened by the Mann-Whitney test at a significance level of p = 0.05. Most HRV features were higher in the PAF groups except for SD2, SI, and HFnu, which were lower (data not shown, p < 0.01). Similar outcomes have also been reported in previous studies (Kim et al., 2008a, 2008b) using NSR and AF databases available at Physionet. These HRV features were further ranked based on their contribution to a prediction model (chi-squared feature selection method, Weka 6.1). The top nine HRV features (RMSSD, SDSD, SD1, SDRatio, rRR, %Cluster, SDNN, CV, and PNN50) were then used to generate a prediction model that represented the segment-based classification model. The accuracy of each classification method is summarized in Table 2. The logistic regression analysis produced the highest accuracy among the three classification algorithms (71.4%) and was selected for the testing.
Based on this observation, a simple heuristic rule was applied to generate the subject-based classification. Two of 24 PAF cases were misclassified as NSR (false negative) and four of 20 NSR cases were misclassified as PAF (false positive) (Table 3). False-negative cases showed only one record classified as PAF. All false-positive cases showed fan-shaped Poincaré plots.
Table 3. Performance of the subject-based classification using the results from the segment-based classification by logistic regression analysis

The CR model
The phases of the HRV features of the two groups did not show significant differences (data not shown; p > 0.05), thereby suggesting that there was no significant difference in the time-point of HRV peak occurrence. The CR amplitudes and shifts of rRR, HF, LF, HR and RMSSD showed significant differences between the two groups (data not shown; Mann-Whitney test, p < 0.05). These CR features were further ranked based on their contribution to a prediction model (chi-squared feature selection method, Weka 6.1). The top three CR features (rRR shift, LF amplitude, and HR amplitude) were then used to generate a prediction model that represented the CR-based classification model. These three HRV features were curve-fitted and plotted to determine the circadian change in individual subjects (Figure 4). Logistic regression analysis was performed with the three CR features and produced an accuracy of 86% in predicting PAF cases (sensitivity of 79%, classifying 19/24 PAF cases, and specificity of 95%, classifying 19/20 NSR cases). The false-positive subject (n = 1) seemed to show greater fluctuations, whereas false-negative subjects (n = 5) showed fewer fluctuations in CR amplitudes.
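The reported percentages follow directly from the case counts. As a worked check, the sketch below computes sensitivity, specificity, and accuracy from the counts given above (19 of 24 PAF cases and 19 of 20 NSR cases correctly classified); the function name `performance` is hypothetical.

```python
def performance(tp, fn, tn, fp):
    """Return (sensitivity, specificity, accuracy) as percentages.

    tp/fn: PAF cases classified correctly/incorrectly;
    tn/fp: NSR cases classified correctly/incorrectly.
    """
    sensitivity = 100.0 * tp / (tp + fn)
    specificity = 100.0 * tn / (tn + fp)
    accuracy = 100.0 * (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# CR model: 19/24 PAF and 19/20 NSR cases correctly classified.
sens, spec, acc = performance(tp=19, fn=5, tn=19, fp=1)
```

Here the accuracy is 38 correct cases out of 44, which rounds to the 86% stated in the text.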

Prediction of AF recurrence using 48-h recordings
ECG data from 48-h Holter monitoring were first screened for AF episodes using the software in the Holter system, and the software findings for AF were confirmed by two cardiologists (JHP, JHK). Nineteen of 33 subjects were diagnosed as NSR, with heart rhythms successfully maintained for 48 h under drug treatment (normal group). Ten subjects were diagnosed as PAF and four as persistent AF (recurrent group; recurrence rate of 42%). The normal and recurrent groups did not differ significantly in terms of age (Mann-Whitney test, p > 0.05), gender, or other diseases such as diabetes, hypertension, or past strokes (chi-squared test, p > 0.05).
The noise detection algorithm identified four NSR cases in which more than 30% of the total ECG data were corrupted by noise, possibly due to poor electrode contact; these were removed from the test sets to avoid misinterpretation. When four ECG segments were sampled from each time period of ECG data, a total of 350 and 270 segments were obtained from 15 normal and 14 recurrent cases, respectively, and were processed for the calculation of HRV and Poincaré plot features. Using the segment-based classification and subject-based rule, 9/10 recurrent PAF patients were correctly identified, whereas 13/15 normal subjects were correctly identified (data not shown). In addition, all four persistent AF cases were correctly identified. An illustrative example of the subject-based rule is shown in Figure 5, in which a data point represents the probability density value of the segment-based model at a given time-point. Data points for an NSR case remain higher than the empirical cut-off at all time-points, whereas those for a PAF case often fall below the cut-off even during the days when no episodes were evident (Figure 5). Subjects classified as normal cases by the subject-based rule (n = 14) were further tested with the CR-based model. One false-negative case was correctly identified as recurrent, and all normal cases were correctly confirmed as normal cases. Therefore, the final classification resulted in a sensitivity of 100% (14/14 recurrent cases) and a specificity of 86% (13/15 normal cases) (Table 4). The same classification procedure was applied to the data sets obtained by sampling two or three segments per time period, and the classification results are summarized in Table 4. The number of sampled segments did not change the classification outcomes drastically.
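The two-stage procedure described above (the subject-based rule first, then the CR-based model only for subjects the rule calls normal) can be sketched as follows. The cut-off of 0.25 is taken from the Figure 5 caption. The requirement of at least two flagged segments is an assumption made for illustration, since the exact heuristic rule is not stated in this section. `subject_rule`, `classify_subject`, and the `cr_model_says_paf` callable are hypothetical names.

```python
def subject_rule(segment_probs, cutoff=0.25, min_flagged=2):
    """Flag a subject when enough segments fall below the cut-off.

    cutoff follows the Figure 5 caption; min_flagged is an assumed
    parameter, not a value confirmed by the study.
    """
    flagged = sum(p < cutoff for p in segment_probs)
    return flagged >= min_flagged


def classify_subject(segment_probs, cr_model_says_paf,
                     cutoff=0.25, min_flagged=2):
    """Two-stage cascade: subject-based rule, then CR-based re-test."""
    if subject_rule(segment_probs, cutoff, min_flagged):
        return "recurrent"
    # Only rule-negative subjects are re-tested, so the second stage
    # can add detections but cannot turn a flagged subject into normal.
    return "recurrent" if cr_model_says_paf() else "normal"


# Placeholder CR-model outcomes stand in for the logistic regression model.
print(classify_subject([0.10, 0.20, 0.60, 0.70], lambda: False))
print(classify_subject([0.50, 0.60, 0.10, 0.70], lambda: True))
print(classify_subject([0.50, 0.60, 0.70, 0.80], lambda: False))
```

Because the second stage is applied only to rule-negative subjects, the cascade can raise sensitivity without overturning any first-stage detection, which matches the design goal of not missing recurrent cases.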

Conclusion and discussion
In this study, we have developed a new method for predicting PAF subjects using intermittently sampled ECG data and applied it to the identification of recurrent AF cases.
The proposed method consists of an empirical rule and a CR-based classification. The empirical rule alone identified nearly 93% of recurrent cases (13/14 cases) and 86% of normal cases (13/15 cases). Because our aim was not to miss any recurrent cases, cases classified as normal by the empirical rule were re-tested using the CR-based prediction model. The false-negative case was then correctly identified as a recurrent case, achieving a sensitivity of 100%, and no additional false-positive cases were generated, maintaining the specificity of 86%.
Our results suggest that intermittently sampled ECG data could be used to detect the increased likelihood of PAF episodes. Because previous PAF prediction methods relied on the change during the transition from NSR to PAF, they require the ECG to be analyzed continuously so that transition periods are not missed. Furthermore, a somewhat higher degree of false-positive errors may be expected in an actual implementation of long-term ECG analysis, since the likelihood of not having subsequent PAF episodes following a transition period is not known.
Contrary to previous prediction methods, our classification models were not designed to detect the transition period prior to PAF episodes exclusively. Instead, they were designed to evaluate the likelihood that PAF episodes occur within a 24-h period. Thus, our proposed method is less sensitive to the continuity of the ECG data and allows ECG recordings to be sampled over the whole day. Two samples per time period gave results that were as accurate as those of four samples per period across the four 6-h time periods (Table 4). Furthermore, our results indicated that the proposed methods tended to show an increased likelihood of detecting PAF cases even during the days when no PAF episodes were evident (Figure 5). Therefore, we conclude that our intermittent sampling strategy is as accurate for predicting recurrent PAF as previously reported methods. The intermittent sampling approach might be more effective in mobile healthcare settings because the sensor may not need to be worn all day, which could reduce many problems caused by poor sensor tolerance, limited battery life, and high data transmission costs of current technology.
In general, mobile healthcare technology offers many attractive features such as convenient wearable ECG sensors, real-time feedback of abnormal heart rhythms, and timely intervention in the case of adverse events. However, the continuous measurement of ECG signals might not be ideal as a long-term monitoring solution in mobile healthcare settings because it requires long hours of wearing a sensor that may stigmatize a majority of patients and eventually influence the quality of the data. For example, Holter monitoring was regarded as inconvenient because of hygienic aspects, physical activity, night sleep, and skin reactions (Fensli & Boisen, 2009). Based on our understanding of current advances in low-power bioelectronics (Sarpeshkar, 2010), the proposed intermittent sampling of ECG signals is believed to provide an attractive alternative strategy to long-term monitoring in mobile healthcare settings. It could be specially adapted to work with wireless devices such as wearable sensors and gateway devices that consume battery power at high rates. In addition, the amount of data traffic would be minimized, which is also attractive in countries where wireless data transfer is costly. Thus, in developing a computer-aided prediction method for mobile healthcare settings, it seems important to consider human factors, such as patient acceptance of procedures, and device factors, such as battery time or data transfer costs.
It is becoming evident that a significant proportion of cryptogenic strokes is due to intermittent AF. By using 30-day cardiac event monitors, 20% of such strokes were found to be related to AF (Elijovich et al., 2009). Warfarin treatment was given to patients after the detection of intermittent AF (despite no detection of AF on ECG or in-patient telemetry monitoring in the majority of patients). As with the detection of recurrent PAF, the prevention of recurrent strokes related to PAF in particular may require long-term ECG monitoring. For these applications, our proposed intermittent sampling method should also be suitable as an initial screening method that generates a real-time alarm or trend report enabling timely intervention. For example, if the incidence of abnormal segments increases, then the ECG sampling strategy may be changed to a continuous monitoring mode to capture the PAF episodes. In this way, patients can be monitored in the long term to determine recurrence after conversion treatment or the origin of strokes, so that conventional Holter monitoring can be complemented or improved.
For initial screening purposes, the ECG sensor used in this study can be replaced by a heartbeat sensor because all analytic features were calculated from RR interval data rather than the morphology of ECG signals. Current advances in microelectronics have provided a variety of heartbeat sensors, ranging from the conventional chest-belt type to Doppler radar-based non-contact types. These sensors are usually equipped with a wireless data communication module so that they can transmit the signals to a gateway device connected to the internet. Since the patient may experience an arrhythmic episode during physical activity, mobile solutions may enhance the quality of the measured data by enabling the patient to carry out normal daily routines.
Currently, we are also implementing the CR analysis of HRV features and its related PAF prediction. This prototype system should help us to discover the "real problem" and the users' requirements, demonstrate the actual functionality of a device, and provide many insights on how to design and build a more advanced system that should enable long-term ECG monitoring. The future system is being designed to provide additional benefits for stroke or heart disease rehabilitation patients.
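The mode switch suggested in this discussion, moving from intermittent sampling to continuous monitoring when abnormal segments become more frequent, could be prototyped along the following lines. This is a minimal sketch: the class name `SamplingController`, the window length, and the switching fraction are all hypothetical choices, not parameters reported in the study.

```python
from collections import deque


class SamplingController:
    """Track recent segment labels and choose an ECG sampling mode."""

    def __init__(self, window=8, switch_fraction=0.5):
        # window and switch_fraction are illustrative assumptions.
        self.recent = deque(maxlen=window)
        self.switch_fraction = switch_fraction
        self.mode = "intermittent"

    def update(self, segment_is_abnormal):
        """Record one classified segment and return the current mode."""
        self.recent.append(bool(segment_is_abnormal))
        # Decide only once the window is full to avoid early false alarms.
        if len(self.recent) == self.recent.maxlen:
            fraction = sum(self.recent) / len(self.recent)
            if fraction >= self.switch_fraction:
                self.mode = "continuous"
            else:
                self.mode = "intermittent"
        return self.mode


controller = SamplingController(window=4, switch_fraction=0.5)
for label in [False, False, True, True, False, False, False]:
    print(controller.update(label))
```

A sliding window keeps the decision local in time, so the device can fall back to intermittent sampling, and to its lower battery and data costs, once abnormal segments become rare again.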

Acknowledgement
We would like to express our sincere gratitude to Mrs. Young Mi Choi, who participated in patient recruitment and data retrieval, Ms. Yoon Ju Na, who developed the batch-run software, and Dr. Seung Hwan Kim, who initially suggested this collaboration. This work was partly supported by the Science and Technology Fundamental Frontier Research fund at KAIST ICC, Korea.

Fig. 2. The Poincaré plot of island pattern (A), its binary plot (B), opening by reconstruction and closing (C), and cluster boundary overlaid onto the binary plot (D). (Permission from Duong et al., 2009).

Fig. 4. Circadian rhythms of selected HRV features were significantly different between NSR and PAF cases. HR: heart rate per min, LF: low-frequency area (msec²) after logarithmic transform, and rRR: autocorrelation function of the RR intervals.

Fig. 5. Distribution of probability values from the segment-based model applied to two patient cases. Data points in the hollow circle are from a normal case (NSR). Data points in the solid square are from a PAF case on the day when PAF episodes were recorded, whereas those in the hollow square and triangle are from the same case during the previous and following days when no PAF episodes were recorded. The points in the area under the cut-off at 0.25 (broken line) were classified as PAF.

Fig. 7. Real-time heartbeat analysis system for remote monitoring of ECG and HRV of patients. Heart rates, energy expenditure, physical activity, HRV (RMSSD), and abnormal heartbeats (Abnormality %) can be monitored by remote clinical staff in real time while a patient is exercising outdoors during daily activity. In the case of adverse events, user location data from the global positioning system can be provided to paramedic staff (the user's current position is indicated by the avatar on the web-based map). (Cardiomobile is the trademark of Alivetec Technologies Pty. Ltd. The server software was kindly provided to us for modification.)

Flow diagram for data processing and analysis used in the study (panel label: Learning on 24-h NSR and PAF cases).
The calculated CR parameters were compared between NSR and PAF groups based on median values by the Mann-Whitney test (Statgraphics Plus V 4.1). The results were considered significantly different if p-values were less than 0.05.

Table 2. Performance of different classification algorithms for the segment-based classification using 30-min ECG records.

Table 4. Performance of proposed methods for predicting PAF subjects using a subject-based rule and a CR-based classification algorithm. Performance results were obtained when two, three, or four 30-min ECG records were sampled from each of four time periods during a day.