This chapter presents the theoretical principles of fractal dimension analysis applied to a brain-computer interface (BCI). Fractal geometry is a mathematical tool for dealing with complex systems, and methods of estimating the fractal dimension have been widely used to describe objects in space, having proved useful for the analysis of biological data. The fractal dimension (FD) is one of the most popular fractal features. The term waveform refers to the shape of a wave, usually drawn as instantaneous values of a periodic quantity versus time; any waveform can be regarded as an infinite series of points. Aside from classical methods such as moment statistics and regression analysis, properties such as the Kolmogorov-Sinai entropy, the approximate entropy, and the FD have been proposed to deal with the problem of pattern analysis of waveforms. The FD can convey information on spatial extent, self-similarity, and self-affinity. Unfortunately, although precise methods to determine the FD have been proposed [5,6], their usefulness is severely limited because they are computationally intensive and time consuming to evaluate. The FD is relatively insensitive to data scaling and shows a strong correlation with human movement in EEG data [7-9]. Time series with a fractal nature can be described by functions of fractional Brownian motion (fBm), for which the FD can easily be set. Waveform FD values indicate the complexity of a pattern, or the quantity of information embodied in a waveform pattern, in terms of morphology, spectra, and variance. In this chapter, we investigate the six most popular algorithms as feature patterns for a BCI; these algorithms follow time-domain and frequency-domain approaches. Many methods of evaluating FDs have been developed, and since the existing algorithms present different characteristics, the optimal choice with respect to the requirements of the BCI application should be considered carefully.
2. Experimental paradigm
In this study, we tested ten subjects (seven male, three female). Their ages were 21-32 years (mean age, 25.3 years; SD ±3.6), with 180 trials per subject (30 trials per task). At the beginning of each EEG recording session, the subjects were instructed to relax with their eyes open for 30 s; the EEG data from this period were used as the baseline for the tasks. The duration of one trial was 6 s throughout the experiment. Each trial was divided into two periods: a relaxing period of 0-3 s and an imaging period of 3-6 s. When the subjects had their eyes open on a blank screen, they were in a relaxed state of mind (relaxing period). In the imaging period, the subjects performed the imagery with eyes open, cued by arrows. This study focuses on the motor movement functions of the brain; therefore, imagination of motor movements is used as the task throughout this experiment. To help the subjects imagine the tasks without difficulty, we provided acting stimuli in accordance with the given tasks. Explicit indicators were also shown on a monitor to inform the subjects during the experiment. All selected tasks are listed in Table 1.
3. Fractal analysis and methods
In this chapter, the six most popular algorithms in time-domain and frequency-domain analysis are addressed: the box-counting method (BCM), Higuchi's method (HM), the variance fractal dimension (VFD) method, detrended fluctuation analysis (DFA), power spectral density analysis (PSDA), and the critical exponent method (CEM). To measure the fractality in short time intervals of time-sequential data, moving sequentially from one end of the waveform to the other, we observed the dynamical changes in the FDs with respect to time, i.e., the time-dependent fractal dimensions (TDFD). For the classification process, two new feature parameters based on the fractal dimension are presented: the Kullback-Leibler (K-L) divergence and the difference of expected values. In the experimental results, DFA was selected to evaluate fractal dimensions on the basis of TDFDs [11, 36], because we found that DFA provides fast computation and also presents reasonable accuracy and variability in comparison with the other algorithms.
3.1. Time series-based analysis
3.1.1. Box-Counting Method (BCM)
One of the most common methods for calculating the FD of a self-similar fractal is the box-counting method (BCM). The BCM considers a bounded set in Euclidean n-space that possesses the self-similar property; by covering the structure with boxes of radius r, the FD can be determined by

D = lim_{r -> 0} [ log N(r) / log(1/r) ]

with N(r) the number of covering boxes.
We then repeat this process with several different radii. To implement this algorithm, the number of contained boxes, N(r), is computed from the difference between the maximum and minimum amplitudes of the data divided by the current radius, as

N(r) = Σ_k ⌊ (max x_k − min x_k) / r ⌋
where N(r) is the total number of contained boxes, x_k denotes the samples of the EEG time series (of length N) in the k-th subdivision window, r is the radius, changed in steps of Δr, and ⌊·⌋ denotes the integer-part function. To obtain the FD, a least-squares linear fit of the plot of log N(r) versus log(1/r) is applied; the FD corresponds to the slope.
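As an illustration of this procedure, the following Python sketch counts amplitude boxes per subdivision window and reads the FD off a least-squares log-log fit. It is a minimal sketch of the waveform box-counting idea described above, not the chapter's own code; the function name, the extra `+ 1` box per window, and the choice of radii are our own assumptions.

```python
import numpy as np

def box_counting_fd(signal, radii):
    """Estimate the FD of a 1-D waveform by box counting.

    For each radius r, the time axis is split into windows of r samples
    and the number of r-sized boxes covering the amplitude range in each
    window is accumulated.  The FD is the slope of log N(r) vs log(1/r).
    """
    signal = np.asarray(signal, dtype=float)
    counts = []
    for r in radii:
        n_boxes = 0
        for start in range(0, len(signal) - r + 1, r):
            w = signal[start:start + r]
            # boxes of side r covering the amplitude span (at least one)
            n_boxes += int(np.ceil((w.max() - w.min()) / r)) + 1
        counts.append(n_boxes)
    log_inv_r = np.log(1.0 / np.asarray(radii, dtype=float))
    log_n = np.log(np.asarray(counts, dtype=float))
    slope, _ = np.polyfit(log_inv_r, log_n, 1)  # least-squares linear fit
    return slope
```

A straight line should come out with FD close to 1, which makes a convenient sanity check.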
3.1.2. Variance Fractal Dimension (VFD)
This method is determined by the Hurst exponent, H, whose calculation is derived from the properties of fBm data. The basic idea of the calculation is the power-law relationship between the variance of the amplitude increments of the input time series, produced by a dynamic process over time. The main advantage of the VFD is its support for real-time computation; we therefore also selected the VFD for estimating the FD of EEG data. The amplitude increments Δx over a time interval Δt adhere to the power-law relationship

Var[Δx] ∝ (Δt)^(2H)

so the Hurst exponent can be calculated from a log-log plot.
The variance in each window at stage k can be calculated as the sample variance of the amplitude increments within that window, Var_k = (1/(N_k − 1)) Σ_j (Δx_j − mean(Δx))², where N_k is the number of increments in the window.
A least-squares line is fitted to the plot of log Var[Δx] versus log Δt; the Hurst exponent is computed as H = s/2, where s is the obtained slope, and the FD can then be estimated as D = 2 − H.
The process of calculating the FD essentially involves segmenting the entire input time series into numerous subsequences (or windows). The window sizes are chosen over an integer range such that each window of a given size contains a number of smaller windows.
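The power-law steps above can be sketched as follows; this is our own minimal illustration (function name and lag choices assumed), computing the variance of the amplitude increments per lag and converting the log-log slope to H and then to the FD.

```python
import numpy as np

def variance_fd(signal, lags):
    """Variance fractal dimension: Var[x(t+dt) - x(t)] ~ dt^(2H), FD = 2 - H."""
    signal = np.asarray(signal, dtype=float)
    # log variance of amplitude increments for each time lag dt
    log_var = [np.log(np.var(signal[dt:] - signal[:-dt])) for dt in lags]
    slope, _ = np.polyfit(np.log(lags), log_var, 1)
    return 2.0 - slope / 2.0   # H = s / 2, then FD = 2 - H
```

For an ordinary Brownian motion (H = 0.5) the estimate should fall near FD = 1.5.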
3.1.3. Higuchi’s Method (HM)
The FD is another measure of data complexity, generally evaluated in phase space by means of the correlation dimension. Higuchi proposed an algorithm for estimating the FD directly in the time domain, without reconstructing the strange attractor. The HM also gives a reasonable estimate of the FD in the case of short time segments. Based on a given finite time series x(1), x(2), ..., x(N), new time series are constructed by the following equation:

X_k^m = { x(m), x(m + k), x(m + 2k), ..., x(m + ⌊(N − m)/k⌋·k) },  m = 1, 2, ..., k
where both m and k are integers indicating the initial time and the time interval, respectively. The length, L_m(k), of each curve is computed as

L_m(k) = (1/k) [ ( Σ_{i=1}^{⌊(N−m)/k⌋} | x(m + ik) − x(m + (i−1)k) | ) · (N − 1) / (⌊(N − m)/k⌋·k) ]
The length of the curve for time interval k, L(k), is computed as the average of the k curve lengths L_m(k). The algorithm assumes the relationship L(k) ∝ k^(−D); we therefore apply a least-squares fitted line of log L(k) versus log k, and the negative of the obtained slope gives the estimate of the FD, D.
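A compact sketch of Higuchi's procedure, assuming a 0-indexed series and our own function name; it builds the k subsampled curves, applies the normalisation factor, and takes the negative slope of the log-log fit.

```python
import numpy as np

def higuchi_fd(x, k_max):
    """Estimate the FD of a time series with Higuchi's method (0-indexed)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    log_k, log_l = [], []
    for k in range(1, k_max + 1):
        lengths = []
        for m in range(k):                       # k subsampled curves
            idx = np.arange(m, n, k)
            if len(idx) < 2:
                continue
            # curve length with Higuchi's normalisation factor
            dist = np.sum(np.abs(np.diff(x[idx])))
            norm = (n - 1) / ((len(idx) - 1) * k)
            lengths.append(dist * norm / k)
        log_k.append(np.log(k))
        log_l.append(np.log(np.mean(lengths)))
    slope, _ = np.polyfit(log_k, log_l, 1)
    return -slope                                # L(k) ~ k^(-D)
```

A straight line yields FD = 1 and uncorrelated noise yields an FD near 2, the two extremes of the measure.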
3.1.4. Detrended Fluctuation Analysis (DFA)
The idea of DFA was invented originally to investigate long-range dependence in coding and noncoding DNA nucleotide sequences. Owing to its simplicity of implementation, DFA has become one of the most important methods in the field of fractal analysis; it was therefore also applied to FD estimation in this study.
The input series is first integrated by taking the cumulative sum of its deviations from the mean. This integrated series is divided into non-overlapping intervals of length n. In each interval, a least-squares line is fitted to the data (representing the trend in that interval). The series is then locally detrended by subtracting the theoretical values given by the regression. For a given interval length n, the characteristic size of fluctuation F(n) for this integrated and detrended series is calculated by

F(n) = sqrt( (1/N) Σ_{t=1}^{N} [ y(t) − y_n(t) ]² )

where y(t) is the integrated series and y_n(t) is the local trend.
This computation is repeated over all possible interval lengths (in practice, the minimum length is around 10 and the maximum is half the input data length, which gives two adjacent intervals). In this experiment, we use power-of-two lengths for the input EEG data, so the interval lengths range over powers of two. A power-law relationship F(n) ∝ n^α is expected, where α is the slope of the double-logarithmic plot of F(n) versus n. As with the PSD exponent, α can then be converted into the Hurst exponent and thence the estimated FD.
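The DFA steps above can be sketched as follows. This is our own minimal illustration (function name and window sizes assumed); it returns the scaling exponent α, whose conversion to an FD depends on whether the input is treated as noise-like (FD ≈ 2 − α) or fBm-like (FD ≈ 3 − α).

```python
import numpy as np

def dfa_alpha(x, window_sizes):
    """Detrended fluctuation analysis: return the scaling exponent alpha.

    For noise-like (fGn) input, alpha ~ H and FD = 2 - alpha; for
    fBm-like input, alpha ~ H + 1 and FD = 3 - alpha.
    """
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())                    # integrated series
    fluct = []
    for n in window_sizes:
        f2 = []
        for start in range(0, len(y) - n + 1, n):  # non-overlapping windows
            seg = y[start:start + n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)  # local linear trend
            f2.append(np.mean((seg - trend) ** 2))
        fluct.append(np.sqrt(np.mean(f2)))
    alpha, _ = np.polyfit(np.log(window_sizes), np.log(fluct), 1)
    return alpha
```

White noise should give α close to 0.5 (hence FD ≈ 1.5 under the noise-like convention).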
3.2. Frequency-based analysis
3.2.1. Power Spectral Density Analysis (PSDA)
This method is widely used for assessing the fractal properties of time series. It is based on frequency-domain analysis and works on the basis of the 1/f-like power scaling, which can be obtained with the fast Fourier transform (FFT) algorithm. This is the power-law relationship of Mandelbrot and van Ness, which can be expressed as follows:

P(f) ∝ 1 / f^β
where f is the frequency, P(f) is the corresponding power at a given frequency, and β is the scaling exponent, estimated by calculating the negative slope of the linear fit relating log P(f) to log f. The PSDA was applied to the raw EEG data to classify the EEG characteristics. This study used an enhanced version of the PSDA that was initially proposed by Fougère and modified by Eke. To improve the stability of the spectral estimates, the raw EEG data of each individual trial were first detrended by subtracting the mean of the data series from each value, and then each value was multiplied by a parabolic window given as below:

W(j) = 1 − ( 2j/(N + 1) − 1 )²,  j = 1, 2, ..., N
Finally, the scaling exponent β was estimated by least-squares fitting in the log-log domain for frequencies lower than 1/8 of the maximum sampling rate, i.e., below 64 Hz in this study. The Hurst exponent can be determined as H = (β − 1)/2 for fBm-type data, and the estimated FD can be computed by the following equation: D = 2 − H.
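A minimal sketch of the PSDA pipeline described above (mean removal, parabolic windowing, FFT, low-frequency log-log fit); the function name and the fBm-type conversion β = 2H + 1 are our own assumptions.

```python
import numpy as np

def psda_fd(x, fs):
    """Spectral FD estimate from the power law P(f) ~ 1/f^beta.

    Mean removal and a parabolic window precede the FFT, and only
    frequencies up to fs/8 enter the fit.  Assumes fBm-type data,
    for which beta = 2H + 1 and FD = 2 - H.
    """
    x = np.asarray(x, dtype=float)
    n = len(x)
    x = x - x.mean()                                  # remove the mean
    j = np.arange(1, n + 1)
    w = 1.0 - (2.0 * j / (n + 1) - 1.0) ** 2          # parabolic window
    power = np.abs(np.fft.rfft(x * w)) ** 2
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)
    mask = (freqs > 0) & (freqs <= fs / 8.0)          # low-frequency fit
    beta = -np.polyfit(np.log(freqs[mask]), np.log(power[mask]), 1)[0]
    hurst = (beta - 1.0) / 2.0
    return 2.0 - hurst
```

For a Brownian motion (β ≈ 2, H ≈ 0.5) the estimate should land near FD = 1.5.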
3.2.2. Critical Exponent Method (CEM)
This method was initially proposed by Nakagawa. The CEM has been applied to physiological data analysis and feature extraction [22-24]. The PSD of the observed fBm data, P(f), is determined in the frequency domain from the same 1/f-like power law as above.
The moment of the PSD and its moment exponent are then determined,
where the upper frequency variable was normalized so that the lower bound of the integration region equals 1. Hence, taking the logarithm of the moment, its third-order derivative can be written down.
We then determine the critical value at which the expression in Eq. (18) equals zero. Finally, the FD can be estimated from this critical exponent.
Both the PSDA and the CEM use the FFT algorithm to transform the EEG time series into frequency components.
4. Performance assessment
Since short computation time and low variability are requirements of a BCI system, the optimal algorithm for evaluating the fractal dimension should be examined carefully. In this experiment, we selected two signals whose fractal dimension values can be set easily for the assessment. The output results help us determine which algorithm is best suited to serve as the feature extractor.
4.1. Assessing with artificially generated signals
The fBm signal B_H(t) is defined, following Mandelbrot and van Ness, by

B_H(t) = (1/Γ(H + 1/2)) { ∫_{−∞}^{0} [ (t − s)^{H − 1/2} − (−s)^{H − 1/2} ] dB(s) + ∫_{0}^{t} (t − s)^{H − 1/2} dB(s) }

where Γ(·) is the Gamma function, H is the Hurst exponent with 0 < H < 1, and B(t) is an ordinary Brownian motion (a stochastic process). To simulate the fBm signal, we can simply implement the above definition as a discrete time series at the n-th index using Riemann-type sums.
In the case H = 1/2, Eq. (4.22) reduces, and we then obtain an ordinary Brownian motion, i.e., a cumulative sum of Gaussian increments.
With the parameter values set for this experiment, the fractal dimension of this signal is given by D = 2 − H.
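A minimal Riemann-sum generator along these lines (our own function name and weighting convention; a Riemann-Liouville-type truncation that ignores the history integral) is:

```python
import math
import numpy as np

def fbm_riemann(n, hurst, seed=0):
    """Riemann-sum approximation of fBm (Riemann-Liouville type).

    B_H(k) ~ (1 / Gamma(H + 1/2)) * sum_{i<=k} (k - i + 1)^(H - 1/2) * xi_i,
    with xi_i i.i.d. standard Gaussian increments.  For H = 1/2 every
    weight equals 1, so the sum reduces to an ordinary Brownian motion
    (a cumulative sum of the noise), as stated in the text.
    """
    rng = np.random.default_rng(seed)
    xi = rng.standard_normal(n)
    coef = 1.0 / math.gamma(hurst + 0.5)
    out = np.empty(n)
    for j in range(n):
        # weights (j + 1 - i)^(H - 1/2) for increments i = 0 .. j
        weights = (j + 1 - np.arange(j + 1)) ** (hurst - 0.5)
        out[j] = coef * np.sum(weights * xi[:j + 1])
    return out
```

The H = 1/2 reduction can be checked directly: the generated series then coincides with a cumulative sum of the same Gaussian increments.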
4.2. Conditions for assessing
A summary of the selected parameter values in this experiment is shown in Table 2. The selection of signal lengths that are powers of two was motivated by the requirements of the frequency-based algorithms (PSDA and CEM). The output performance of each algorithm is shown in Fig. 1.
4.3. Results of performance assessing
In order to test the effect of signal length on the estimation, each algorithm was applied to the entire series, and the estimated fractal dimension values were then averaged for each change of length. Before utilizing these algorithms, the following points should be considered:
Computation time. Short computation time is a requirement of BCI applications. The fBm and Syn signals were processed over the entire series on a laptop with a 1.6 GHz Pentium processor and 512 MB of memory. Table 3 shows the average computation times.
Accuracy. To compare against the theoretical FD value (the true FD value), we used the mean-squared error (MSE), which can be defined by

MSE = (1/M) Σ_{i=1}^{M} ( D̂_i − D_true )²

where D̂_i is the i-th estimated FD, D_true is the theoretical value, and M is the number of estimates.
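A direct implementation of this error measure (function name assumed):

```python
import numpy as np

def fd_mse(fd_estimates, fd_true):
    """Mean-squared error between estimated FDs and the theoretical FD."""
    fd_estimates = np.asarray(fd_estimates, dtype=float)
    return float(np.mean((fd_estimates - fd_true) ** 2))
```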
| Algorithm | Parameter(s) |
| BCM | Step size of radius |
| VFD | Step size of window |
| HM | Maximum interval length |
| DFA | Step size of interval length |
| PSDA | FFT points; maximum frequency |
| CEM | FFT points; step size of moment exponent |
4.4. Time-Dependent Fractal Dimensions (TDFD)
One common measure of irregularity in fluctuating time-sequential data is the fractal dimension, which estimates the degrees of freedom of the fluctuations in a signal. To measure the fractality in short time intervals of time-sequential data, moving sequentially from one end of the waveform to the other, we may observe the dynamical changes in the fractal dimensions with respect to time. These fractal dimensions are termed the time-dependent fractal dimensions (TDFD).
In this study, the EEG signal was sampled at 512 Hz with a duration of 6 s (3072 points). We set the window function to a rectangular type with a window size of 512 points (1 s) and moved the window in intervals of 10 points (19.5 ms), because the temporal resolution of EEG is a millisecond or even better. The number of obtained points can be computed as (L − W)/I + 1, where L is the signal length, W is the window length (W ≤ L), and I is the interval; we thus obtained 257 points of fractal dimension. The time series-based algorithms of fractal dimension proved more effective than the frequency-based algorithms.
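The sliding-window computation can be sketched as follows; `fd_estimator` stands for any of the FD algorithms above, and the function name is our own.

```python
def tdfd(signal, fd_estimator, window=512, step=10):
    """Time-dependent fractal dimensions: slide a rectangular window over
    the series and estimate one FD per window position.

    The number of output points is (len(signal) - window) // step + 1;
    for the setup in the text, (3072 - 512) // 10 + 1 = 257.
    """
    return [fd_estimator(signal[s:s + window])
            for s in range(0, len(signal) - window + 1, step)]
```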
4.5. Evaluation of features by Kullback-Leibler divergence
In probability theory and information theory, the Kullback-Leibler divergence (K-L divergence) is a measure of the difference between two probability distributions: from an actual probability distribution P to an arbitrary probability distribution Q.
For probability distributions P and Q of a discrete random variable, it is defined to be

D_KL(P ‖ Q) = Σ_i P(i) log [ P(i) / Q(i) ]
Important properties of the K-L divergence are that it is always non-negative, D_KL(P ‖ Q) ≥ 0, with equality if and only if P = Q, and that it is not symmetric: in general D_KL(P ‖ Q) ≠ D_KL(Q ‖ P).
The difference of expected values between the imaging and relaxing periods is also proposed for use as a feature parameter together with the K-L divergence. In general, if X is a random variable defined on a probability space, then the difference of the expected values of two random variables X_P and X_Q can be defined by

ΔE = E[X_Q] − E[X_P]
In this study, we applied the K-L divergence and the difference of expected values to find the features distinguishing the relaxing and imaging periods. According to Eq. (28), P is regarded as the relaxing period and Q as the imaging period. The probability distributions can be approximated by normalizing histograms of the data. The output patterns of the K-L divergence versus the difference of expected values are shown in Fig. 4. These results can be classified by the neural network; further explanation of the classification process is given in chapter 5.
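A sketch of the two feature computations, assuming histogram-based probability estimates as described; the function names, bin count, and the clipping constant used to avoid log(0) are our own choices.

```python
import numpy as np

def kl_divergence(p, q, eps=1e-12):
    """Discrete K-L divergence D(P || Q) = sum_i p_i * log(p_i / q_i)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p, q = p / p.sum(), q / q.sum()                       # normalise
    p, q = np.clip(p, eps, None), np.clip(q, eps, None)   # avoid log(0)
    return float(np.sum(p * np.log(p / q)))

def period_features(fd_relax, fd_imagine, bins=20):
    """Feature pair for one trial: K-L divergence between the normalised
    histograms of the two periods, and the difference of their means."""
    lo = min(np.min(fd_relax), np.min(fd_imagine))
    hi = max(np.max(fd_relax), np.max(fd_imagine))
    p, _ = np.histogram(fd_relax, bins=bins, range=(lo, hi))
    q, _ = np.histogram(fd_imagine, bins=bins, range=(lo, hi))
    return kl_divergence(p, q), float(np.mean(fd_imagine) - np.mean(fd_relax))
```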
5. Neural network classifier
This study is concerned with the most common class of neural networks (NNs), trained by the backpropagation algorithm; backpropagation is an abbreviation of "backwards propagation of errors". The learning process is based on the Levenberg-Marquardt (LM) backpropagation method, in which the Hessian matrix can be approximated as

H = J^T J
and the gradient can be determined as

g = J^T e
where J is the Jacobian matrix containing the first derivatives of the network errors with respect to the weights and biases, and e is a vector of network errors.
The LM algorithm can be used to update the vector of weights, w, by

w_{k+1} = w_k − (J^T J + μI)^{−1} J^T e
where I is an identity matrix, μ is a learning rate, and k is the iteration step. This algorithm is particularly well suited for training NNs with an MSE cost; moreover, the LM algorithm has been shown to be much faster than standard backpropagation in a variety of applications.
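A single LM update can be sketched as follows (function name assumed); for a linear model with a tiny μ, one step recovers the least-squares solution, which is a convenient sanity check.

```python
import numpy as np

def lm_step(jacobian, errors, weights, mu):
    """One Levenberg-Marquardt update:
    w <- w - (J^T J + mu * I)^(-1) J^T e."""
    jt_j = jacobian.T @ jacobian                     # Hessian approximation
    grad = jacobian.T @ errors                       # gradient J^T e
    delta = np.linalg.solve(jt_j + mu * np.eye(len(weights)), grad)
    return weights - delta
```

Small μ makes the step close to a Gauss-Newton step; large μ makes it close to a small gradient-descent step, which is what gives LM its robustness.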
5.1. Cross-validation procedure
The feature parameters are expressed in terms of fractal properties. The NN was selected as the classifier; it adapts through self-learning and was trained to produce the correct classifications from a set of training examples via a cross-validation process. A tenfold cross-validation method was used to increase the reliability of the results. For each experiment, we divided all the data into three sets: the training set (50% of the entire data), the cross-validation set (20%), and the testing set (30%). The sets of desired values for network training, according to the tasks, are shown in Table 8. The MSE achieved versus the iteration step is shown in Fig. 6. In Table 9, we compare the proposed method with a conventional feature, namely band power estimation (BPE). The training process lets the network define its decision boundaries; Figs. 7 and 8 show tight decision boundaries after training the network.
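The 50/20/30 partition can be sketched as follows (function name and seeding are our own; with 180 trials per subject this yields 90/36/54 trials).

```python
import numpy as np

def split_indices(n_trials, seed=0):
    """Shuffle trial indices and split them 50% / 20% / 30% into training,
    cross-validation, and testing sets, as described in the text."""
    idx = np.random.default_rng(seed).permutation(n_trials)
    n_train = int(0.5 * n_trials)
    n_val = int(0.2 * n_trials)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```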
This chapter discusses the major contributions of this study. Twelve electrode channels were used to measure the EEG signal over the sensorimotor area of the cortex. The accuracy rates of the motor imagery tasks are satisfactory. The errors in this classification are due to ambiguous data between the relaxing and imaging periods, particularly when the subjects had already begun imagining the movements. Note that in this experiment the raw EEG signals were recorded throughout the period without subject training and without rejecting bad trials. Moreover, the signals can differ between individuals, which makes the EEG signal dependent on the measured subject: even if different persons perform the same tasks and the measurements are made under identical conditions, the resulting signals can differ.
The proposed method provides 2880 features for each subject, whereas the BPE method provides 21600 features. The features obtained from the proposed method present an easily practicable and reliable basis for training the classification algorithms. Nevertheless, the number of features from the BPE method can be reduced by changing the number of electrodes. We used the NN to classify the features processed by the BPE method; after training, the learning curves and decision boundaries of the BPE method are shown in Fig. 9. As the results show, the output decision functions for classifying features based on the BPE method do not perform well and cannot produce clear separation boundaries.
The six algorithms present specific properties in terms of capability and variability. Some algorithms appeared inapplicable to the EEG time series in this study, for example the PSDA and CEM algorithms. The main reason is that these two algorithms are based on frequency analysis: the Hurst exponent is estimated by means of the FFT, which is well suited to long time series but declines easily when the EEG data are analyzed as short time series. However, we can improve the variability and avoid such drawbacks by increasing the sampling rate over the same experimental period. In terms of variability, the HM algorithm presents the lowest average SD values. The DFA provides fast computation and also presents reasonable accuracy and variability in comparison with the others. The VFD gives extremely fast computation but has drawbacks in terms of accuracy and robustness; it is appropriate for real-time applications, since its computation time was less than 10 ms. Moreover, we suggest that the most appropriate algorithm depends on the purpose of the application; for instance, the HM algorithm is suited to evaluating the FD of EEG data with high precision, but it consumes far more processing time than the other algorithms. To show the performance of all algorithms, we compared all estimated FDs with the theoretical value, especially when computed over short intervals, i.e., 2^7, 2^8, and 2^9 points; these results are shown in Figs. 10 to 15.
By selecting the best-performing fractal algorithm, we can clearly improve the classification rate and develop novel methods in terms of fractal properties. The fractal features and the experimental framework can be applied not only to binary-command BCIs but also to multi-command BCIs. The measurement of the FD from EEG data is applicable not only to the particular tasks studied here but also to public datasets. The FD characterizes the self-affine property of EEG data and has a direct relation to the different tasks.