Components of Soft Computing for Epileptic Seizure Prediction and Detection

Components of soft computing include machine learning, fuzzy logic, evolutionary computation, and probabilistic theory. These components can learn effectively from data and tolerate imprecision and uncertainty. They are needed for developing automated expert systems, which reduce human intervention in completing a task. Automated expert systems are developed to perform difficult jobs and are trained and tested using soft computing techniques. Such systems are required in all kinds of fields and are especially useful in medical diagnosis. This chapter describes the components of soft computing and reviews some analyses of EEG signal classification. From those analyses, this chapter concludes that the number of features extracted is important and that relevant features give the classifier better classification accuracy. A classifier with a suitable learning method can perform well in automated epileptic seizure detection systems. Further, decomposition of the EEG signal at level 4 is sufficient for seizure detection.


Introduction
The human brain contains billions of neurons which vibrate and generate oscillatory activity. This neural activity of the nervous system is studied through brainwaves. These waves are highly complex and can be recorded using a method called electroencephalography (EEG). An epileptic seizure is a symptom of abnormal, irregular, and excessive neuronal activity in the brain. The neuronal activity can be recorded by medical tests. Various methods are available for diagnosing brain diseases; among them, the electroencephalogram test is the one mainly used for diagnosing epilepsy. EEG includes different types of waveforms with different frequency, amplitude, and spatial distribution. The electrical activity of the brain differs with different stimuli and physiological variables. An EEG test can provide detailed information about the electrical activity of the brain at the time of testing. The neurologist recognizes the brain pattern from the EEG test results to diagnose epilepsy. Visual scanning of EEG recordings is time-consuming and can be inaccurate for detecting epilepsy [1]. Nowadays, computer-aided diagnosis (CAD) technology is used in hospitals; it cannot replace the doctor, but it can assist professionals in diagnosing the disease accurately. The main aim of CAD systems is to identify the disease in the early stages of its development. CAD support tools are developed using highly complex recognition techniques and machine learning algorithms. CAD systems are approved by the US Food and Drug Administration. They can reduce the false negative rate in the recognition of diseases. Recent research studies have found that CAD performs better in the clinical environment. Establishing CAD systems in medical practice involves some risk and complexity. Sometimes, the interpretation of the given data may not yield a 100% accurate result, so a CAD system provides only a second opinion to the physician.
In epileptic seizure detection especially, machine learning is very difficult because the brainwaves are hard to interpret. The patterns of brainwaves are unique to each individual. Since 1998, CAD tools have been useful for diagnosing disease. This does not mean that they are meant to make the diagnosis themselves, but an approved CAD system can provide accurate results. Early diagnosis of disease is very important for saving life. Different information can be extracted using medical imaging and signal technologies such as X-ray, computed tomography (CT), positron emission tomography (PET), single photon emission computed tomography (SPECT), magnetic resonance imaging (MRI), ultrasound, EEG, electrocardiography (ECG), and electromyography (EMG) for diagnosing diseases such as cancer and coronary artery, cardiovascular, and neurological disorders. CAD supports accurate diagnosis in the early stages of a chronic disease. Soft computing techniques are used in computer-aided diagnosis and computer-aided detection. In the early stages of computational approaches, problem-solving was carried out using conventional mathematics and specific analytical models [2].
The traditional way of computing would be less efficient for problem-solving. In the growth of computational science, researchers focus on soft computing in order to overcome the drawbacks of hard computing. Just like artificial

Components of soft computing
Machine learning, fuzzy logic, evolutionary computation, and probabilistic ideas are the main components of soft computing. The following sections give detailed descriptions of each component.

Machine learning
Problem-solving is a challenging task for intelligent entities. It has been proved that "a machine can learn new things." It can adapt to new situations and has the ability to learn from stored information. Machine learning techniques include artificial neural networks (ANNs), the perceptron, and the support vector machine (SVM), whereas evolutionary computation includes evolutionary algorithms, meta-heuristics, and swarm intelligence. Just like the human brain, a machine is capable of acquiring knowledge from data. Machine learning developed from the field of AI. In order to build intelligent machines, we need machine learning techniques. These techniques deal with huge data in minimum time. There are different types of machine learning methods. They are as follows:
• supervised learning;
• unsupervised learning; and
• reinforcement learning.
The supervised learning technique is used in the majority of analyses. In this technique, the system learns from training examples, whereas in unsupervised learning, the system is challenged to discover patterns directly from the given data. Classification and regression are two different supervised learning problems. The next section gives a detailed description of classification using EEG signals for epileptic seizure detection. Regression gives the statistical relationship between two or more variables. Association rule learning and clustering are major examples of unsupervised learning problems. Association rule learning is a rule-based machine learning method used to discover interesting relationships between variables in a huge database, whereas clustering discovers patterns from the groupings of the given data. Reinforcement learning is the third type of machine learning; it learns how to behave in an environment merely by interaction. It is a dynamic way of learning that works directly from the data, with no supervisor. Machine learning algorithms have the ability to learn from data and to make predictions and classifications for a model based on the sample inputs.
ANN is a technique composed of artificial neurons (processing units or elements) that mimics the function of the human brain, whereas SVM is based on the associative learning method and performs data classification. It separates the data into corresponding groups using hyperplanes. The perceptron and the support vector machine are very similar linear classifiers. A network with no hidden layers is called a single-layer perceptron. The back propagation algorithm and the perceptron are second-generation neural networks. Back propagation is a technique used to train the neural network in order to minimize the objective function. It can learn from mistakes. It looks for the minimum value of the error function in weight space.
The weight that minimizes the error function is then considered to be a solution for the learning problem. "It is a supervised learning method, and is a generalization of the delta rule or gradient descent" [2]. Neural networks can be classified as follows:
• single-layer neural network;
• multi-layer neural network; and
• competitive neural network.
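The search for the minimum of the error function in weight space can be illustrated with a minimal sketch. It assumes a single sigmoid neuron, a squared-error function, and the standard delta-rule update (weight change proportional to the negative error gradient). The function names `train_neuron` and `predict`, the learning rate, the epoch count, and the toy logical-OR data are all hypothetical choices made for illustration; they are not taken from the reviewed analyses.

```python
import math


def sigmoid(a):
    """Logistic output function of a neuron."""
    return 1.0 / (1.0 + math.exp(-a))


def predict(w, x):
    """Output of a single sigmoid neuron for inputs x and weights w."""
    activation = sum(xi * wi for xi, wi in zip(x, w))
    return sigmoid(activation)


def train_neuron(samples, eta=0.5, epochs=5000):
    """Gradient descent on the squared error (o - d)^2 of one neuron.

    samples: list of (inputs, desired_output) pairs; the last input of
    every sample is a constant 1.0 that plays the role of the bias.
    """
    w = [0.0] * len(samples[0][0])
    for _ in range(epochs):
        for x, d in samples:
            o = predict(w, x)
            # dE/dw_i = 2 * (o - d) * o * (1 - o) * x_i for a sigmoid neuron
            common = 2.0 * (o - d) * o * (1.0 - o)
            w = [wi - eta * common * xi for wi, xi in zip(w, x)]
    return w


# Toy, linearly separable problem (logical OR), bias input appended.
data = [([0, 0, 1], 0), ([0, 1, 1], 1), ([1, 0, 1], 1), ([1, 1, 1], 1)]
weights = train_neuron(data)
outputs = [round(predict(weights, x), 3) for x, _ in data]
```

After training, `predict(weights, x)` stays below 0.5 for the input (0, 0) and above 0.5 for the other three inputs, which is the sense in which the weights that minimize the error solve the learning problem.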
The back propagation algorithm works as follows. Each neuron in the neural network has an activation function with respect to the weights $w_{ij}$, defined as:

$$A_j(x, w) = \sum_{i} x_i w_{ij} \tag{1}$$

The sigmoid function with respect to the output function is defined as:

$$O_j(x, w) = \frac{1}{1 + e^{-A_j(x, w)}} \tag{2}$$

Therefore, the error function of each neuron in the output is defined as:

$$E_j(x, w, d) = \left(O_j(x, w) - d_j\right)^2 \tag{3}$$

where $d_j$ denotes the $j$th element of the desired response vector, and the sum of the errors in the output layer from all the neurons is defined as:

$$E(x, w, d) = \sum_{j} \left(O_j(x, w) - d_j\right)^2 \tag{4}$$

The overall error is reduced by using the gradient descent method. Using the delta rule, the weight adjustment in terms of the partial derivative of the error with respect to the weight is defined as:

$$\Delta w_{ij} = -\eta \frac{\partial E}{\partial w_{ij}} \tag{5}$$

where $\eta$ denotes the learning rate parameter. Eqs. (1) and (2) provide the dependency of the output on the weights as:

$$\frac{\partial O_j}{\partial w_{ij}} = \frac{\partial O_j}{\partial A_j} \frac{\partial A_j}{\partial w_{ij}} = O_j (1 - O_j)\, x_i \tag{6}$$

Also, the dependency of the error on the output is:

$$\frac{\partial E}{\partial O_j} = 2\,(O_j - d_j) \tag{7}$$

From (6) and (7):

$$\frac{\partial E}{\partial w_{ij}} = \frac{\partial E}{\partial O_j} \frac{\partial O_j}{\partial w_{ij}} = 2\,(O_j - d_j)\, O_j (1 - O_j)\, x_i \tag{8}$$

Therefore, the weight adjustment of each neuron (from (5) and (8)) is:

$$\Delta w_{ij} = -2\eta\,(O_j - d_j)\, O_j (1 - O_j)\, x_i \tag{9}$$

Feed forwarding the inputs, calculating the error, and propagating it back to the previous layers are the main steps of an ANN classifier. The error is identified as the difference between the desired response and the actual response of the network. Each classifier is based on some learning method. There are different types of learning methods such as error correction learning, memory-based learning, associative learning, neural net learning, genetic learning, etc.
SVM is based on the associative learning method. SVM has many advantages, and its performance is very competitive with other methods. A drawback is the problem complexity for large sample sizes. Special optimizers are used for optimization. Basically, SVM is a linear classifier that classifies two different classes (normal and seizure) efficiently. The features of the two classes are categorized by the labels "−1" and "+1." The features that are extracted from the signal are defined as:

$$D = \{(x_i, y_i)\}_{i=1}^{n}, \quad y_i \in \{-1, +1\} \tag{10}$$

where $y_i$ denotes the label related to the pattern $x_i$ and $n$ refers to the number of samples. The dot product, or scalar product, of the linear classifier is defined as:

$$w^{T} x = \sum_{i} w_i x_i \tag{11}$$

Eq. (11) in function form is:

$$f(x) = w^{T} x + b \tag{12}$$

where $w_i$ denotes the components of the weight vector and $b$ refers to the bias. For the case $b = 0$, the set of vectors satisfying $w^{T} x = 0$ forms a hyperplane through the origin, which divides the features into two classes. A kernel function allows the classifier to produce non-linear decision boundaries. Replacing the dot product of the normal SVM (linear kernel) with a kernel function defines a Gaussian radial basis function classifier, which is expressed as:

$$K(x_i, x_j) = \exp\left(-\frac{\lVert x_i - x_j \rVert^{2}}{2\sigma^{2}}\right) \tag{13}$$

The variables $x_i$ and $x_j$ represent two samples from the dataset. The default value of sigma is one and is applied to all the attributes in the dataset. The features are separated into two different classes with respect to their feature label. ANN and SVM are supervised learning methods, but they work in different ways. SVM with kernels is highly suitable for non-linear mapping functions. The classification process is important because a machine has to learn how to classify the data into groups [3].
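As a small illustration of the two ideas in this section, the sketch below evaluates a linear decision function with labels −1 and +1 and a Gaussian radial basis function kernel. It assumes the commonly used kernel form exp(−‖xi − xj‖² / (2σ²)) with σ = 1 as the default; the function names and the sample vectors are hypothetical and only serve the example.

```python
import math


def dot(u, v):
    """Scalar (dot) product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def linear_decision(w, x, b=0.0):
    """Label +1 or -1 depending on the side of the hyperplane w.x + b = 0."""
    return 1 if dot(w, x) + b >= 0 else -1


def rbf_kernel(xi, xj, sigma=1.0):
    """Gaussian RBF kernel (assumed form): exp(-||xi - xj||^2 / (2 sigma^2))."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-squared_distance / (2.0 * sigma ** 2))


# Hypothetical two-dimensional feature vectors.
w = [1.0, -1.0]
label_a = linear_decision(w, [2.0, 1.0])   # w.x = 1  -> +1
label_b = linear_decision(w, [1.0, 2.0])   # w.x = -1 -> -1
similarity_same = rbf_kernel([1.0, 2.0], [1.0, 2.0])   # identical points
similarity_far = rbf_kernel([0.0, 0.0], [3.0, 4.0])    # distance 5
```

The kernel value is 1 for identical samples and decays toward 0 as samples move apart, which is what lets a kernel SVM draw non-linear boundaries in the original feature space.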

Fuzzy logic
Machine learning, fuzzy logic, and evolutionary computation can be applied to any decision-making problem. Unlike Boolean logic, fuzzy logic is an approach that deals with a problem through degrees of truth that lie between 0 and 1. Fuzzy refers to vagueness. Boolean logic answers true or false to the question (Figure 1) "Is it raining?" but fuzzy logic gives a number in the range from 0 to 1. Here 1.0 represents absolute truth and 0.0 represents absolute falsehood. It is a logic for handling fuzziness. It was introduced in 1965 by Lotfi A. Zadeh. A fuzzy classifier is a classifier (algorithm) that uses fuzzy logic for classification and prediction problems. It is based on fuzzy sets (membership functions). The data-driven and trial-and-error (heuristic) approaches are two different approaches of fuzzy logic. An automated system can be designed using these approaches. Of the two, the data-driven approach is the more essential for the model to learn and update continuously. Fuzzy logic uses the trial-and-error approach in the tuning process to obtain a satisfactory result. It is a technique that can handle imprecise data and can also analyze crisp/standard data. The data-driven approach is similar to the event-driven approach and is well structured. In classification processes, appropriate features are required to train and test the system. The performance of the system depends on selecting apt features from the data for modeling the detection system. The heuristic method is not an optimal approach to problem-solving; it gives a satisfactory solution. Heuristics, hyper-heuristics, and meta-heuristics are commonly used with machine learning and optimization techniques. Most machine learning techniques are heuristic. A genetic algorithm or another optimization technique can be used to get an optimal solution for the given problem. The fuzzy if-then rule is the simplest form of fuzzy rule-based classifier.
Fuzzy if-then rule statements are a form of fuzzy logic. Any classifier that uses fuzzy logic is a fuzzy rule-based classifier. These classifiers are well suited to linear models of classification, whereas ANN can predict better on test data. Recently, deep learning has become a popular tool for prediction and detection processes. Fuzzy logic gives multi-valued answers, whereas in machine learning, the system learns from data, especially under the control of a supervisor [2].
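The difference between a Boolean answer and a fuzzy degree of truth for "Is it raining?" can be sketched as follows. The ramp-shaped membership functions, the thresholds (4 mm/h for rain, 40 km/h for wind), and the rule "IF raining AND windy THEN stay indoors" are hypothetical values chosen only for illustration; the minimum operator is one common choice for the fuzzy AND.

```python
def ramp(x, low, high):
    """Membership that rises linearly from 0 at `low` to 1 at `high`."""
    if x <= low:
        return 0.0
    if x >= high:
        return 1.0
    return (x - low) / (high - low)


def raining_degree(mm_per_hour):
    """Degree of truth of 'it is raining' (assumed: full truth at 4 mm/h)."""
    return ramp(mm_per_hour, 0.0, 4.0)


def windy_degree(km_per_hour):
    """Degree of truth of 'it is windy' (assumed: full truth at 40 km/h)."""
    return ramp(km_per_hour, 0.0, 40.0)


def boolean_raining(mm_per_hour):
    """Crisp (Boolean) answer to the same question."""
    return mm_per_hour > 0.0


def stay_indoors_strength(mm_per_hour, km_per_hour):
    """Firing strength of: IF raining AND windy THEN stay indoors."""
    return min(raining_degree(mm_per_hour), windy_degree(km_per_hour))


drizzle = raining_degree(2.0)                    # partial truth
crisp = boolean_raining(2.0)                     # plain True
rule_strength = stay_indoors_strength(2.0, 30.0)
```

A drizzle of 2 mm/h is simply "true" in Boolean logic but only half true under this membership function, and the rule fires with the weaker of its two conditions.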

Evolutionary computation
Evolutionary computation (EC) is a subdiscipline of AI and soft computing. In computational intelligence, evolutionary algorithms are inspired by biological systems and give optimal solutions for problems. Meta-heuristics and swarm intelligence may also yield sufficiently good solutions for an optimization problem. EC is a computational intelligence method that covers many optimization techniques for problem-solving. Its algorithms are inspired by biological evolution and can give highly optimized solutions for many kinds of problems. Ant colony optimization, genetic algorithm (GA), genetic programming, self-organizing maps, competitive learning, and swarm intelligence are some examples of EC techniques. The genetic algorithm is a technique used for optimization in problem-solving in various fields. It is derived from natural genetic systems. It gives accurate results, exhibits robustness, and produces an optimal solution for the problem. In computational intelligence, the application program differs among problems in various fields. GA starts with the production of the initial chromosomes in the population. Chromosomes are strings of binary digits representing the control parameters in the coding of the given problem. As in natural reproduction systems, crossover and mutation take place to generate a new population. Fitness is evaluated in successive iterations called generations. After several generations, GA selects the best chromosome using probabilistic transition rules and obtains the optimal or near-optimal solution to the problem. In the automated epileptic seizure detection problem, the genetic algorithm is used for feature selection. Selecting relevant features is important for the performance of the system [2].
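The GA steps described above (binary chromosomes, crossover, mutation, fitness evaluation over generations) can be sketched for feature selection as follows. This is a toy illustration and not the procedure used in the reviewed analyses: the fitness function simply rewards a hypothetical set of "relevant" feature indices, whereas a real system would score each feature subset by classifier accuracy. Tournament selection with elitism, the population size, and the mutation rate are also assumptions.

```python
import random

N_FEATURES = 16
# Hypothetical indices of useful features, used only by the toy fitness.
RELEVANT = {0, 3, 5, 6, 9, 11, 12, 15}


def fitness(chromosome):
    """Toy fitness: +1 per relevant feature kept, -0.5 per irrelevant one."""
    hits = sum(1 for i, bit in enumerate(chromosome) if bit and i in RELEVANT)
    extras = sum(1 for i, bit in enumerate(chromosome) if bit and i not in RELEVANT)
    return hits - 0.5 * extras


def crossover(parent_a, parent_b, rng):
    """Single-point crossover of two binary chromosomes."""
    point = rng.randint(1, len(parent_a) - 1)
    return parent_a[:point] + parent_b[point:]


def mutate(chromosome, rng, rate=0.05):
    """Flip each bit independently with probability `rate`."""
    return [1 - bit if rng.random() < rate else bit for bit in chromosome]


def tournament(population, rng, k=3):
    """Probabilistic selection: best of k randomly drawn chromosomes."""
    return max(rng.sample(population, k), key=fitness)


def run_ga(pop_size=30, generations=60, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(N_FEATURES)]
                  for _ in range(pop_size)]
    for _ in range(generations):
        children = [max(population, key=fitness)]  # elitism keeps the best
        while len(children) < pop_size:
            child = crossover(tournament(population, rng),
                              tournament(population, rng), rng)
            children.append(mutate(child, rng))
        population = children
    return max(population, key=fitness)


best = run_ga()
selected_features = [i for i, bit in enumerate(best) if bit]
```

Each bit of a chromosome marks whether a feature is kept, so `selected_features` is the subset that would be passed on to the classifier.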

Probabilistic ideas
Both probabilistic ideas and logic are used in probabilistic reasoning in order to handle uncertain situations. Most problems use probability and statistics. "Clean data is greater than more data." A machine learns from data, so the quality of data is more important than the quantity of data. Bayesian analysis is one of the most important approaches to probabilistic reasoning. Uncertainty is the situation in which information is unknown or imperfect. Bayesian inference is statistical inference based on Bayes' theorem that can be used for accurate prediction. It is very useful when the available data are insufficient for solving the problem. Data analysis is a procedure for evaluating data gathered from various sources. Soft computing techniques play a challenging part in data analysis. For example, data mining techniques are used especially for discovering new information from a huge database, whereas soft computing techniques mimic the processes of the human brain in order to find effective solutions for NP-complete problems.
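A short worked example may help show how Bayes' theorem updates a belief under uncertainty. The sketch below computes the probability of a condition given a positive test result; the prior, sensitivity, and specificity values are hypothetical numbers chosen for illustration, not results from any study.

```python
def posterior(prior, sensitivity, specificity):
    """P(condition | positive test) using Bayes' theorem.

    prior:       P(condition) before the test
    sensitivity: P(positive | condition)
    specificity: P(negative | no condition)
    """
    true_positive = sensitivity * prior
    false_positive = (1.0 - specificity) * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Hypothetical numbers: 1% prior, 90% sensitivity, 95% specificity.
p = posterior(prior=0.01, sensitivity=0.90, specificity=0.95)
```

With these assumed numbers, `p` is about 0.154: even a fairly accurate test leaves considerable uncertainty when the condition is rare, which is why the prior matters in probabilistic reasoning.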

Epileptic seizure prediction and detection
There is a link between data analysis and soft computing. Data may be qualitative or quantitative. Quantitative data can give exact solution for the problem. The data are pre-processed once they have been collected. The raw data are transformed effectively for the purpose of analysis in the pre-processing stage. Any type of data has to be initially pre-processed for analysis. The main principle of data pre-processing is to eliminate the irrelevant and redundant data (noise data) in order to get better detection accuracy of the system. In signal processing, the error is referred to as an artifact or noise. Unwanted information can be removed from the raw data using noise reduction. Different types of algorithms are available for data pre-processing. For example, in the case of EEG signal processing for epileptic seizure detection, artifacts can occur from physiological or mechanical sources. Respiratory, cardiac/pulse, eye movement, and electromyography signals are biological artifacts [4]. These artifacts should be recognized and eliminated for proper diagnosis. More than one variety of artifacts can appear in the recorded EEG. Preprocessing is the first step in classification and diagnostics where the artifacts have to be removed. After pre-processing, the signals are filtered and free from noise. These filtered signals are used for feature extraction process in the next step.

Feature extraction/selection and classification
The process that converts the huge set of samples into a set of features is called feature extraction, and feature selection is the process that filters out redundant or irrelevant features. These methods are used to reduce the dimension of the given data.
Data are important to build a machine learning model. The performance of the classifier depends on the given data. The noise must be removed from data. Classifier cannot separate the noise from data. Pre-processing is the process that is most important for removing noise. Analysis of EEG signals is important to diagnose epilepsy in clinical practice [3]. Fourier transform-based analysis is suitable for stationary signals. Studies have proved that EEG signals change over time and frequency components. Several time-frequency domain-based methods such as short time Fourier transform, discrete wavelet transform (DWT), and multiwavelet transform can be used to decompose the EEG signals [5]. Removing artifacts from the signal especially in biomedical applications is a challenging task, because it creates some signals and disturbs the epilepsy diagnosis. Pre-processing is the process to remove artifacts, and they can be extracted well by a method called independent component analysis (ICA) [6]. In order to reduce the dimension of the raw data and to find optimal solution, feature extraction process with kernel trick is frequently used [7]. Figure 2 explains the EEG signal classification.
In earlier days, reading and interpreting EEG signals was very difficult for a neurophysiologist. This drawback has been overcome by the latest computer technology. EEG is a non-stationary signal and is very difficult for an ordinary person to understand. For EEG signal analysis, features are extracted from the EEG vectors and appropriate features are selected for classification. Feature selection is a subset of feature extraction. The irrelevant and redundant features are eliminated for better performance of the system. Feature selection algorithms can be used to select appropriate features. The genetic algorithm is a suitable tool for feature selection. It can reduce the computing time and space required to run the algorithms. Filter methods, Pearson's correlation coefficient, mutual information, wrapper methods, and greedy forward search are some of the methods used to select features for classification. In machine learning, classification is the process of categorizing the data by training the machine with class labels. For example, labels like "Seizure" or "Normal" are used in the case of supervised learning. The clustering technique, also known as the grouping technique, relies on the inherent structure of the data in unsupervised learning and can handle unlabeled data. An algorithm that maps the data into a particular group is called a classifier.
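One of the filter methods named above, ranking features by Pearson's correlation coefficient with the class label, can be sketched as follows. The feature names and values are made up for illustration (label 1 stands for "Seizure", 0 for "Normal"), and ranking by the absolute correlation is an assumed convention.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if spread_x == 0 or spread_y == 0:
        return 0.0  # a constant column carries no information
    return cov / (spread_x * spread_y)


def rank_features(feature_columns, labels):
    """Feature names sorted by |correlation with the label|, best first."""
    scores = {name: abs(pearson(column, labels))
              for name, column in feature_columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical feature table: six signals, 0 = Normal, 1 = Seizure.
labels = [0, 0, 0, 1, 1, 1]
features = {
    "energy": [1.0, 1.2, 0.9, 3.1, 2.8, 3.3],   # tracks the label closely
    "noise":  [0.5, 0.1, 0.9, 0.4, 0.6, 0.5],   # unrelated to the label
}
ranking = rank_features(features, labels)
```

A filter of this kind scores each feature independently of any classifier, so it is cheap to run but ignores interactions between features, which wrapper methods and GA-based selection can take into account.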

Warning system in epilepsy
ECG and EEG data are used in seizure detection. Several electronic mobile applications are developed to track seizure information from the patient electronically. The information includes type of seizure, frequency, and duration. The application provides useful data for the epileptologist to treat epilepsy accurately. Already, many applications have been developed and are available on the market. Figure 3 represents the closed-loop warning system for epilepsy.
A new high tech bracelet developed by Netherlands scientists can detect 85% of all severe night time epilepsy seizures. Automated seizure detection methods can overcome some of the difficulties that occur from data collection, patient monitoring, and prediction modeling. Closed-loop system monitors the seizures and can detect, anticipate, and even respond to the real-time information from the patients. These systems have been used in emergency and intensive care settings of medical diagnosis [8].

Review of EEG signal analyses
B. Suguna Nanthini [3] carried out six different analyses for detecting seizures using EEG signals under the supervised learning method. The performance of the system in all the analyses is measured by the confusion matrix method. An EEG database available online (Bonn University Database) and real-time data from the EEG center, Coimbatore, India, are used for EEG signal classification analysis. The second database consists of EEG tests taken from 10 normal and seizure subjects for epileptic seizure detection. These signals are examined and used for binary classification as well as for validation. Set A (perfectly normal) and Set E (merely seizure) have been chosen from the online database. The first three analyses were carried out in the spatial domain and the next three in the frequency (wavelet) domain. In the first analysis [9], gray-level co-occurrence matrix (GLCM) features, namely contrast, correlation, energy, and homogeneity, are extracted from the EEG vectors. The system is trained to identify the exact group and tested for classification of data using an ANN classifier. The performance of the system is measured by the confusion matrix. The system achieves 85% accuracy. The same problem is examined with an SVM classifier in the second analysis [10]. The classifier achieves 90% accuracy for EEG signal classification. The computational complexity of analyses 1 and 2 is calculated and shown in the following Table 2.
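Since every analysis is scored with a confusion matrix, a minimal sketch of how accuracy, sensitivity, and specificity follow from its four counts may be useful. The label lists below are invented for illustration and do not reproduce the reported 85% or 90% results.

```python
def confusion_counts(actual, predicted, positive="Seizure"):
    """Return (tp, fp, tn, fn) for a binary problem."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p == positive and a == positive:
            tp += 1
        elif p == positive:
            fp += 1
        elif a == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def accuracy(tp, fp, tn, fn):
    return (tp + tn) / (tp + fp + tn + fn)


def sensitivity(tp, fn):
    """Fraction of true seizures that were detected."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Fraction of normal signals correctly left unflagged."""
    return tn / (tn + fp)


# Hypothetical test set of ten signals.
actual = ["Seizure"] * 5 + ["Normal"] * 5
predicted = ["Seizure", "Seizure", "Seizure", "Seizure", "Normal",
             "Normal", "Normal", "Normal", "Normal", "Seizure"]
tp, fp, tn, fn = confusion_counts(actual, predicted)
overall = accuracy(tp, fp, tn, fn)
```

In a seizure detection setting, sensitivity and specificity are usually read alongside accuracy, because a missed seizure and a false alarm have different costs.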
When the analyses use ANN and SVM classifiers, the space complexity depends on the number of training samples used in the classification process. In the third analysis [11], eight statistical features are added with GLCM features. The EEG signals are segmented and combinations of normal and seizure signals are used for classification process. In extraction process, eight statistical features and four GLCM features are extracted from each of the segmented signal. An SVM classifier with different kernels is used for seizure detection. The computation complexity of analysis 3 is calculated and presented in the following Table 3. The complexity of the model depends on k-fold cross-validation method. The system executes the same learning algorithm k times. It takes different training sets of size (k−1)/k times the size of the original data. In the execution step, each sample is evaluated (k−1) times. The space complexity of the analysis for RBF kernel is (Number of samples) ^2*(Number of features) and for linear kernel is (Number of samples) * (Number of features). ANN with back propagation algorithm [9] and SVM with linear kernel have achieved almost similar results.
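The k-fold procedure mentioned above can be sketched as an index generator. The function name `k_fold_indices` and the choice of contiguous, unshuffled folds are assumptions made to keep the example short; in practice the samples are usually shuffled first.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    # Distribute any remainder over the first folds.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


folds = list(k_fold_indices(n_samples=10, k=5))

# Count how often each sample is used for training and for testing.
train_uses = [sum(i in train for train, _ in folds) for i in range(10)]
test_uses = [sum(i in test for _, test in folds) for i in range(10)]
```

With 10 samples and k = 5, each training set holds 8 samples, that is (k−1)/k of the data, and every sample is used for training k−1 = 4 times and for testing once, so the learning algorithm runs k times in total.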
EEG signals are non-stationary and can be analyzed better through the wavelet transform. Different types of wavelets are available to decompose the signal. The challenging part is to select a suitable wavelet and the level of decomposition of the signal. In the fourth analysis [12], statistical features, namely mean, median, mode, standard deviation, skewness, and kurtosis, and four GLCM features are extracted from the EEG signal. The performance of the system is measured to select a suitable classifier for seizure detection. ANN and SVM are the two classifiers used in the fourth analysis. The wavelets db1, db2, and haar are used for signal decomposition. The signal is decomposed up to level 3.
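A multi-level decomposition with the Haar wavelet (which coincides with db1) can be sketched in a few lines. This is a plain orthonormal Haar transform written for illustration, not the toolbox routine used in the analyses; the eight-sample signal is made up, and the sketch assumes the signal length is divisible by 2 to the power of the decomposition level.

```python
import math


def haar_step(signal):
    """One Haar level: pairwise scaled sums (approximation) and differences (detail)."""
    root2 = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / root2 for i in range(0, len(signal), 2)]
    return approx, detail


def haar_dwt(signal, level):
    """Decompose `signal` up to `level`.

    Returns (approximation, details), where details[0] holds the level-1
    (finest) coefficients and details[-1] the coarsest.
    """
    if len(signal) % (2 ** level) != 0:
        raise ValueError("signal length must be divisible by 2**level")
    approx = list(signal)
    details = []
    for _ in range(level):
        approx, detail = haar_step(approx)
        details.append(detail)
    return approx, details


# Hypothetical eight-sample signal decomposed up to level 3.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, details = haar_dwt(signal, level=3)
subband_lengths = [len(d) for d in details]
```

Each level halves the number of approximation coefficients and halves the frequency range they cover, so statistical or GLCM features can then be computed separately on every sub-band.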

Significance of the analysis
1. Statistical and GLCM features are used to examine the EEG signals separately and further they are combined together as an input to the classifier.
3. On comparison of features (statistical, GLCM, and their combination), wavelets (db1, db2, and haar), and classifiers (ANN and SVM), the analysis concluded that the combination of statistical and GLCM features using the SVM classifier gives the best outcome.
To extract maximum information from the EEG signal, entropy features are used in the fifth analysis [13]. There are different types of entropies. In this analysis, Shannon, Renyi, and Tsallis entropies are extracted from the EEG signals. On comparison of the entropy features, the analysis concluded that Renyi entropy achieves a successful result. Instead of using only statistical features over the wavelet coefficients, this analysis examines the EEG signals through entropy values obtained at different orders for classification. Compared with the existing work, this research uses the extended versions of Shannon entropy, namely Renyi and Tsallis, to extract the maximum information from each EEG signal vector in terms of probability events. In the sixth analysis [14], EEG signals are examined by combining all the features from the previous analyses. Altogether, 16 features from the methods, namely GLCM, statistical, and Renyi entropy features, are extracted from the raw EEG and its subbands. DWT (db2) is used for decomposition of the signal at level 4. The approximation and detail coefficients are analyzed individually with 16 and 8 features, respectively. Genetic algorithm is used for selecting 8 appropriate features. The SVM is used as the classifier for seizure detection. Accuracies from the 16- and 8-dimension feature sets are compared, and it is concluded that relevant features can give better accuracy. Moreover, level 4 is enough for decomposing the signal because the lower frequencies, namely delta and theta, can be obtained at level 4 of decomposition. Seizures are mostly identified at lower frequencies, so level 4 is sufficient for decomposition of the EEG signal. Further, the time to execute the algorithm is reduced and less memory space is needed for the storage of data parameters. The complexity of this EEG signal analysis is calculated and presented in the following Table 4.
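The three entropy measures compared in the fifth analysis can be sketched from their standard definitions. The sketch assumes the probabilities come from a simple histogram of the signal values; the bin count, the orders (alpha and q), the use of base-2 logarithms for Shannon and Renyi entropy, and the function names are illustrative assumptions rather than the settings of the reviewed work.

```python
import math


def histogram_probabilities(values, bins=8):
    """Estimate a probability distribution from equal-width histogram bins."""
    low, high = min(values), max(values)
    if high == low:
        return [1.0]
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        index = min(int((v - low) / width), bins - 1)  # clamp the maximum value
        counts[index] += 1
    return [c / len(values) for c in counts]


def shannon_entropy(p):
    """H = -sum p_i log2 p_i (in bits)."""
    return -sum(pi * math.log2(pi) for pi in p if pi > 0)


def renyi_entropy(p, alpha):
    """H_alpha = log2(sum p_i^alpha) / (1 - alpha); Shannon in the limit alpha -> 1."""
    if alpha == 1:
        return shannon_entropy(p)
    return math.log2(sum(pi ** alpha for pi in p if pi > 0)) / (1.0 - alpha)


def tsallis_entropy(p, q):
    """S_q = (1 - sum p_i^q) / (q - 1); natural-log Shannon in the limit q -> 1."""
    if q == 1:
        return -sum(pi * math.log(pi) for pi in p if pi > 0)
    return (1.0 - sum(pi ** q for pi in p if pi > 0)) / (q - 1.0)


# A uniform distribution over four events as a sanity check.
uniform = [0.25, 0.25, 0.25, 0.25]
h_shannon = shannon_entropy(uniform)
h_renyi = renyi_entropy(uniform, alpha=2)
s_tsallis = tsallis_entropy(uniform, q=2)

# Entropy features of a made-up signal segment.
segment = [0.1, 0.4, 0.35, 0.8, 0.75, 0.2, 0.9, 0.5, 0.45, 0.6]
probabilities = histogram_probabilities(segment, bins=4)
segment_renyi = renyi_entropy(probabilities, alpha=2)
```

For the uniform case, the Shannon and Renyi entropies both equal 2 bits and the Tsallis entropy of order 2 equals 0.75; varying the order changes how strongly rare and frequent amplitude values are weighted, which is the "different orders" idea referred to above.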
The summary and time complexity of the analyses are shown in Tables 5 and 6. From the calculations, the analyses prove that the performance of SVM with significant features is good when compared with ANN using a large number of features as the input. The major contributions of these analyses in view of the existing work are as follows:

Conclusion
An epileptic seizure is a symptom of abnormal, irregular, and excessive neuronal activity in the brain. The EEG test is mainly used for diagnosing epilepsy. EEG includes different types of waveforms with different frequency, amplitude, and spatial distribution. Traditional ways of computing are less efficient for problem-solving, but soft computing methods can work efficiently to discover solutions from the given data. Components of soft computing are essential for developing automated expert systems. Early diagnosis of disease can save the life of a person. An approved CAD system is able to provide accurate results. Problem-solving is a challenging task for intelligent entities. It has been proved that "a machine can learn new things." It can adapt to new situations and has the ability to learn from stored information. The supervised learning technique is used in the majority of analyses. Fuzzy logic gives multi-valued answers, whereas in machine learning, the system learns from data, especially under the control of a supervisor. In computational intelligence, evolutionary algorithms are inspired by biological systems and give an optimal solution for the problem. "Clean data are greater than more data." A machine learns from data, so the quality of data is more important than the quantity of data. This chapter gave an introduction to the components of soft computing and to classification in machine learning. From the review of analyses, this chapter concludes that fewer, relevant features can make the classifier perform well. Accuracies were compared across all decomposed signals, showing that level 4 of decomposition is enough for EEG signal classification. At level 4, the lower frequencies (delta and theta) can be analyzed well because seizures occur mostly at lower frequencies. The analyses also show that less time and less memory space for data parameters are required.

Conflict of interest
There are no conflicts of interest.