Using GRNN with k-means clustering.

## Abstract

Computational intelligence (CI) receives much attention in academia and industry due to its wide range of applications. CI includes fuzzy logic (FL), evolutionary algorithms (EA), expert systems (ES), and artificial neural networks (ANN), and many of these components are applied to the modeling and control of dynamic systems. FL mimics human reasoning by converting linguistic variables into a set of rules. EA are population-based metaheuristic algorithms that use evolutionary operations such as mutation, crossover, and selection to find an optimal solution to a given problem. ES are programmed with expert knowledge to make informed decisions in complex tasks. ANN model how neurons are connected in animal nervous systems; they learn from data, and because they have universal approximation abilities, they can be used to solve regression, classification, and forecasting problems. An ANN is made of interconnected layers of neurons: an input layer, one or more hidden layers, and an output layer.

### Keywords

- applications
- general regression
- neural networks
- dynamic systems

## 1. Introduction

Computational intelligence (CI) receives much attention in academia and industry due to its wide range of applications. CI includes fuzzy logic (FL), evolutionary algorithms (EA), expert systems (ES), and artificial neural networks (ANN), and many of these components are applied to the modeling and control of dynamic systems. FL mimics human reasoning by converting linguistic variables into a set of rules. EA are population-based metaheuristic algorithms that use evolutionary operations such as mutation, crossover, and selection to find an optimal solution to a given problem. ES are programmed with expert knowledge to make informed decisions in complex tasks. ANN model how neurons are connected in animal nervous systems; they learn from data, and because they have universal approximation abilities [1], they can be used to solve regression, classification, and forecasting problems.

An ANN is made of interconnected layers of neurons: an input layer, one or more hidden layers, and an output layer. ANN have two major types, as shown in Figure 1: the feed-forward neural network (FFNN) and the recurrent neural network (RNN). In an FFNN, data flow in one direction only, from the input layer through the hidden layer to the output layer, while in an RNN, data can also flow backward through feedback connections. In standard notation, the output of a single-hidden-layer FFNN can be written as

$$y = f_o\left(W_o\, f_h\left(W_h x + b_h\right) + b_o\right) \tag{1}$$

where $x$ is the input vector, $W_h$ and $b_h$ are the hidden-layer weights and biases, $W_o$ and $b_o$ are the output-layer weights and biases, and $f_h$ and $f_o$ are the hidden and output activation functions.

The output of a single-hidden-layer RNN with a recurrent hidden layer can be written as

$$h(t) = f_h\left(W_h x(t) + W_r h(t-1) + b_h\right), \qquad y(t) = f_o\left(W_o h(t) + b_o\right) \tag{2}$$

where $h(t)$ is the hidden-layer output at time step $t$ and $W_r$ contains the recurrent weights.

The training of neural networks involves modifying the neural network parameters to reduce a given error function. Gradient descent (GD) [2, 3] is the most common ANN training method:

$$w(k+1) = w(k) - \eta\, \frac{\partial E}{\partial w(k)} \tag{3}$$

where $w$ is a network parameter, $\eta$ is the learning rate, and $E$ is the error function, for example the sum of squared errors

$$E = \frac{1}{2} \sum_{i=1}^{n} \left(t_i - y_i\right)^2 \tag{4}$$

where $t_i$ is the target and $y_i$ is the network output for the $i$-th training sample.

## 2. General regression neural network (GRNN)

The general regression neural network (GRNN) is a single-pass neural network which uses a Gaussian activation function in the hidden layer [4]. GRNN consists of input, hidden, summation, and division layers.

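The regression of the random variable $y$ on the observed value $X$ of the random variable $x$ can be found using

$$E[y \mid X] = \frac{\int_{-\infty}^{\infty} y\, f(X, y)\, dy}{\int_{-\infty}^{\infty} f(X, y)\, dy} \tag{5}$$

where $f(X, y)$ is the known joint continuous probability density function of $x$ and $y$.

When $f(X, y)$ is unknown, it can be estimated from the $n$ training samples $(x_i, y_i)$ with a Parzen estimator using Gaussian kernels [4]:

$$\hat{f}(X, y) = \frac{1}{(2\pi)^{(p+1)/2}\, \sigma^{p+1}} \cdot \frac{1}{n} \sum_{i=1}^{n} \exp\left[-\frac{(X - x_i)^T (X - x_i)}{2\sigma^2}\right] \exp\left[-\frac{(y - y_i)^2}{2\sigma^2}\right] \tag{6}$$

where $n$ is the number of samples, $p$ is the dimension of $x$, and $\sigma$ is the smoothing parameter.

Substituting (6) into (5) leads to

$$\hat{y}(X) = \frac{\sum_{i=1}^{n} \exp\left[-\frac{D_i^2}{2\sigma^2}\right] \int_{-\infty}^{\infty} y \exp\left[-\frac{(y - y_i)^2}{2\sigma^2}\right] dy}{\sum_{i=1}^{n} \exp\left[-\frac{D_i^2}{2\sigma^2}\right] \int_{-\infty}^{\infty} \exp\left[-\frac{(y - y_i)^2}{2\sigma^2}\right] dy} \tag{7}$$

where $D_i^2 = (X - x_i)^T (X - x_i)$.

After solving the integration, the following will result:

$$\hat{y}(X) = \frac{\sum_{i=1}^{n} y_i \exp\left[-\frac{D_i^2}{2\sigma^2}\right]}{\sum_{i=1}^{n} \exp\left[-\frac{D_i^2}{2\sigma^2}\right]} \tag{8}$$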
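The integration step relies on two standard Gaussian integrals. As a brief worked step, with $y_i$ denoting the $i$-th training target and $\sigma$ the smoothing parameter (notation assumed here, following Specht's formulation [4]):

$$\int_{-\infty}^{\infty} \exp\left[-\frac{(y - y_i)^2}{2\sigma^2}\right] dy = \sqrt{2\pi}\,\sigma, \qquad \int_{-\infty}^{\infty} y \exp\left[-\frac{(y - y_i)^2}{2\sigma^2}\right] dy = \sqrt{2\pi}\,\sigma\, y_i$$

The common factor $\sqrt{2\pi}\,\sigma$ cancels between the numerator and the denominator, so only the targets $y_i$ and the input-space kernels remain in the final regression estimate.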

### 2.1. Previous studies

GRNN was used in different applications related to modeling, system identification, prediction, and control of dynamic systems including: feedback linearization controller [5], HVAC process identification and control [6], modeling and monitoring of batch processes [7], cooling load prediction for buildings [8], fault diagnosis of a building’s air handling unit [9], intelligent control [10], optimal control for variable-speed wind generation systems [11], annual power load forecasting model [12], vehicle sideslip angle estimation [13], fault diagnosis for methane sensors [14], fault detection of excavator’s hydraulic system [15], detection of time-varying inter-turn short circuit in a squirrel cage induction machine [16], system identification of nonlinear rotorcraft heave mode [17], and modeling of traveling wave ultrasonic motors [18].

Some significant modifications of GRNN include using fuzzy c-means clustering to cluster the input data of GRNN [19]; a modified GRNN which uses different types of Parzen estimators to estimate the density function of the regression [20]; a density-driven GRNN combining GRNN, density-dependent kernels, and regularization for function approximation [21]; using GRNN to model time-varying systems [22]; and adapting GRNN for modeling of dynamic plants [23] using different adaptation approaches, including modifying the training targets, adding a new pattern, and dynamic initialization of the smoothing parameter σ.

### 2.2. GRNN training algorithm

GRNN training is rather simple. The input weights are the training inputs transposed, and the output weights are the training targets. Since GRNN is an associative memory, the number of hidden neurons after training is equal to the number of training samples. However, this training procedure is not efficient if there are many training samples, so one of the suggested solutions is to apply a data dimensionality reduction technique such as clustering or principal component analysis (PCA). Another solution is to grow GRNN with an error-based algorithm [24], as explained in **Algorithm 1**. Before GRNN is trained with an input, the algorithm checks the prediction error of the current network for that input. If the prediction error without that input exceeds a chosen threshold, GRNN is trained with it; otherwise the input is skipped.
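The error-based growing idea can be illustrated with a minimal Python sketch. It is not the published Algorithm 1: the function names, the absolute-error test, the threshold `tol`, and the toy dataset are all assumptions made for illustration.

```python
import math


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


def grow_grnn(inputs, outputs, sigma, tol):
    """Keep a sample as a hidden neuron only if the current network mispredicts it."""
    patterns, targets = [inputs[0]], [outputs[0]]     # the first sample seeds the network
    for x, y in zip(inputs[1:], outputs[1:]):
        if abs(y - grnn_predict(x, patterns, targets, sigma)) > tol:
            patterns.append(x)                        # new input weight (pattern)
            targets.append(y)                         # new output weight (target)
    return patterns, targets


# Toy data (hypothetical): y = x^2 sampled on [0, 2].
xs = [[i / 50.0] for i in range(101)]
ys = [x[0] ** 2 for x in xs]
patterns, targets = grow_grnn(xs, ys, sigma=0.1, tol=0.05)
print(f"{len(xs)} samples -> {len(patterns)} hidden neurons")
```

A larger `tol` gives a smaller network at the cost of accuracy, which is the same trade-off the clustering and PCA approaches make.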

#### 2.2.1. Reducing data dimensionality using clustering

Clustering techniques can be used to reduce the data dimensionality (here, the number of training patterns) before feeding the data to the GRNN. k-means clustering is one of the most popular clustering techniques and is explained in **Algorithm 2**. Table 1 compares GRNN performance before and after applying the k-means algorithm. Although the training error increases, and the testing error increases for three of the four datasets, the network size is reduced by 92% or more.

| Dataset | Training MSE (after/before k-means) | Testing MSE (after/before k-means) | Size reduction (%) |
|---|---|---|---|
| Abalone | 0.0177/0.002 | 0.0141/0.006 | 99.76 |
| Building energy | 0.047/3.44e-05 | 0.0165/0.023 | 99.76 |
| Chemical sensor | 0.241/0.016 | 0.328/0.034 | 97.99 |
| Cholesterol | 0.050/4.605e-05 | 0.030/0.009 | 92 |

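The aim of the k-means algorithm is to minimize the distance objective function:

$$J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \left\lVert x_i - c_j \right\rVert^2 \tag{9}$$

where $k$ is the number of clusters, $C_j$ is the set of samples assigned to cluster $j$, and $c_j$ is the center of cluster $j$.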
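As a rough illustration, k-means lowers the sum of squared distances between the samples and their cluster centers by alternating an assignment step and a center-update step (Lloyd's algorithm). The sketch below then uses each center as a GRNN pattern and the mean target of its members as the matching output weight. That pairing, the function names, the random initialization, and the toy data are assumptions, not necessarily the procedure behind the reported results.

```python
import math
import random


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


def kmeans(points, k, iters=50, seed=0):
    """Lloyd's algorithm; returns (centers, labels)."""
    centers = [list(p) for p in random.Random(seed).sample(points, k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: attach each point to its nearest center.
        labels = [min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centers[j])))
                  for p in points]
        # Update step: move each center to the mean of its members.
        new_centers = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[j])    # an empty cluster keeps its center
        if new_centers == centers:                # converged
            break
        centers = new_centers
    return centers, labels


def reduce_with_kmeans(inputs, outputs, k):
    """Replace the training set by cluster centers and mean cluster targets."""
    centers, labels = kmeans(inputs, k)
    patterns, targets = [], []
    for j, center in enumerate(centers):
        ys_j = [y for y, lab in zip(outputs, labels) if lab == j]
        if ys_j:                                  # skip empty clusters
            patterns.append(center)
            targets.append(sum(ys_j) / len(ys_j))
    return patterns, targets


# Toy data (hypothetical): 200 samples of y = sin(x).
xs = [[i * 0.03] for i in range(200)]
ys = [math.sin(x[0]) for x in xs]
patterns, targets = reduce_with_kmeans(xs, ys, k=10)
reduction = 100.0 * (1 - len(patterns) / len(xs))
mse = sum((y - grnn_predict(x, patterns, targets, 0.3)) ** 2 for x, y in zip(xs, ys)) / len(xs)
print(f"{len(patterns)} hidden neurons, size reduction {reduction:.1f}%, training MSE {mse:.4f}")
```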

#### 2.2.2. Reducing data dimensionality using PCA

PCA can be used to reduce a large dataset into a smaller dataset which still carries most of the important information from the large dataset. In a mathematical sense, PCA converts a number of correlated variables into a smaller number of uncorrelated variables. The PCA algorithm is explained in **Algorithm 3**.

| Dataset | Training MSE (after/before PCA) | Testing MSE (after/before PCA) | Size reduction (%) |
|---|---|---|---|
| Abalone | 0.197/0.002 | 0.188/0.006 | 99.8 |
| Building energy | 0.061/3.44e-05 | 0.049/0.023 | 99.6 |
| Chemical sensor | 0.241/0.016 | 0.328/0.034 | 98.3 |
| Cholesterol | 0.026/4.605e-05 | 0.028/0.009 | 92 |
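The PCA projection itself can be sketched in plain Python. To avoid external libraries, this sketch finds the leading eigenvectors of the covariance matrix by power iteration with deflation instead of a full eigendecomposition. That substitution, the function name `pca`, and the toy data are assumptions for illustration, not the published Algorithm 3.

```python
import math
import random


def pca(data, n_components, iters=500, seed=0):
    """Project the rows of `data` onto the leading principal components."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    rng = random.Random(seed)
    components = []
    for _ in range(n_components):
        v = [rng.gauss(0.0, 1.0) for _ in range(d)]       # random starting vector
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
        for _step in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(c * c for c in w))
            if norm == 0.0:                               # no variance left
                break
            v = [c / norm for c in w]
        eigenvalue = sum(v[a] * sum(cov[a][b] * v[b] for b in range(d)) for a in range(d))
        components.append(v)
        # Deflation: remove the found direction so the next pass finds the next one.
        cov = [[cov[a][b] - eigenvalue * v[a] * v[b] for b in range(d)] for a in range(d)]
    scores = [[sum(r[j] * c[j] for j in range(d)) for c in components] for r in centered]
    return scores, components


# Toy data (hypothetical): three fully correlated inputs reduced to one variable.
data = [[t, 2.0 * t, -t] for t in range(20)]
scores, components = pca(data, n_components=1)
print("first principal direction:", [round(c, 3) for c in components[0]])
```

The reduced `scores` would then replace the original inputs as GRNN training patterns. New inputs must be centered and projected with the same mean and components before they are presented to the network.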

### 2.3. GRNN output algorithm

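After GRNN is trained, the output of GRNN can be calculated using

$$\hat{y}(X) = \frac{\sum_{i=1}^{n} y_i \exp\left[-\frac{D_i^2}{2\sigma^2}\right]}{\sum_{i=1}^{n} \exp\left[-\frac{D_i^2}{2\sigma^2}\right]}, \qquad D_i^2 = (X - x_i)^T (X - x_i) \tag{10}$$

where $X$ is the current input, $x_i$ is the $i$-th stored training input (input weight), $y_i$ is the corresponding training target (output weight), $D_i$ is the Euclidean distance between $X$ and $x_i$, and $\sigma$ is the smoothing parameter.

GRNN output calculation is explained in **Algorithm 4**.

Other distance measures can also be used, such as the Manhattan (city block) distance, so (10) will become

$$\hat{y}(X) = \frac{\sum_{i=1}^{n} y_i \exp\left[-\frac{C_i}{\sigma}\right]}{\sum_{i=1}^{n} \exp\left[-\frac{C_i}{\sigma}\right]}, \qquad C_i = \sum_{j=1}^{p} \left| X_j - x_{ij} \right| \tag{11}$$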
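The output calculation is a kernel-weighted average of the stored targets, which a short Python sketch can make concrete. The function name `grnn_output`, the `distance` switch, and the toy patterns are assumptions for illustration. The Euclidean branch uses the squared distance divided by 2σ², and the Manhattan branch uses the city-block distance divided by σ, as in Specht's formulation [4].

```python
import math


def grnn_output(x, patterns, targets, sigma, distance="euclidean"):
    """GRNN output for one input vector x."""
    if distance == "euclidean":
        acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
                for p in patterns]
    elif distance == "manhattan":
        acts = [math.exp(-sum(abs(a - b) for a, b in zip(x, p)) / sigma)
                for p in patterns]
    else:
        raise ValueError("distance must be 'euclidean' or 'manhattan'")
    summation = sum(acts)                              # denominator (summation layer)
    if summation == 0.0:                               # every kernel underflowed
        return sum(targets) / len(targets)
    weighted = sum(a * t for a, t in zip(acts, targets))
    return weighted / summation                        # division layer


# Toy network (hypothetical): three stored patterns of y = x^2.
patterns = [[1.0], [2.0], [3.0]]
targets = [1.0, 4.0, 9.0]
print(grnn_output([2.0], patterns, targets, sigma=0.5))
print(grnn_output([2.0], patterns, targets, sigma=0.5, distance="manhattan"))
```

Because the output is a convex combination of the stored targets, it always lies between the smallest and the largest training target.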

## 3. Estimation of GRNN smoothing parameter (σ)

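Since σ is the only free parameter in GRNN, it has to be estimated from the data. One estimation procedure is explained in **Algorithm 5**.

Other search and optimization methods might also be used to find σ. **Algorithm 6** explains how to find σ with such a method.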
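One common way to estimate the smoothing parameter, suggested in Specht's original GRNN paper [4], is the holdout method: each sample is removed in turn, predicted from the remaining samples, and the value of σ with the lowest average error is kept. The sketch below applies it over a fixed grid of candidates. It is not claimed to match Algorithm 5 or Algorithm 6, and the function names, the candidate grid, and the toy data are assumptions.

```python
import math


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


def holdout_error(patterns, targets, sigma):
    """Leave-one-out MSE: predict each sample from all the other samples."""
    total = 0.0
    for i in range(len(patterns)):
        rest_p = patterns[:i] + patterns[i + 1:]
        rest_t = targets[:i] + targets[i + 1:]
        total += (targets[i] - grnn_predict(patterns[i], rest_p, rest_t, sigma)) ** 2
    return total / len(patterns)


def select_sigma(patterns, targets, candidates):
    """Return the candidate sigma with the lowest holdout error."""
    return min(candidates, key=lambda s: holdout_error(patterns, targets, s))


# Toy data (hypothetical): sin(x) with a small deterministic disturbance.
xs = [[i / 10.0] for i in range(31)]
ys = [math.sin(x[0]) + 0.1 * math.cos(37.0 * i) for i, x in enumerate(xs)]
candidates = [0.01, 0.05, 0.1, 0.2, 0.5, 1.0, 2.0]
best = select_sigma(xs, ys, candidates)
print("selected sigma:", best)
```

A grid search is the simplest choice; the same `holdout_error` function could serve as the cost function of an evolutionary or other optimization method.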

## 4. GRNN vs. back-propagation neural networks (BPNN)

There are many differences between GRNN and BPNN. Firstly, GRNN is a single-pass learning algorithm, while BPNN needs two passes: a forward pass and a backward pass. This means that GRNN consumes significantly less training time. Secondly, the only free parameter in GRNN is the smoothing parameter σ.

To show the advantages of GRNN over BPNN, a comparison is carried out using the standard regression datasets shipped with MATLAB [25]. Each dataset is divided into 70% for training and 30% for testing. After training the network with the training data, the output of the neural network is found using the remaining testing data. The most notable advantage of GRNN over BPNN is the shorter training time, which supports its selection for dynamic systems modeling and control. Also, GRNN has a lower testing error on all four datasets, which indicates better generalization than BPNN. The comparison results are summarized in Table 3.

| Type | Dataset | Training time (s) | Training error (MSE) | Testing error (MSE) |
|---|---|---|---|---|
| GRNN | Abalone | 0.621 | 0.342 | 0.384 |
| BPNN | Abalone | 1.323 | 0.436 | 0.395 |
| GRNN | Building energy | 0.630 | 0.0731 | 0.628 |
| BPNN | Building energy | 1.880 | 0.1152 | 0.631 |
| GRNN | Chemical sensor | 0.701 | 0.888 | 1.316 |
| BPNN | Chemical sensor | 1.473 | 0.228 | 1.584 |
| GRNN | Cholesterol | 0.801 | 0.037 | 0.172 |
| BPNN | Cholesterol | 2.099 | 0.061 | 0.215 |

## 5. GRNN in identification of dynamic systems

System identification is the process of building a model of an unknown or partially known dynamic system based on observed input/output data. Gray-box and black-box identification are two common approaches to system identification. In the gray-box approach, a nominal model of a dynamic system is known, but its exact parameters are unknown, so an identifier is used to find these parameters. In the black-box approach, the identification is based only on the data. Examples of black-box identifiers include fuzzy logic (FL) and neural networks (NN). GRNN can be used to identify dynamic systems quickly and accurately. There are two methods to use GRNN for system identification: the batch mode (off-line training) and the sequential mode (online training). In the batch mode, all the observed data are available before the system identification, so GRNN can be trained with a large portion of the data, while in the sequential mode only a few data samples are available at a time.

### 5.1. GRNN identification in batch training mode

In the batch mode, the observed data should be divided into training, validation, and testing sets. GRNN is fed with all the training data to identify the system. In the validation stage, the network is tested with different data, usually randomly selected, and the error is recorded for every validation test. The validation process is repeated several times (10 repetitions is a common choice), and the average validation error over all the validation tests is computed. This procedure is called k-fold cross-validation, a standard technique in machine learning (ML) applications. To test the generalization ability of an identified model, a separate dataset, called the testing dataset, is used. Based on the model performance in the testing stage, one can decide whether the model is suitable or not.
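A minimal sketch of k-fold cross-validation for GRNN is given below. It assumes the usual definition: the data are shuffled and split into k folds, each fold is held out once while the network stores the remaining samples, and the k validation errors are averaged. The function names, the fixed random seed, and the toy data are assumptions for illustration.

```python
import math
import random


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


def k_fold_mse(inputs, outputs, sigma, k=10, seed=0):
    """Average validation MSE over k folds (requires len(inputs) >= k)."""
    indices = list(range(len(inputs)))
    random.Random(seed).shuffle(indices)              # random assignment to folds
    folds = [indices[i::k] for i in range(k)]
    fold_errors = []
    for fold in folds:
        held_out = set(fold)
        train_x = [inputs[i] for i in indices if i not in held_out]
        train_y = [outputs[i] for i in indices if i not in held_out]
        squared = [(outputs[i] - grnn_predict(inputs[i], train_x, train_y, sigma)) ** 2
                   for i in fold]
        fold_errors.append(sum(squared) / len(squared))
    return sum(fold_errors) / k


# Toy data (hypothetical): 100 samples of y = x^2 on [0, 1].
xs = [[i / 99.0] for i in range(100)]
ys = [x[0] ** 2 for x in xs]
print("10-fold validation MSE:", k_fold_mse(xs, ys, sigma=0.05))
```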

#### 5.1.1. Batch training GRNN to identify hexacopter attitude dynamics

In this example, GRNN is used to identify the attitude (pitch/roll/yaw) dynamics of a hexacopter drone based on real flight test data in the free flight mode. The data consist of three inputs (rolling, pitching, and yawing control values) and three outputs (rolling, pitching, and yawing rates). The dataset contains 6691 data samples with a sampling period of 0.01 seconds. A total of 4683 samples (70%) are used to train GRNN in the batch mode, and the remaining 2008 samples are used for testing. The results of hexacopter attitude identification are shown in Figure 3(a–c). The results are accurate: the MSE is 0.001139 in the training stage and 0.00258 in the testing stage. Also, the training time was only 0.720 seconds.

### 5.2. GRNN identification in sequential training mode

In sequential training, the data arrive one sample at a time, which makes the batch training procedures impossible to use. GRNN should therefore be able to find the system model from only the current and past measurements, which makes this a prediction problem. Since GRNN converges to a regression surface even with a few data samples, and since it is accurate and quick, it can be used for online identification of dynamic systems.

#### 5.2.1. Sequential training GRNN to identify hexacopter attitude dynamics

To use GRNN in sequential mode, it is preferred to use the delayed output of the plant as an input in addition to the current input, as shown in Figure 4. The same data used for the batch mode are used in the sequential training. The inputs to GRNN are the control values of rolling, pitching, and yawing and the delayed rolling, pitching, and yawing rates. The results of using GRNN in the sequential training mode are shown in Figure 5(a–c). The results of sequential training are more accurate than the results of batch training.
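The sequential scheme can be sketched on a toy problem: at every step the network predicts the next plant output from the current input and the delayed output, and the new measurement is then stored as a pattern. The scalar plant, the excitation signal, and the value of σ below are hypothetical and unrelated to the hexacopter data.

```python
import math


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


def plant(y, u):
    """Hypothetical nonlinear plant: y(k+1) = 0.6 sin(y(k)) + 0.4 u(k)."""
    return 0.6 * math.sin(y) + 0.4 * u


patterns, targets, errors = [], [], []
y = 0.0
for k in range(300):
    u = math.sin(0.1 * k)                 # excitation signal
    x = [u, y]                            # current input plus delayed plant output
    y_next = plant(y, u)                  # new measurement
    if patterns:                          # predict before learning from this sample
        errors.append((y_next - grnn_predict(x, patterns, targets, 0.1)) ** 2)
    patterns.append(x)                    # single-pass update: store the new pair
    targets.append(y_next)
    y = y_next

early = sum(errors[:50]) / 50
late = sum(errors[-50:]) / 50
print(f"one-step-ahead MSE, first 50 steps: {early:.5f}, last 50 steps: {late:.5f}")
```

Storing every sample makes the network grow without bound, so a practical implementation would combine this loop with a pruning or error-based growing rule.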

## 6. GRNN in control of dynamic systems

The aim of adding a closed-loop controller to a dynamic system is either to reach the desired performance or to stabilize an unstable system. GRNN can be used in controlling dynamic systems as a predictive or feedback controller, in either a supervised or an unsupervised manner. When GRNN is trained as a predictive controller, the controller input and output data are known, so this is a supervised problem. On the other hand, if GRNN is utilized as a feedback controller (see Figure 6) without being pretrained, only the controller input data is known, so GRNN has to find the suitable control signal by itself, which makes this an unsupervised problem.

### 6.1. GRNN as predictive controller

To utilize GRNN as a predictive controller, it should be trained with input-output data from another controller, for example a proportional integral derivative (PID) controller, as shown in Figure 7. The trained GRNN can then be used as a controller.
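The idea of training GRNN from another controller's data can be sketched as follows. A PI controller (a PID without the derivative term, chosen for brevity) drives a hypothetical first-order plant while its inputs and outputs are logged; GRNN then stores the logged pairs and takes over the loop. The plant, the gains, the choice of error and accumulated error as GRNN inputs, and σ are all assumptions for illustration.

```python
import math


def grnn_predict(x, patterns, targets, sigma):
    """Kernel-weighted average of the stored targets (Gaussian kernel)."""
    acts = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, p)) / (2.0 * sigma ** 2))
            for p in patterns]
    total = sum(acts)
    if total == 0.0:                      # every kernel underflowed to zero
        return sum(targets) / len(targets)
    return sum(a * t for a, t in zip(acts, targets)) / total


KP, KI = 2.0, 0.5                         # hypothetical PI gains
REFERENCE = 1.0                           # step reference


def plant(y, u):
    """Hypothetical first-order plant: y(k+1) = 0.9 y(k) + 0.1 u(k)."""
    return 0.9 * y + 0.1 * u


# 1) Run the PI loop and log the controller inputs and outputs.
patterns, targets = [], []
y, integral = 0.0, 0.0
for k in range(200):
    e = REFERENCE - y
    integral += e
    u = KP * e + KI * integral
    patterns.append([e, integral])        # controller inputs
    targets.append(u)                     # controller output
    y = plant(y, u)

# 2) Close the loop with GRNN in place of the PI controller.
y, integral = 0.0, 0.0
for k in range(200):
    e = REFERENCE - y
    integral += e
    u_hat = grnn_predict([e, integral], patterns, targets, sigma=0.1)
    y = plant(y, u_hat)
print("plant output after 200 steps under GRNN control:", round(y, 4))
```

The GRNN control signal is always bounded by the smallest and largest logged control values, so the quality of the imitation depends on how well the logged data cover the operating region.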

#### 6.1.1. Example 1: GRNN as predictive controller

Consider the discrete-time system from Liu [26], described as

The desired reference is

The perfect control law can be written as

To train GRNN as a predictive controller, the system described in (13) and (14) is simulated for 50 seconds. Then the controller output

### 6.2. GRNN as an adaptive estimator controller

Since GRNN has robust approximation abilities, it can be used to approximate the dynamics of a given system to find the control law especially if the system is partially known or unknown.

Assume there is a nonlinear dynamic system written as

where

The perfect control law can be written as

If

where

where

### 6.3. Example 2: using GRNN to approximate the unknown dynamics

Let us consider the same discrete system as in Example 1:

The desired reference is

where

The perfect control law can be written as

GRNN is used to estimate the unknown function

### 6.4. GRNN as an adaptive optimal controller

GRNN has learning abilities which means it is suitable to be an adaptive intelligent controller. Rather than approximating the unknown function in the control law (16), one can use GRNN to approximate the whole controller output as shown in Figure 12. The same update law as in (19) can be used to update GRNN weights to approximate the controller output

#### 6.4.1. Example 3: using GRNN as an adaptive controller

Let us consider the same discrete system as in (13):

with the same desired reference

#### 6.4.2. Example 4: using GRNN as an adaptive controller

Let us use GRNN to control a more complex discrete plant [27] described as

The desired reference in this case is

The tracking performance of adaptive GRNN is shown in Figure 15.

## 7. MATLAB examples

In this section, GRNN MATLAB code examples are provided.

### 7.1. Basic GRNN commands in MATLAB

In this example, GRNN is trained to find the square of a given number.

To design a GRNN in MATLAB:

Firstly, create the inputs and the targets and specify the spread parameter:

Secondly, create GRNN:

To view GRNN after creating it:

The results are shown in Figure 16.

To find GRNN output based on a given input:

The result is 17.