Applications of General Regression Neural Networks in Dynamic Systems

Nowadays, computational intelligence (CI) receives much attention in academia and industry due to a plethora of possible applications. CI includes fuzzy logic (FL), evolutionary algorithms (EA), expert systems (ES), and artificial neural networks (ANN). Many CI components have applications in modeling and control of dynamic systems. FL mimics human reasoning by converting linguistic variables into a set of rules. EA are metaheuristic population-based algorithms which use evolutionary operations such as mutation, crossover, and selection to find an optimal solution for a given problem. ES are programmed based on expert knowledge to make informed decisions in complex tasks. ANN model how the neurons are connected in animal nervous systems. ANN have learning abilities, and they are trained using data to make intelligent decisions. Since ANN have universal approximation abilities, they can be used to solve regression, classification, and forecasting problems. ANN are made of interconnected layers, where every layer is made of neurons that have connections with other neurons. These layers consist of an input layer, one or more hidden layers, and an output layer.


Introduction
Nowadays, computational intelligence (CI) receives much attention in academia and industry due to a plethora of possible applications. CI includes fuzzy logic (FL), evolutionary algorithms (EA), expert systems (ES), and artificial neural networks (ANN). Many CI components have applications in modeling and control of dynamic systems. FL mimics human reasoning by converting linguistic variables into a set of rules. EA are metaheuristic population-based algorithms which use evolutionary operations such as mutation, crossover, and selection to find an optimal solution for a given problem. ES are programmed based on expert knowledge to make informed decisions in complex tasks. ANN model how the neurons are connected in animal nervous systems. ANN have learning abilities, and they are trained using data to make intelligent decisions. Since ANN have universal approximation abilities [1], they can be used to solve regression, classification, and forecasting problems. ANN are made of interconnected layers, where every layer is made of neurons that have connections with other neurons. These layers consist of an input layer, one or more hidden layers, and an output layer. ANN have two major types, as shown in Figure 1: the feed-forward neural network (FFNN) and the recurrent neural network (RNN). In FFNN, the data can only flow forward, from the input layer through the hidden layer to the output layer, while in RNN, the data can flow in any direction. The output of a single-hidden-layer FFNN can be written as

$$Y = W_{HO}\, h\left(W_{IH}\, x + b_I\right) + b_O$$

where $Y$ is the network output, $W_{HO}$ is the hidden-output layers weights matrix, $h$ is the hidden layer activation function, $x$ is the input vector, $W_{IH}$ is the input-hidden layers weights matrix, $b_I$ is the input layer bias vector, and $b_O$ is the hidden layer bias vector.
In a single-hidden-layer RNN with a recurrent hidden layer, the hidden layer output is additionally fed back into the hidden layer.
The training of neural networks involves modifying the neural network parameters to reduce a given error function. Gradient descent (GD) [2,3] is the most common ANN training method:

$$\theta_{new} = \theta_{old} - \lambda \frac{\partial E}{\partial \theta}$$

where $\theta$ are the network parameters, $\lambda$ is the learning rate, and $E$ is the error function:

$$E = \frac{1}{N} \sum_{i=1}^{N} \left(y_i - t_i\right)^2$$

where $N$ is the number of samples, $y$ is the network output, and $t$ is the network target.
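The gradient descent update can be illustrated with a short Python sketch. It assumes a two-parameter linear model and a mean-squared-error function; the names `mse` and `gd_step`, the toy data, and the learning rate are illustrative choices and do not come from the original text.

```python
def mse(theta, xs, ts):
    """Mean squared error of the linear model y = theta[0] * x + theta[1]."""
    w, b = theta
    return sum((w * x + b - t) ** 2 for x, t in zip(xs, ts)) / len(xs)


def gd_step(theta, xs, ts, lr):
    """One gradient descent update: theta <- theta - lr * dE/dtheta."""
    w, b = theta
    n = len(xs)
    grad_w = sum(2.0 * (w * x + b - t) * x for x, t in zip(xs, ts)) / n
    grad_b = sum(2.0 * (w * x + b - t) for x, t in zip(xs, ts)) / n
    return (w - lr * grad_w, b - lr * grad_b)


xs = [0.0, 1.0, 2.0, 3.0]
ts = [1.0, 3.0, 5.0, 7.0]        # toy targets generated by t = 2x + 1
theta = (0.0, 0.0)               # initial parameters
initial_error = mse(theta, xs, ts)
for _ in range(2000):
    theta = gd_step(theta, xs, ts, lr=0.05)
final_error = mse(theta, xs, ts)
```

Many iterations are needed even for this small problem, which is the training cost that a single-pass network avoids.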

General regression neural network (GRNN)
The general regression neural network (GRNN) is a single-pass neural network which uses a Gaussian activation function in the hidden layer [4]. GRNN consists of input, hidden, summation, and division layers.
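The regression of the random variable $y$ on the observed values $X$ of the random variable $x$ can be found using

$$E[y \mid X] = \frac{\int_{-\infty}^{\infty} y\, f(X, y)\, dy}{\int_{-\infty}^{\infty} f(X, y)\, dy}$$

where $f(X, y)$ is a known joint continuous probability density function.
When $f(X, y)$ is unknown, it should be estimated from a set of observations of $x$ and $y$. $f(X, y)$ can be estimated using the nonparametric consistent estimator suggested by Parzen as follows:

$$\hat{f}(X, y) = \frac{1}{(2\pi)^{(p+1)/2}\, \sigma^{p+1}} \cdot \frac{1}{n} \sum_{i=1}^{n} \exp\left[-\frac{(X - X_i)^T (X - X_i)}{2\sigma^2}\right] \exp\left[-\frac{(y - Y_i)^2}{2\sigma^2}\right]$$

where $n$ is the number of observations, $X_i$ and $Y_i$ are the $i$-th observations of $x$ and $y$, $p$ is the dimension of the vector variable $x$, and $\sigma$ is the smoothing factor.
After solving the integration, the following will result:

$$\hat{Y}(X) = \frac{\sum_{i=1}^{n} Y_i \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}{\sum_{i=1}^{n} \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}, \qquad D_i^2 = (X - X_i)^T (X - X_i)$$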

Previous studies
GRNN has been used in different applications related to modeling, system identification, prediction, and control of dynamic systems, including a feedback linearization controller [5], HVAC process identification and control [6], modeling and monitoring of batch processes [7], cooling load prediction for buildings [8], fault diagnosis of a building's air handling unit [9], intelligent control [10], optimal control for variable-speed wind generation systems [11], an annual power load forecasting model [12], vehicle sideslip angle estimation [13], fault diagnosis for methane sensors [14], fault detection of an excavator's hydraulic system [15], detection of time-varying inter-turn short circuits in a squirrel cage induction machine [16], system identification of the nonlinear rotorcraft heave mode [17], and modeling of traveling wave ultrasonic motors [18]. Some significant modifications of GRNN include using fuzzy c-means clustering to cluster the input data of GRNN [19]; a modified GRNN which uses different types of Parzen estimators to estimate the density function of the regression [20]; a density-driven GRNN combining GRNN, density-dependent kernels, and regularization for function approximation [21]; using GRNN to model time-varying systems [22]; and adapting GRNN for modeling of dynamic plants [23] using different adaptation approaches, including modifying the training targets, adding a new pattern, and dynamic initialization of σ.

GRNN training algorithm
GRNN training is rather simple. The input weights are the transposed training inputs, and the output weights are the training targets. Since GRNN is an associative memory, after training, the number of hidden neurons is equal to the number of training samples. However, this training procedure is not efficient if there are many training samples, so one of the suggested solutions is to use a data dimensionality reduction technique such as clustering or principal component analysis (PCA). One novel solution to data dimensionality reduction is to use an error-based algorithm to grow GRNN [24], as explained in Algorithm 1. Before training GRNN with a new input, the algorithm checks whether that input needs to be included, based on the prediction error. If the prediction error without that input exceeds a certain level, then GRNN should be trained with it.
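The error-based growing idea can be sketched in Python as follows. This is a minimal illustration of the rule described above and not a reproduction of Algorithm 1; the function names, the absolute-error test, the `tolerance` value, and the toy data are assumptions.

```python
import math


def grnn_output(x, w_in, w_out, sigma):
    """GRNN prediction for input vector x from the stored patterns."""
    h = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, w)) / (2.0 * sigma ** 2))
         for w in w_in]
    total = sum(h)
    if total == 0.0:             # x is too far from every stored pattern
        return 0.0
    return sum(hi * wo for hi, wo in zip(h, w_out)) / total


def grow_grnn(samples, targets, sigma, tolerance):
    """Store a sample only when the current network predicts it poorly."""
    w_in, w_out = [], []
    for x, t in zip(samples, targets):
        if not w_in or abs(grnn_output(x, w_in, w_out, sigma) - t) > tolerance:
            w_in.append(list(x))     # new hidden neuron: input weights = sample
            w_out.append(t)          # its output weight = target
    return w_in, w_out


# Toy data: 101 samples of t = x^2 on [0, 2].
samples = [[i / 50.0] for i in range(101)]
targets = [s[0] ** 2 for s in samples]
w_in, w_out = grow_grnn(samples, targets, sigma=0.1, tolerance=0.05)
```

With these settings the network keeps only a subset of the samples, since inputs that are already predicted within the tolerance do not create new hidden neurons.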

Reducing data dimensionality using clustering
Clustering techniques can be used to reduce the data dimensionality before feeding it to the GRNN. k-means clustering is one of the popular clustering techniques. The k-means clustering algorithm is explained in Algorithm 2. The results of comparing GRNN performance before and after applying the k-means algorithm are shown in Table 1. Although the training and testing errors increase, there are large reductions in the network size.
The aim of the algorithm is to minimize the distance objective function:

$$J = \sum_{j=1}^{k} \sum_{x_i \in C_j} \left\| x_i - c_j \right\|^2$$

where $C_j$ is the $j$-th cluster and $c_j$ is its centroid.
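A plain k-means reduction step might look like the Python sketch below. It is not a reproduction of Algorithm 2: the evenly spaced initial centroids, the fixed iteration count, and the choice of using each cluster's mean target as the GRNN output weight are assumptions made for illustration.

```python
def kmeans(points, k, iterations=50):
    """Plain k-means; returns (centroids, labels)."""
    step = len(points) / k
    # Arbitrary deterministic start: centroids taken evenly from the data order.
    centroids = [list(points[int(i * step)]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        labels = [min(range(k),
                      key=lambda j: sum((a - b) ** 2
                                        for a, b in zip(p, centroids[j])))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, l in zip(points, labels) if l == j]
            if members:
                centroids[j] = [sum(c) / len(members) for c in zip(*members)]
    return centroids, labels


def objective(points, centroids, labels):
    """Sum of squared distances from each point to its assigned centroid."""
    return sum(sum((a - b) ** 2 for a, b in zip(p, centroids[l]))
               for p, l in zip(points, labels))


points = [[i / 10.0] for i in range(100)]
targets = [p[0] ** 2 for p in points]
centroids, labels = kmeans(points, k=10)

# Reduced GRNN: one hidden neuron per non-empty cluster.
w_in, w_out = [], []
for j, c in enumerate(centroids):
    member_targets = [t for t, l in zip(targets, labels) if l == j]
    if member_targets:
        w_in.append(c)
        w_out.append(sum(member_targets) / len(member_targets))
```

Here 100 training samples are replaced by at most 10 hidden neurons, which mirrors the trade-off between network size and error noted above.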

Reducing data dimensionality using PCA
PCA can be used to reduce a large dataset into a smaller dataset which still carries most of the important information from the large dataset. In a mathematical sense, PCA converts a number of correlated variables into a number of uncorrelated variables. The PCA algorithm is explained in Algorithm 3.
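The idea can be sketched in Python for the simplest case, reducing two correlated variables to one. The sketch is not a reproduction of Algorithm 3: it extracts only the first principal component, and it uses power iteration on the covariance matrix instead of a full eigendecomposition. All names and the toy data are illustrative.

```python
import math


def first_principal_component(data, iterations=200):
    """Return (mean, direction) of the leading principal component."""
    n, d = len(data), len(data[0])
    mean = [sum(row[j] for row in data) / n for j in range(d)]
    centered = [[row[j] - mean[j] for j in range(d)] for row in data]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    # Power iteration converges to the eigenvector with the largest eigenvalue.
    v = [1.0] * d
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
        norm = math.sqrt(sum(c * c for c in w))
        v = [c / norm for c in w]
    return mean, v


def project(data, mean, direction):
    """Score of each sample along the principal direction (d -> 1 dimension)."""
    return [sum((row[j] - mean[j]) * direction[j] for j in range(len(mean)))
            for row in data]


# Two strongly correlated variables: the second is roughly twice the first.
data = [[float(i), 2.0 * i + (0.1 if i % 2 else -0.1)] for i in range(10)]
mean, direction = first_principal_component(data)
scores = project(data, mean, direction)   # one value per sample instead of two
```

The recovered direction is close to the line along which the two variables vary together, so the single score per sample retains most of the variation in the data.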

GRNN output algorithm
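After GRNN is trained, the output of GRNN can be calculated using

$$Y = \frac{\sum_{i} W_{o,i} \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}{\sum_{i} \exp\left(-\frac{D_i^2}{2\sigma^2}\right)}$$

$$D_i = \sqrt{(X - W_i)^T (X - W_i)}$$

where $D_i$ is the Euclidean distance between the input $X$ and the input weights $W_i$, $W_{o,i}$ is the output weight, and $\sigma$ is the smoothing factor of the radial basis function.
GRNN output calculation is explained in Algorithm 4.
Other distance measures can also be used, such as the Manhattan (city block) distance, so (10) will become

$$D_i = \sum_{j=1}^{p} \left| X_j - W_{i,j} \right|$$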
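The output calculation can be sketched in a few lines of Python. This is an illustrative implementation of the hidden, summation, and division layers, not a reproduction of Algorithm 4; the function names and the pluggable `distance` argument are assumptions. The example data follows the squaring task used later in the MATLAB section.

```python
import math


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    return sum(abs(x - y) for x, y in zip(a, b))


def grnn_output(x, w_in, w_out, sigma, distance=euclidean):
    """GRNN output for one input vector x.

    w_in  : stored training inputs (one hidden neuron per entry)
    w_out : stored training targets (one scalar per hidden neuron)
    """
    # Hidden layer: Gaussian activation of the distance to each stored pattern.
    h = [math.exp(-distance(x, w) ** 2 / (2.0 * sigma ** 2)) for w in w_in]
    numerator = sum(hi * wo for hi, wo in zip(h, w_out))   # summation layer
    denominator = sum(h)                                   # summation layer
    return numerator / denominator                         # division layer


w_in = [[1.0], [2.0], [3.0], [4.0], [5.0]]
w_out = [1.0, 4.0, 9.0, 16.0, 25.0]
sharp = grnn_output([3.0], w_in, w_out, sigma=0.1)       # close to the stored 9
smooth = grnn_output([3.0], w_in, w_out, sigma=1000.0)   # close to the mean, 11
```

A small σ makes the network behave like a nearest-pattern lookup, while a very large σ drives the output toward the mean of all targets. If the input lies far from every stored pattern relative to σ, all activations can underflow to zero; the sketch does not guard against that case.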

Estimation of GRNN smoothing parameter (σ)
Since σ is the only free parameter in GRNN and suitable values of it will improve GRNN accuracy, it should be estimated. Since there is no optimal analytical solution for finding σ, numerical approaches can be used to estimate it. The holdout method is one of the suggested methods. In this method, samples are randomly removed from the training dataset; then using the GRNN with a fixed σ, the output is calculated using the removed samples; then the error is calculated between the network outputs and the sample targets. This procedure is repeated for different σ values. The smoothing parameter (σ) with the lowest sum of errors is selected as the best σ. The holdout algorithm is explained in Algorithm 5.
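A holdout search over a list of candidate σ values might be sketched in Python as follows. It is not a reproduction of Algorithm 5: instead of removing samples at random, this variant removes each sample once in turn (leave-one-out), which is an assumption made to keep the example deterministic. The candidate list and data are illustrative.

```python
import math


def grnn_output(x, w_in, w_out, sigma):
    """GRNN prediction for a scalar input x."""
    h = [math.exp(-(x - w) ** 2 / (2.0 * sigma ** 2)) for w in w_in]
    total = sum(h)
    if total == 0.0:             # all activations underflowed
        return sum(w_out) / len(w_out)
    return sum(hi * wo for hi, wo in zip(h, w_out)) / total


def holdout_error(inputs, targets, sigma):
    """Sum of squared errors when each sample is predicted without itself."""
    total = 0.0
    for i in range(len(inputs)):
        w_in = inputs[:i] + inputs[i + 1:]      # remove sample i
        w_out = targets[:i] + targets[i + 1:]
        total += (grnn_output(inputs[i], w_in, w_out, sigma) - targets[i]) ** 2
    return total


def best_sigma(inputs, targets, candidates):
    """Candidate sigma with the lowest holdout error."""
    return min(candidates, key=lambda s: holdout_error(inputs, targets, s))


inputs = [i / 10.0 for i in range(21)]           # 0.0, 0.1, ..., 2.0
targets = [x ** 2 for x in inputs]
candidates = [0.02, 0.05, 0.1, 0.2, 0.5, 1.0]
sigma_star = best_sigma(inputs, targets, candidates)
```

The grid of candidates fixes the resolution of the search, which is one reason to consider the optimization-based alternatives.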
Other search and optimization methods can also be used to find σ. For instance, genetic algorithms (GA) and differential evolution (DE) are suitable options. Algorithm 6 explains how to find σ using DE or GA. The results of using DE and GA are depicted in Figure 2.
Both GA and DE can find a good approximation of σ within only 100 iterations; however, DE converges faster since it is a vectorized algorithm.
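A one-dimensional DE search for σ could be sketched as below. This is not a reproduction of Algorithm 6: the DE/rand/1 mutation, the absence of a separate crossover step (there is a single decision variable), the bounds, the population size, and the leave-one-out cost function are all assumptions chosen for a compact example.

```python
import math
import random


def loo_error(sigma, xs, ts):
    """Leave-one-out squared error of a scalar-input GRNN for a given sigma."""
    total = 0.0
    for i, (x, t) in enumerate(zip(xs, ts)):
        num = den = 0.0
        for j, (w, o) in enumerate(zip(xs, ts)):
            if j != i:
                h = math.exp(-(x - w) ** 2 / (2.0 * sigma ** 2))
                num += h * o
                den += h
        y = num / den if den > 0.0 else 0.0
        total += (y - t) ** 2
    return total


def de_sigma(xs, ts, bounds=(0.01, 2.0), pop_size=10, generations=30,
             f=0.8, seed=0):
    """Differential evolution over a single variable, sigma."""
    rng = random.Random(seed)
    lo, hi = bounds
    pop = [rng.uniform(lo, hi) for _ in range(pop_size)]
    cost = [loo_error(s, xs, ts) for s in pop]
    for _ in range(generations):
        for i in range(pop_size):
            others = [p for j, p in enumerate(pop) if j != i]
            a, b, c = rng.sample(others, 3)
            trial = min(hi, max(lo, a + f * (b - c)))   # mutation, then clipping
            trial_cost = loo_error(trial, xs, ts)
            if trial_cost <= cost[i]:                   # greedy selection
                pop[i], cost[i] = trial, trial_cost
    best = min(range(pop_size), key=lambda i: cost[i])
    return pop[best], cost[best]


xs = [i / 10.0 for i in range(21)]
ts = [x ** 2 for x in xs]
sigma_de, cost_de = de_sigma(xs, ts)
```

Because selection is greedy, the best cost in the population never increases from one generation to the next.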

GRNN vs. back-propagation neural networks (BPNN)
There are many differences between GRNN and BPNN. Firstly, GRNN is a single-pass learning algorithm, while BPNN needs two passes: a forward pass and a backward pass. This means that GRNN consumes significantly less training time. Secondly, the only free parameter in GRNN is the smoothing parameter σ, while BPNN requires more parameters such as weights, biases, and learning rates. This also indicates GRNN's quick learning abilities and its suitability for online systems or for systems where minimal computations are required. Another difference is that, since GRNN is an autoassociative memory network, it stores all the distinct input/output samples, while BPNN has a limited predefined size.
To show the advantages of GRNN over BPNN, a comparison is held using standard regression datasets built into MATLAB software [25]. Each dataset is divided into 70% for training and 30% for testing. After training the network with the 70% training data, the output of the neural network is found using the remaining testing data. The most notable advantage of GRNN over BPNN is the shorter training time, which supports its selection for dynamic systems modeling and control. Also, GRNN has a lower testing error, which means it has better generalization abilities than BPNN. The comparison results are summarized in Table 3.

GRNN in identification of dynamic systems
System identification is the process of building a model of an unknown or partially known dynamic system based on observed input/output data. Gray-box and black-box identification are two common approaches to system identification. In the gray-box approach, a nominal model of a dynamic system is known, but its exact parameters are unknown, so an identifier is used to find these parameters. In the black-box approach, the identification is based only on the data. Examples of black-box identification tools include fuzzy logic (FL) and neural networks (NN). GRNN can be used to identify dynamic systems quickly and accurately. There are two methods to use GRNN for system identification: the batch mode (off-line training) and the sequential mode (online training). In the batch mode, all the observed data is available before the system identification, so GRNN can be trained with a large portion of the data, while in the sequential mode only a few data samples are available for identification.

GRNN identification in batch training mode
In the batch mode, the observed data should be divided into training, validation, and testing sets. GRNN is fed with all the training data to identify the system. Then, in the validation stage, the network should be tested with different data, usually randomly selected, and the error is evaluated. The batch identification results are shown in Figure 3(a-c). The results are accurate with very low error: the MSE is 0.001139 in the training stage and 0.00258 in the testing stage. Also, the training time was only 0.720 seconds.

GRNN identification in sequential training mode
In sequential training, the data arrive one sample at a time, which makes using the batch training procedures impossible. GRNN should therefore be able to find the system model from only the current and past measurements, which makes this a prediction problem. Since GRNN converges to a regression surface even with a few data samples, and since it is accurate and quick, it can be used for online identification of dynamic systems.

Sequential training GRNN to identify hexacopter attitude dynamics
To use GRNN in sequential mode, it is preferred to use the delayed output of the plant as an input in addition to the current input, as shown in Figure 4. The same data used for the batch mode is used in the sequential training. The inputs to GRNN are the control values of rolling, pitching, and yawing and the delayed rolling, pitching, and yawing rates. The results of using GRNN in the sequential training mode are shown in Figure 5(a-c). The results of sequential training are more accurate than the results of batch training.
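The sequential scheme with a delayed plant output as an extra input can be sketched in Python on a toy problem. The hexacopter data is not available here, so a hypothetical first-order plant stands in for it; the plant, the input signal, and the value of σ are assumptions made only to show the predict-then-store loop.

```python
import math


def grnn_output(x, w_in, w_out, sigma):
    """GRNN prediction for input vector x; returns 0.0 if nothing is stored."""
    if not w_in:
        return 0.0
    h = [math.exp(-sum((a - b) ** 2 for a, b in zip(x, w)) / (2.0 * sigma ** 2))
         for w in w_in]
    total = sum(h)
    if total == 0.0:
        return 0.0
    return sum(hi * wo for hi, wo in zip(h, w_out)) / total


def plant(y, u):
    """Hypothetical first-order plant, used only to generate data."""
    return 0.6 * y + 0.4 * math.tanh(u)


w_in, w_out, errors = [], [], []
y = 0.0
for k in range(400):
    u = math.sin(0.05 * k)
    x = [u, y]                       # current input and delayed plant output
    y_hat = grnn_output(x, w_in, w_out, sigma=0.05)   # predict before measuring
    y_next = plant(y, u)             # the new measurement arrives
    errors.append(abs(y_next - y_hat))
    w_in.append(x)                   # store the newest pattern
    w_out.append(y_next)
    y = y_next
```

In this toy run the one-step prediction error shrinks as patterns accumulate, because later inputs fall close to patterns stored during earlier cycles of the input signal.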

GRNN in control of dynamic systems
The aim of adding a closed-loop controller to a dynamic system is either to reach a desired performance or to stabilize an unstable system. GRNN can be used in controlling dynamic systems as a predictive or feedback controller, and it can be trained in either a supervised or an unsupervised manner. When GRNN is trained as a predictive controller, the controller input and output data are known, so this is a supervised problem. On the other hand, if GRNN is utilized as a feedback controller (see Figure 6) without being pretrained, only the controller input data is known, so GRNN has to find the suitable control signal u.

GRNN as predictive controller
To utilize GRNN as a predictive controller, it should be trained with input-output data from another controller, for example, a proportional integral derivative (PID) controller, as shown in Figure 7. The trained GRNN can then be used as a controller.
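The procedure of logging another controller and replaying it with GRNN can be sketched in Python. To keep the example short, a proportional controller stands in for the PID controller, and the plant is a hypothetical first-order system; the gain, reference, and σ are likewise assumptions.

```python
import math


def grnn_output(x, w_in, w_out, sigma):
    """GRNN prediction for a scalar input x."""
    h = [math.exp(-(x - w) ** 2 / (2.0 * sigma ** 2)) for w in w_in]
    total = sum(h)
    if total == 0.0:
        return 0.0
    return sum(hi * wo for hi, wo in zip(h, w_out)) / total


def plant(y, u):
    """Hypothetical stable first-order plant."""
    return 0.9 * y + 0.1 * u


KP, REF, STEPS = 2.0, 1.0, 60

# 1) Run the loop with the original controller and log its input and output.
e_log, u_log = [], []
y = 0.0
for _ in range(STEPS):
    e = REF - y
    u = KP * e                 # stand-in for the PID control law
    e_log.append(e)
    u_log.append(u)
    y = plant(y, u)
y_teacher = y

# 2) Replace the controller with a GRNN trained on the logged data.
y = 0.0
for _ in range(STEPS):
    u = grnn_output(REF - y, e_log, u_log, sigma=0.05)
    y = plant(y, u)
y_grnn = y
```

The GRNN only reproduces the original controller over the range of errors seen in the log, so the logged run should cover the operating conditions of interest.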

Example 1: GRNN as predictive controller
Consider the discrete-time system of Liu [26] described by (13). The desired reference is $y_d(k) = 2\sin(0.1\pi t)$.
The perfect control law is given by (14). To train GRNN as a predictive controller, the system described in (13) and (14) is simulated for 50 seconds, and the controller output u and the plant output y are stored. GRNN is trained with the plant output as input and the controller output as output. At every time step, the plant output is fed to GRNN, and the controller output u is estimated. The controller output estimated by GRNN and the perfect controller output are almost identical, as shown in Figure 8. Also, the tracking performance after using GRNN as a predictive controller is very accurate, as shown in Figure 9.

GRNN as an adaptive estimator controller
Since GRNN has robust approximation abilities, it can be used to approximate the dynamics of a given system to find the control law especially if the system is partially known or unknown.
Assume there is a nonlinear dynamic system written as

$$\dot{x} = f(x, t) + b\,u + d$$

where $\dot{x}$ is the derivative of the states, $f(x, t)$ is a known function of the states, $b$ is the input gain, $u$ is the control input, and $d$ is the external disturbance. The perfect control law is given by (16). If $f(x, t)$ is unknown, then the control law in (16) cannot be found; hence, the alternative is to use GRNN to estimate the unknown function $f(x, t)$. To derive the update law of the GRNN weights, the objective function is defined as the MSE between $\hat{y}$, the GRNN estimate, and $y$, the optimal value of $f(x, t)$. This error is then minimized with respect to the GRNN weights $W$, where $\hat{W}$ is the current GRNN hidden-output layer weights and $H$ is the hidden layer output, which yields the update law of the GRNN weights in (19).
Figure 9. GRNN tracking performance.

Example 2: using GRNN to approximate the unknown dynamics
Let us consider the same discrete-time system as in Example 1. GRNN is used to estimate the unknown function $f(k)$ in the perfect control law. By applying the update law in (19), $f(k)$ is estimated with acceptable accuracy, as shown in Figure 10. The MSE between the ideal and the estimated $f(k)$ is 0.0033. The accurate controller tracking performance is also shown in Figure 11.
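An online weight update of this kind can be sketched in Python. The exact form of the update law (19) is not reproduced in this text, so the sketch assumes the usual gradient step on a squared error, W ← W + η (y − ŷ) H, with a normalized Gaussian hidden layer on fixed centers. The learning rate η, the centers, the state sweep, and the stand-in function sin(x) are all hypothetical.

```python
import math


def hidden(x, centers, sigma):
    """Normalized Gaussian hidden layer output H for a scalar state x."""
    h = [math.exp(-(x - c) ** 2 / (2.0 * sigma ** 2)) for c in centers]
    total = sum(h)
    return [v / total for v in h]


def unknown_f(x):
    """Stand-in for the unknown dynamics; not the system of the examples."""
    return math.sin(x)


centers = [-3.0 + 0.5 * i for i in range(13)]    # fixed grid on [-3, 3]
weights = [0.0] * len(centers)                   # hidden-output weights
eta, sigma = 0.5, 0.4
sq_errors = []
for k in range(3000):
    x = 3.0 * math.sin(0.01 * k)                 # slowly sweeping state
    y = unknown_f(x)                             # value to be estimated
    h = hidden(x, centers, sigma)
    y_hat = sum(w * v for w, v in zip(weights, h))
    e = y - y_hat
    sq_errors.append(e * e)
    # Gradient step on 0.5 * e^2 with respect to the output weights.
    weights = [w + eta * e * v for w, v in zip(weights, h)]
```

Because the hidden outputs are normalized, their squared norm never exceeds one, so this step size keeps the update stable in the sketch.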

GRNN as an adaptive optimal controller
GRNN has learning abilities which means it is suitable to be an adaptive intelligent controller. Rather than approximating the unknown function in the control law (16), one can use GRNN to approximate the whole controller output as shown in Figure 12. The same update law as in (19) can be used to update GRNN weights to approximate the controller output u.

Example 3: using GRNN as an adaptive controller
Let us consider the same discrete system as in (13), in which the control input enters through the term $15\,u(k)$, with the same desired reference $y_d(k) = 2\sin(0.1\pi t)$. In this case, GRNN is used to estimate the full controller output u, as shown in Figure 14, and the tracking performance is shown in Figure 13.
Figure 10. Using GRNN to estimate the unknown dynamics.

MATLAB examples
In this section, GRNN MATLAB code examples are provided.

Basic GRNN Commands in MATLAB
In this example, GRNN is trained to find the square of a given number.
To design a GRNN in MATLAB, firstly create the inputs and the targets and specify the spread parameter, and secondly create GRNN. GRNN can then be viewed after creating it; the results are shown in Figure 16.
The GRNN output can then be found for a given input; in this example, the result is 17.

The holdout method to find σ