An Application of Jordan Pi-Sigma Neural Network for the Prediction of Temperature Time Series Signal

Temperature forecasts are mainly issued in qualitative terms using conventional methods, assisted by projected images taken by meteorological satellites to assess future trends (Paras et al., 2007). The criteria to consider when choosing a forecasting method include the accuracy, the cost and the properties of the series being forecast. Judged against those criteria, the empirical approaches that have been used for temperature forecasting are intrinsically costly and are only capable of providing certain information, which is usually generalized over a large geographical area (Paras et al., 2007). Despite involving sophisticated mathematical models to justify the use of empirical rules, they also require prior knowledge of the characteristics of the input time series to predict future events. In addition, most temperature forecasts today carry limited information about uncertainty, and meteorologists often find it challenging to communicate uncertainty effectively. Although numerical weather methods are used extensively, they are still restricted by the availability of numerical weather prediction products, which has led to various studies on temperature forecasting (Barry & Chorley, 1982; Paras et al., 2007).

Considering the limitations of the multilayer perceptron (MLP), this work uses higher order neural networks (HONN), which have the ability to expand the input representation space. The Pi-Sigma Neural Network (PSNN) (Shin & Ghosh, 1991-a), a class of HONN, offers high learning capabilities while requiring less memory in terms of weights and nodes, and at least two orders of magnitude fewer computations than the MLP for similar performance levels over a broad class of problems (Ghazali & al-Jumeily, 2009; Shin & Ghosh, 1991-b).
Building on the benefits of PSNN, a new model called the Jordan Pi-Sigma Neural Network (JPSN), which possesses a Jordan Neural Network architecture (Jordan, 1986), is proposed to perform temperature forecasting. The JPSN, which incorporates feedback connections in its structure while retaining the superior properties of PSNN, is mapped to the function variables and coefficients related to the research area. Consequently, this work is conducted in order to show that JPSN is suitable for one-step-ahead temperature prediction.

Pi-sigma neural network (PSNN)
PSNN is a type of HONN that was first introduced by Shin & Ghosh (1991-a). The basic idea behind the network is that a polynomial of the input variables is formed by a product ("pi") of several weighted linear combinations ("sigma") of the input variables. That is why this network is called pi-sigma instead of sigma-pi. The PSNN exhibits fast learning while greatly reducing network complexity by utilising an efficient form of polynomials for many input variables. This special polynomial form helps the PSNN to dramatically reduce the number of weights in its structure. Figure 1 shows the architecture of PSNN, where $x_k$ is the $k$-th component of the input vector $x$. The weighted inputs are fed to a layer of $K$ linear summing units; $h_{ji}$ is the output of the $j$-th summing unit for the $i$-th output $y_i$, viz:

$$h_{ji} = \sum_{k} w_{kji}\, x_k + \theta_{ji}, \qquad y_i = \sigma\left(\prod_{j=1}^{K} h_{ji}\right)$$

where $w_{kji}$ and $\theta_{ji}$ are adjustable coefficients, and $\sigma$ is the nonlinear transfer function (Shin & Ghosh, 1991-a). The number of summing units in the PSNN reflects the network order. Adding one summing unit increases the network's order by 1 whilst preserving the old connections and maintaining the network topology.
In PSNN, weights from the summing layer to the output layer are fixed to unity, resulting in a reduction in the number of tuneable weights. Therefore, the training time can be reduced.
Sigmoid and linear functions are adopted in the output layer and the summing layer, respectively. The use of linear summing units makes the convergence analysis of the learning rules for the PSNN more accurate and tractable (Ghazali & al-Jumeily, 2009; Ghazali et al., 2006). Compared to other HONN models, Shin and Ghosh (1991-b) argued that the PSNN maintains the high learning capabilities of HONN while needing a much smaller number of weights, with at least two orders of magnitude fewer computations than the MLP for similar performance levels over a broad class of problems (Ghazali et al., 2006). Moreover, the PSNN is reported to be superior to other HONNs in terms of precision and computational complexity, and it has a highly regular structure. Since the weights from the hidden layer to the output are fixed at 1, the PSNN drastically reduces the training time. The network has been successfully applied to image processing (Hussain and Liatsis, 2002), time series prediction (Knowles, 2005; Ghazali et al., 2011), function approximation (Shin & Ghosh, 1991-a; Shin & Ghosh, 1991-b), pattern recognition (Shin & Ghosh, 1991-a), cryptography (Song, 2008), and so forth.
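As a rough illustration of the forward computation described in this section, the following Python sketch evaluates a single-output PSNN: $K$ linear summing units, a product unit whose incoming connections are fixed to 1, and a sigmoid output. The function name `psnn_forward`, the argument layout and the numerical values are assumptions made for the example; they are not taken from the chapter.

```python
import math


def sigmoid(v):
    """Logistic transfer function, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def psnn_forward(x, weights, biases):
    """Forward pass of a single-output Pi-Sigma network (illustrative).

    x       : list of N input values
    weights : K rows of N tunable weights, one row per summing unit
    biases  : K tunable biases, one per summing unit
    Returns the network output and the K summing-unit activations.
    """
    # "Sigma" layer: each summing unit is a linear combination of the inputs.
    h = [sum(w * x_k for w, x_k in zip(row, x)) + b
         for row, b in zip(weights, biases)]
    # "Pi" unit: product of the summing units. These connections are fixed
    # to 1, so no extra weights are stored for them.
    product = 1.0
    for h_j in h:
        product *= h_j
    return sigmoid(product), h


# Made-up values for a 2nd-order network (K = 2) with two inputs.
x = [0.5, -1.0]
weights = [[1.0, 0.0], [0.0, 1.0]]
biases = [0.5, 2.0]
y, h = psnn_forward(x, weights, biases)
print(h, y)
```

With $N$ inputs and $K$ summing units, this sketch holds $K(N + 1)$ tunable parameters, which reflects the small weight count that the text attributes to the PSNN.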

The properties and structure of Jordan pi-sigma neural network (JPSN)
The structure of JPSN is quite similar to that of the ordinary PSNN. The main difference is that the JPSN architecture has a recurrent link from the output layer back to the input layer. This structure captures the temporal dynamics of the time-series process and allows the network to compute in a more parsimonious way (Hussain & Liatsis, 2002). The architecture of the proposed JPSN is shown in Figure 2. Weights from the inputs $x(t)$ to the summing units layer are tunable, while weights between the summing units layer and the output layer are fixed to 1. The tuned weights are used for network testing to see how well the network model generalizes on unseen data. $Z^{-1}$ denotes the time delay operation.
Let the number of external inputs to the network be $M$ and the number of outputs be 1. Let $x_m(t)$ be the $m$-th external input to the network at time $t$.
Meanwhile, weights from $z(t)$ to the summing units are set to 1 in order to reduce the complexity of the network.
The proposed JPSN combines the properties of both the PSNN and the Recurrent Neural Network (RNN) so that better performance can be achieved. When the JPSN is used as a predictor for one-step-ahead prediction, the previous input values are used to predict the next element in the data. Since networks with recurrent connections hold several advantages over the ordinary feedforward MLP, especially in dealing with time-series problems, adding dynamic properties to the PSNN may allow this network to outperform both the ordinary feedforward MLP and the ordinary PSNN. Additionally, the unique architecture of JPSN may also avoid the combinatorial explosion of higher-order terms as the network order increases.
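The recurrent link can be sketched in Python as follows. This is a minimal sketch under stated assumptions rather than the authors' implementation: it assumes that the fed-back signal is the network output delayed by one time step, that it enters every summing unit with a fixed weight of 1, and that the feedback line starts at 0. The names `jpsn_step` and `jpsn_run` and all numerical values are hypothetical.

```python
import math


def sigmoid(v):
    """Logistic transfer function, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def jpsn_step(x_t, y_prev, weights, biases):
    """One time step of a Jordan-style Pi-Sigma network (illustrative)."""
    h = []
    for row, b in zip(weights, biases):
        # Tunable weights act on the external inputs only; the fed-back
        # output enters each summing unit with a fixed weight of 1 (assumed).
        h.append(sum(w * x for w, x in zip(row, x_t)) + b + 1.0 * y_prev)
    product = 1.0
    for h_j in h:
        product *= h_j
    return sigmoid(product)


def jpsn_run(windows, weights, biases, y0=0.0):
    """Feed input windows in time order, delaying the output by one step."""
    outputs = []
    y_prev = y0  # assumed initial state of the feedback line
    for x_t in windows:
        y_prev = jpsn_step(x_t, y_prev, weights, biases)
        outputs.append(y_prev)
    return outputs


# Made-up example: zero inputs and zero weights isolate the feedback effect.
windows = [[0.0, 0.0], [0.0, 0.0]]
weights = [[0.0, 0.0], [0.0, 0.0]]
biases = [0.0, 0.0]
outputs = jpsn_run(windows, weights, biases)
print(outputs)
```

In this toy run the two outputs differ even though the external inputs are identical, because the second step also sees the first output through the delay line.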

Learning algorithm of JPSN
The supervised learning used in JPSN can be solved with the standard backpropagation (BP) gradient descent algorithm (Rumelhart et al., 1986), with the recurrent link from the output layer back to the input layer nodes. Since the same weights are used for all networks, the learning algorithm starts by initialising the weights to small random values before training them. The JPSN is trained adaptively: the errors produced are calculated, and the overall error function $E$ of the JPSN is defined in terms of these errors.

When the output is calculated, $h_L(t)$ represents the activation of the $L$-th unit at time $t$, and $y(t)$ is the previous network output. The unit's transfer function $f$ is the sigmoid activation function, which bounds the output to the range $[0, 1]$.

2. Compute the output error at time $t$ using the standard Mean Squared Error (MSE) by minimising the MSE index, where $z_{ik}$ denotes the output of the $k$-th node with respect to the $i$-th data, and $n_{tr}$ is the number of training sets. This step is repeated for all nodes on the current layer.
3. By adapting the BP gradient descent algorithm, compute the weight changes, where $h_{ji}$ is the output of the summing unit and $\eta$ is the learning rate.
4. Update the weight:
$w_i = w_i + \Delta w_i$ (8)
5. To accelerate the convergence of the error in the learning process, the momentum term $\alpha$ is added into Equation 3.6. Then the values of the weights for the interconnections between neurons are calculated, where the value of $\alpha$ is a user-selected positive constant ($0 \le \alpha \le 1$).
6. The JPSN algorithm is terminated when all the stopping criteria (training error, maximum epoch and early stopping) are satisfied. If not, repeat from step 1.

The utilisation of product units in the output layer indirectly provides the capabilities of the JPSN while using only a small number of weights and processing units. The JPSN has the topology of a fully connected two-layered feedforward network. Considering that the fixed weights are not tuneable, it can be said that the summing layer is not "hidden" as in the case of the MLP. This means that such a network topology, with only one layer of tuneable weights, may reduce the training time.
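The training steps can be sketched in Python as below. Because the chapter's update equations did not survive in this text, the sketch substitutes a standard derivation and should be read as an assumption: it takes the per-sample error as $E = \frac{1}{2}(d - y)^2$, treats the fed-back output as a constant input when differentiating, and applies momentum to the previous weight change. The names `train_step`, `eta` and `alpha` and all numerical values are hypothetical; the initial weights are fixed instead of random so that the run is reproducible.

```python
import math


def sigmoid(v):
    """Logistic transfer function, bounded in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def product(values):
    """Product of an iterable of numbers."""
    result = 1.0
    for v in values:
        result *= v
    return result


def forward(x, y_prev, W, b):
    """Summing units see the inputs plus the fed-back output (weight 1)."""
    h = [sum(w * xk for w, xk in zip(row, x)) + bj + y_prev
         for row, bj in zip(W, b)]
    return sigmoid(product(h)), h


def train_step(x, y_prev, d, W, b, dW, db, eta=0.1, alpha=0.2):
    """One gradient-descent update with momentum; mutates W, b, dW, db."""
    y, h = forward(x, y_prev, W, b)      # step 1: calculate the output
    err = d - y                          # step 2: output error
    delta = err * y * (1.0 - y)          # error times the sigmoid slope
    for j in range(len(W)):
        # The derivative of the product unit with respect to h_j is the
        # product of all the other summing units.
        others = product(h[l] for l in range(len(h)) if l != j)
        for k in range(len(x)):
            # steps 3 and 5: weight change plus momentum on the last change
            dW[j][k] = eta * delta * others * x[k] + alpha * dW[j][k]
            W[j][k] += dW[j][k]          # step 4: update the weight
        db[j] = eta * delta * others + alpha * db[j]
        b[j] += db[j]
    return 0.5 * err * err


# Made-up single training example, repeated for a fixed number of epochs.
W = [[0.1, -0.1], [0.05, 0.2]]
b = [0.0, 0.0]
dW = [[0.0, 0.0], [0.0, 0.0]]
db = [0.0, 0.0]
errors = [train_step([0.2, 0.4], 0.5, 0.8, W, b, dW, db) for _ in range(200)]
print(errors[0], errors[-1])
```

A full training loop would wrap this step with the stopping criteria of step 6 (training error, maximum epoch and early stopping) instead of the fixed epoch count used here.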

Temperature prediction with JPSN
Temperature forecasting is at the core of weather forecasting. Temperature is a kind of atmospheric time-series data in which the time index takes on a predetermined or unlimited set of values. On a routine basis, temperature can have a greater influence on daily life than any other single weather element. Therefore, careful observations are needed to obtain accurate temperature measurements (Ibrahim, 2002). Temperature forecasting is one of the most challenging tasks in dealing with meteorological parameters: it is not only a very complex nonlinear problem, but is also extremely difficult to model.
Great interest in more accurate temperature predictions has led to the development of several approaches, including physical methods, statistical-empirical methods and numerical-statistical methods (Barry & Chorley, 1982; Lorenc, 1986). These methods, however, are inherently complex and are restricted by the availability of numerical weather prediction products (Paras et al., 2007). Considering the downsides of those methods, neural networks have placed sophisticated models within the reach of practitioners and have been successfully applied to many problems. Therefore, in this work, JPSN is used for temperature prediction in Batu Pahat.
The forecasting horizon for temperature prediction is one step ahead: the output variable represents the temperature measurement one day ahead. Univariate data consisting of 5 years of daily temperature measurements in Batu Pahat, Malaysia, from 2005 to 2009, were used for the simulation (please refer to Table 1). To clean the data for further processing, the contaminating effects of outlying observations need to be identified and removed. Therefore, in this study, a Max-Min Normalization technique was used so that the data are distributed evenly and scaled into an acceptable range. In order to avoid computational problems, the range was set between the upper and lower bounds of the network transfer function, which is often a monotonically increasing function (Cybenko, 1989). After data normalization, the statistical distributions of the values for each input and output are roughly uniform. Therefore, removing the outliers should make the data more accurate.
Figure 3 shows the daily temperature data of Batu Pahat region before normalization while Figure 4 shows the daily temperature data of Batu Pahat region after normalization.
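The scaling step can be illustrated with a short Python sketch of min-max normalization. The target range used by the authors is not stated in this text, so the default bounds of 0 and 1 (the output range of the sigmoid) are an assumption, as are the function name `min_max_normalize` and the sample temperatures.

```python
def min_max_normalize(values, lower=0.0, upper=1.0):
    """Linearly rescale values so that min -> lower and max -> upper."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot normalize a constant series")
    scale = (upper - lower) / (v_max - v_min)
    return [lower + (v - v_min) * scale for v in values]


# Made-up daily temperatures in degrees Celsius (not the Batu Pahat data).
temperatures = [24.0, 26.0, 28.0]
normalized = min_max_normalize(temperatures)
print(normalized)
```

Keeping `v_min` and `v_max` from the training data would allow predictions to be mapped back to degrees Celsius with the inverse transformation.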
Meanwhile, Figure 5 shows the frequency distribution of the 5 years of temperature data after the normalization process. From Figure 5, it can be seen that the histogram of the transformed data is symmetrical. Therefore, it can be said that the temperature data for Batu Pahat (after normalization) are relatively uniform and closely follow the normal distribution, and are thus suitable as network inputs.
For simulation purposes, the data were kept in time order and divided into three sets: 50% for training, 25% for testing and 25% for validation, as shown in Table 2. For comparison purposes, the JPSN performance on temperature prediction is benchmarked against that of the ordinary PSNN and the widely known MLP. As there is no rule of thumb for identifying the number of inputs, it was determined by a trial-and-error procedure. All networks were built considering 5 different numbers of input nodes, ranging from 4 to 8. A single neuron was considered for the output layer. The number of hidden nodes (for the MLP) and the higher order terms (for PSNN and JPSN) started at 2 nodes and was increased by one up to a maximum of 5 nodes.
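The data preparation described above can be sketched in Python as follows: a sliding window turns the univariate series into input vectors with a one-step-ahead target, and the samples are then split chronologically. The order of the three sets in time (training, then testing, then validation) is an assumption, since the text gives only the proportions; the helper names `make_windows` and `chronological_split` and the toy series are hypothetical.

```python
def make_windows(series, n_inputs):
    """Pair each run of n_inputs consecutive values with the next value."""
    X, y = [], []
    for t in range(n_inputs, len(series)):
        X.append(series[t - n_inputs:t])  # the previous n_inputs values
        y.append(series[t])               # the one-step-ahead target
    return X, y


def chronological_split(samples, train_frac=0.5, test_frac=0.25):
    """Split samples in time order; the remainder becomes the last set."""
    n = len(samples)
    n_train = int(n * train_frac)
    n_test = int(n * test_frac)
    train = samples[:n_train]
    test = samples[n_train:n_train + n_test]
    valid = samples[n_train + n_test:]
    return train, test, valid


# Toy series standing in for the normalized daily temperatures.
series = [float(v) for v in range(12)]
X, y = make_windows(series, n_inputs=4)
samples = list(zip(X, y))
train, test, valid = chronological_split(samples)
print(len(train), len(test), len(valid))
```

Repeating `make_windows` with `n_inputs` from 4 to 8 would generate the five input configurations explored in the trial-and-error procedure.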


Simulation results
The temperature dataset collected from MMD was used to demonstrate the performance of JPSN by considering a few different network parameters.Generally, the factors affecting the network performance include the learning factors, the higher order terms, and the number of neurons in the input layer.

 
The values of the learning factors were chosen based on extensive simulations done by a trial-and-error procedure.
The above discussion has shown that some network parameters may affect the network performance. It is therefore necessary to illustrate the robustness of JPSN by comparing its performance with that of the ordinary PSNN and the MLP. Table 3 to Table 5 show the average results from 10 simulations for the JPSN, the ordinary PSNN and the MLP, respectively. Table 3, which shows the results for temperature prediction using JPSN, reveals that the 2nd-order network with 4 inputs gives the best results on all measuring criteria except the number of epochs. Meanwhile, in order to compare the predictive performance of the three models, [...] $2.9 \times 10^{-3}$ for the MLP. Moreover, it can be seen that JPSN reached a higher value of SNR. Therefore, it can be said that the network can track the signal better than PSNN and MLP. Apart from the MAE and SNR, JPSN exhibited lower prediction errors in terms of NMSE and MSE on the out-of-sample dataset. This indicates that the network is capable of representing the nonlinear function better than the two benchmarked models. In terms of learning speed, particularly the number of epochs used, PSNN converged much faster than JPSN and MLP. However, JPSN needed fewer epochs than the MLP. On the whole, JPSN compares favourably with the two benchmarked models.

For demonstration purposes, the models' performance in terms of NMSE is depicted in Figure 6. It shows that JPSN consistently gives a lower NMSE than both PSNN and MLP. This means that the predicted and actual values obtained by the JPSN are better than those of both comparable network models in terms of bias and scatter. Consequently, it can be inferred that the JPSN yields more accurate results, provided that the network parameters are chosen properly. The parsimonious representation of higher order terms in JPSN helps the network to model the data successfully.
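For readers who want to reproduce comparisons of this kind, the following Python sketch computes the error measures named in this chapter. The formulas used by the authors are not present in this text, so the definitions below are assumptions: NMSE is taken as the MSE divided by the population variance of the actual values, and SNR (in dB) uses one peak-signal form, $10 \log_{10}(m^2 n / \mathrm{SSE})$ with $m$ the largest actual value. The function name `forecast_metrics` and the sample values are hypothetical.

```python
import math


def forecast_metrics(actual, predicted):
    """Return MSE, MAE, NMSE and SNR (dB) under the assumed definitions."""
    n = len(actual)
    errors = [a - p for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)           # sum of squared errors
    mse = sse / n
    mae = sum(abs(e) for e in errors) / n
    mean_actual = sum(actual) / n
    variance = sum((a - mean_actual) ** 2 for a in actual) / n
    nmse = mse / variance                      # assumed normalization
    peak = max(actual)
    # Assumed peak-signal form of SNR; a perfect forecast gives infinity.
    snr_db = math.inf if sse == 0 else 10.0 * math.log10(peak * peak * n / sse)
    return {"MSE": mse, "MAE": mae, "NMSE": nmse, "SNR": snr_db}


# Made-up values, not results from the chapter.
actual = [1.0, 2.0, 3.0]
predicted = [1.0, 2.0, 2.0]
metrics = forecast_metrics(actual, predicted)
print(metrics)
```

Under these definitions a lower MSE, MAE or NMSE and a higher SNR all indicate a closer fit, which matches the direction of the comparisons made in the text.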
The plots depicted in Figures 7 to 9 present the temperature forecasts on the out-of-sample dataset for all network models. In the plots, the blue line represents the trend of the actual values, while the red line represents the predicted values. The daily temperature values predicted by all network models almost fit the actual values, with minimal forecast error. On the whole, JPSN outperformed PSNN and MLP by 1.038% and 1.341%, respectively. This shows that JPSN is able to perform an input-output mapping of temperature data and gives better performance when compared to the two benchmarked models.

Conclusion
Many applications and techniques for temperature forecasting have been developed in the past. However, limitations such as the accuracy and complexity of the models make the existing systems less desirable for some applications. Therefore, improving temperature forecasting requires continuous effort in many fields, including neural networks (NN). Several NN-related methods in particular have been investigated and carried out. However, the ordinary feedforward NN, the MLP, is prone to overfitting and easily gets stuck in local minima. Thus, to overcome these drawbacks, a new model called JPSN is proposed as an alternative mechanism to predict temperature. The JPSN, which combines the properties of PSNN and RNN, can benefit temperature prediction and may overcome such drawbacks of the MLP. In this chapter, JPSN is used to learn the historical temperature data of Batu Pahat and to predict the temperature measurements for the next day. Simulations for the comprehensive evaluation of the JPSN were presented, and the evaluation, covering several performance criteria (the NMSE, MSE, SNR and number of epochs), was discussed. Experimental results of JPSN were compared with those of the ordinary PSNN and the MLP. Results obtained from each model were presented and, on the whole, the proposed JPSN has been shown to outperform the ordinary PSNN and the MLP on prediction error, and to converge in fewer epochs than the MLP.

Fig. 2. The architecture of JPSN.
The error is the difference between the actual value expected from each unit $i$ and the predicted value $y_j(t)$. Generally, JPSN can be operated in the following steps. For each training example:
1. Calculate the output.

Fig. 9. Temperature Forecast made by MLP on Out-of-sample Dataset.

Table 1. The properties of Batu Pahat Temperature signal.
The data was obtained from the Central Forecast Office, Malaysian Meteorological Department (MMD).

Table 2. Summary of Temperature Dataset Segregation.
The stopping criteria included the maximum epoch and the minimum error, which were set to 3000 and 0.0001 respectively. In order to assess the performance of all network models, four measurement criteria are used, namely the number of epochs, Mean Squared Error, Normalized Mean Squared Error, and Signal to Noise Ratio. Convergence is achieved when the output of the network meets the stopping criteria mentioned earlier. By considering all in-sample data that have been trained, the best value found for the momentum term was 0.2.

Table 3. Average Result of JPSN for One-Step-Ahead Prediction.

Table 4. Average Result of PSNN for One-Step-Ahead Prediction.

Table 5. Average Result of MLP for One-Step-Ahead Prediction.

Table 6. Comparison on the Best Single Simulation Results for JPSN, PSNN and MLP.
Table 6 presents the best simulation results for JPSN, PSNN and MLP. Over the whole training process, JPSN obtained the lowest MAE, 0.063458, while the MAE values for PSNN and MLP were 0.063471 and 0.063646, respectively. Judged by the MAE, JPSN is able to produce forecasts that are very close to the actual temperature values.