Open access

Neural Forecasting Systems

Written By

Takashi Kuremoto, Masanao Obayashi and Kunikazu Kobayashi

Published: 01 January 2008

DOI: 10.5772/5272

From the Edited Volume

Reinforcement Learning

Edited by Cornelius Weber, Mark Elshaw and Norbert Michael Mayer

1. Introduction

Artificial neural network (NN) models have been widely adopted in the field of time series forecasting over the last two decades. As a soft-computing method, a neural forecasting system can be built more easily, thanks to its learning algorithm, than traditional linear or nonlinear models, which must be constructed with advanced mathematical techniques and a lengthy search for optimal model parameters. The function approximation ability and strong sample-learning performance of NNs became well known through the error back-propagation (BP) learning algorithm applied to a feed-forward multi-layer NN called the multi-layer perceptron (MLP) (Rumelhart et al., 1986). Since that milestone of neural computing, more than 5,000 publications on NNs for forecasting have appeared (Crone & Nikolopoulos, 2007).

To simulate complex phenomena, chaos models have been studied since the middle of the last century (Lorenz, 1963; May, 1976). Among NN models, the radial basis function network (RBFN) was applied to chaotic time series prediction early on (Casdagli, 1989). To design the hidden-layer structure of an RBFN, a cross-validated subspace method was proposed, and the resulting system was applied to the prediction of noisy chaotic time series (Leung et al., 2001). A two-layered feed-forward NN, whose hidden units all use the hyperbolic tangent activation function and whose single output unit is linear, gave highly accurate predictions for the Lorenz system and for the Henon and logistic maps (Oliveira et al., 2000).

For real time series data, NNs and advanced NN models (Zhang, 2003) are reported to provide more accurate forecasts than traditional statistical models such as the autoregressive integrated moving average (ARIMA) model (Box & Jenkins, 1970), and the performance of different NNs on financial time series was confirmed by Kodogiannis & Lolis (2002). Furthermore, several time series forecasting competitions using benchmark data have been held in the past decades, in which many kinds of NN methods showed strong predictive ability against other techniques, e.g. vector quantization, fuzzy logic, Bayesian methods, Kalman or other filtering techniques, and support vector machines (Lendasse et al., 2007; Crone & Nikolopoulos, 2007).

Meanwhile, reinforcement learning (RL), a kind of goal-directed learning, has been widely applied in control theory, autonomous systems, and other fields of intelligent computation (Sutton & Barto, 1998). When the environment of an agent is a Markov decision process (MDP) or a partially observable Markov decision process (POMDP), exploratory behaviour lets the agent obtain reward or punishment from the environment, and the action policy is then modified so as to acquire more reward. When the prediction error for a time series is treated as the reward or punishment from the environment, RL can be used to train predictors built from neural networks.

In this chapter, two kinds of neural forecasting systems using RL are introduced in detail: a self-organizing fuzzy neural network (SOFNN) (Kuremoto et al., 2003) and a multi-layer perceptron (MLP) predictor (Kuremoto et al., 2005). Experiments on the Lorenz chaotic time series showed the efficiency of the method compared with a conventional learning method (BP).

2. Architecture of neural forecasting system

The general flow of neural forecasting is shown in Fig. 1. The time series value $y(t)$ at step $t$ can be embedded into a new $n$-dimensional space $\mathbf{x}(t)$ according to Takens' theorem (Takens, 1981). Eq. (1) gives the reconstructed vector, which serves as the input layer of the NN; here $\tau$ is an arbitrary delay. An example of a 3-dimensional reconstruction is shown in Fig. 2. The output layer of a neural forecasting system usually has one neuron, whose output $\hat{y}(t+1)$ is the prediction result.

$$\mathbf{x}(t) = (x_1(t), x_2(t), \ldots, x_n(t)) = (y(t), y(t-\tau), \ldots, y(t-(n-1)\tau)) \tag{1}$$
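The delay embedding of Eq. (1) can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' code: the function name `delay_embed` and the toy series are hypothetical, and vectors are returned only for the time steps at which all n delayed samples exist.

```python
def delay_embed(y, n, tau):
    """Return the vectors x(t) = (y(t), y(t - tau), ..., y(t - (n - 1) * tau)).

    One vector is produced for every t at which all n delayed samples exist,
    i.e. t = (n - 1) * tau, ..., len(y) - 1.
    """
    if n < 1 or tau < 1:
        raise ValueError("n and tau must be positive integers")
    start = (n - 1) * tau
    return [[y[t - i * tau] for i in range(n)] for t in range(start, len(y))]


# Toy usage: a made-up series embedded with n = 3 and tau = 2.
series = [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
vectors = delay_embed(series, n=3, tau=2)
print(vectors[0])  # the vector x(4) = (y(4), y(2), y(0))
```

Each returned vector would be paired with the next sample y(t + 1) as the training target of a one-step-ahead predictor.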

Figure 1.

Flow chart of neural forecasting methods.

There are various NN architectures, including the MLP, RBFN, recurrent neural network (RNN), autoregressive recurrent neural network (ARNN), neuro-fuzzy hybrid network, ARIMA-NN hybrid model, SOFNN, and so on. The training rules of NNs also vary widely: besides well-known methods such as BP, orthogonal least squares (OLS), and fuzzy inference, they include evolutionary computation, e.g. genetic algorithms (GA), particle swarm optimization (PSO), and genetic programming (GP), as well as RL.

Figure 2.

Embedding a time series into a 3-dimensional space.

2.1. MLP with BP

The MLP, a feed-forward multi-layer network, is one of the best-known classical neural forecasting systems; its structure is shown in Fig. 3. BP is commonly used as its learning rule, and the system performs well in function approximation and nonlinear prediction.

For the hidden layer, let the number of neurons be $K$ and the output of neuron $k$ be $H_k$; the output of the MLP is then obtained by Eq. (2) and Eq. (3).

$$\hat{y}(t+1) = f\left( \sum_{k=1}^{K} w_{yk} H_k \right) \tag{2}$$
$$H_k = f\left( \sum_{i=1}^{n} w_{ki}\, x_i(t) \right) \tag{3}$$

Figure 3.

An MLP with n input neurons, one hidden layer, and one neuron in the output layer, using the BP training algorithm.

Here $w_{yk}$ and $w_{ki}$ represent the connections of the $k$th hidden neuron with the output neuron and with the input neurons, respectively. The activation function $f(u)$ is a sigmoid function (or hyperbolic tangent function) given by Eq. (4).

$$f(u) = \frac{1}{1 + \exp(-\beta u)} \tag{4}$$

The gradient parameter $\beta$ is usually set to 1.0, and, to match the output range of $f(u)$, the time series data should be scaled into (0.0, 1.0).

BP is a supervised learning algorithm: using sample data, it trains the NN to produce more accurate outputs by modifying all of the connections between layers. Conventionally, the error function is the mean square error given by Eq. (5).

$$E(W) = \frac{1}{S} \sum_{t=0}^{S-1} \left( y(t+1) - \hat{y}(t+1) \right)^2 \tag{5}$$

Here $S$ is the size of the training data set and $y(t+1)$ is the actual value of the time series. The error is minimized by adjusting the weights according to Eq. (6) and Eq. (7), with the network output given by Eq. (2) and Eq. (3).

$$W(w_{yk}, w_{ki})^{new} = \alpha\, W(w_{yk}, w_{ki})^{old} - \eta\, \Delta W(w_{yk}, w_{ki}) \tag{6}$$
$$\Delta W(w_{yk}, w_{ki}) = \left( \frac{\partial E}{\partial w_{yk}}, \frac{\partial E}{\partial w_{ki}} \right) \tag{7}$$

Here $\alpha$ is a discount parameter ($0.0 < \alpha \le 1.0$) and $\eta$ is the learning rate ($0.0 < \eta \le 1.0$). Training iterations are repeated until the error function has converged sufficiently.
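The forward pass of Eq. (2)-(4) and the batch update of Eq. (5)-(7) can be sketched in plain Python as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the random toy data, the made-up target function, and the choice of η = 0.1 with 500 epochs are all hypothetical and serve only to show the error decreasing.

```python
import math
import random


def sigmoid(u, beta=1.0):
    """Eq. (4)."""
    return 1.0 / (1.0 + math.exp(-beta * u))


def forward(x, w_ki, w_yk, beta=1.0):
    """Eq. (3) then Eq. (2): hidden outputs H_k and the prediction y_hat."""
    hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)), beta) for row in w_ki]
    y_hat = sigmoid(sum(w * h for w, h in zip(w_yk, hidden)), beta)
    return hidden, y_hat


def mse(samples, w_ki, w_yk, beta=1.0):
    """Eq. (5): mean square error over (x, y) training pairs."""
    errors = [(y - forward(x, w_ki, w_yk, beta)[1]) ** 2 for x, y in samples]
    return sum(errors) / len(errors)


def gradients(samples, w_ki, w_yk, beta=1.0):
    """Eq. (7): partial derivatives of E(W) with respect to every weight."""
    s = len(samples)
    g_yk = [0.0] * len(w_yk)
    g_ki = [[0.0] * len(row) for row in w_ki]
    for x, y in samples:
        hidden, y_hat = forward(x, w_ki, w_yk, beta)
        # Derivative of this sample's share of E(W) w.r.t. the output neuron's input.
        d_out = -2.0 / s * (y - y_hat) * beta * y_hat * (1.0 - y_hat)
        for k, h in enumerate(hidden):
            g_yk[k] += d_out * h
            d_hid = d_out * w_yk[k] * beta * h * (1.0 - h)
            for i, xi in enumerate(x):
                g_ki[k][i] += d_hid * xi
    return g_ki, g_yk


def bp_update(samples, w_ki, w_yk, eta, alpha=1.0, beta=1.0):
    """Eq. (6): W_new = alpha * W_old - eta * dE/dW."""
    g_ki, g_yk = gradients(samples, w_ki, w_yk, beta)
    new_ki = [[alpha * w - eta * g for w, g in zip(row, g_row)]
              for row, g_row in zip(w_ki, g_ki)]
    new_yk = [alpha * w - eta * g for w, g in zip(w_yk, g_yk)]
    return new_ki, new_yk


# Toy usage: a 3 : 6 : 1 network fitted to a made-up smooth target in (0, 1).
rng = random.Random(0)
samples = []
for _ in range(20):
    x = [rng.random() for _ in range(3)]
    samples.append((x, 0.2 + 0.6 * x[0] * x[1]))
w_ki = [[rng.random() for _ in range(3)] for _ in range(6)]
w_yk = [rng.random() for _ in range(6)]
initial_error = mse(samples, w_ki, w_yk)
for _ in range(500):
    w_ki, w_yk = bp_update(samples, w_ki, w_yk, eta=0.1)
final_error = mse(samples, w_ki, w_yk)
print(initial_error, final_error)
```

With α = 1.0 the update reduces to ordinary batch gradient descent; a value of α below 1.0 additionally shrinks the old weights at every step.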

Figure 4.

An MLP with n input neurons, two hidden layers, and one neuron in the output layer, using the RL training algorithm.

2.2. MLP with RL

One important feature of RL is its stochastic action policy, which enables the exploration of adaptive solutions. Fig. 4 shows an MLP whose output layer is a single neuron with a Gaussian function. A hidden layer holding the parameters $\mu$ and $\sigma$ of this distribution function is added. The activation function of the units in each hidden layer is still the sigmoid function (or hyperbolic tangent function) (Eq. (8)-(10)).

$$\mu = \frac{1}{1 + \exp\left( -\beta_1 \sum_{k} R_k w_{\mu k} \right)} \tag{8}$$
$$\sigma = \frac{1}{1 + \exp\left( -\beta_2 \sum_{k} R_k w_{\sigma k} \right)} \tag{9}$$
$$R_k = \frac{1}{1 + \exp\left( -\beta_3 \sum_{i} x_i(t)\, w_{ki} \right)} \tag{10}$$

The prediction value is then given according to the probability density of Eq. (11).

$$\pi(\hat{y}(t+1), \mathbf{w}, \mathbf{x}(t)) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\left\{ -\frac{(\hat{y}(t+1) - \mu)^2}{2\sigma^2} \right\} \tag{11}$$

Here $\beta_1$, $\beta_2$, $\beta_3$ are gradient constants, and $\mathbf{w}$ $(w_{\mu k}, w_{\sigma k}, w_{ki})$ represents the connections of the $k$th hidden neuron with the neurons $\mu$ and $\sigma$ in the stochastic hidden layer and with the input neurons, respectively. The modification of $\mathbf{w}$ is calculated by the RL algorithm described in section 3.
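To make the stochastic output concrete, the following sketch evaluates Eq. (8)-(10) and then draws a prediction from the Gaussian policy of Eq. (11). It is an illustration only: the function names, the random toy weights, the 3-input, 4-hidden-unit size, and the gradient constants set to 1.0 are assumptions, not the settings used in the chapter's experiments.

```python
import math
import random


def sigmoid(u, beta):
    """Sigmoid with gradient constant beta."""
    return 1.0 / (1.0 + math.exp(-beta * u))


def policy_parameters(x, w_ki, w_mu, w_sigma, betas):
    """Eq. (10), (8), (9): hidden outputs R_k, then mean mu and deviation sigma."""
    beta1, beta2, beta3 = betas
    r = [sigmoid(sum(w * xi for w, xi in zip(row, x)), beta3) for row in w_ki]
    mu = sigmoid(sum(rk * w for rk, w in zip(r, w_mu)), beta1)
    sigma = sigmoid(sum(rk * w for rk, w in zip(r, w_sigma)), beta2)
    return r, mu, sigma


def policy_density(y_hat, mu, sigma):
    """Eq. (11): Gaussian density of the candidate prediction y_hat."""
    scale = math.sqrt(2.0 * math.pi) * sigma
    return math.exp(-(y_hat - mu) ** 2 / (2.0 * sigma ** 2)) / scale


# Toy usage with made-up weights: 3 inputs, 4 hidden units.
rng = random.Random(1)
x = [0.2, 0.5, 0.8]
w_ki = [[rng.random() for _ in x] for _ in range(4)]
w_mu = [rng.random() for _ in range(4)]
w_sigma = [rng.random() for _ in range(4)]
r, mu, sigma = policy_parameters(x, w_ki, w_mu, w_sigma, betas=(1.0, 1.0, 1.0))
y_hat = rng.gauss(mu, sigma)  # stochastic prediction drawn from the policy
print(mu, sigma, y_hat, policy_density(y_hat, mu, sigma))
```

Because the prediction is sampled rather than computed deterministically, repeated calls with the same input explore different outputs around $\mu$, with $\sigma$ controlling the width of that exploration.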

2.3. SOFNN with RL

A neuro-fuzzy hybrid forecasting system, SOFNN, trained with the RL algorithm is shown in Fig. 5. A hidden layer consisting of fuzzy membership functions $B_{ij}(x_i(t))$ is designed to categorize the input data of each dimension of $\mathbf{x}(t) = (x_1(t), x_2(t), \ldots, x_n(t))$, $t = 1, 2, \ldots, S$ (Eq. (12)).

Figure 5.

A SOFNN with n input neurons, three hidden layers, and one neuron in output layer using RL training algorithm.

The fuzzy inference $\lambda_k$, which calculates the fitness for an input set $\mathbf{x}(t)$, is executed by the fuzzy rules layer (Eq. (13)).

$$B_{ij}(x_i(t)) = \exp\left\{ -\frac{(x_i(t) - m_{ij})^2}{2\sigma_{ij}^2} \right\} \tag{12}$$
$$\lambda_k(\mathbf{x}(t)) = \prod_{i=1}^{n} B_{ic}(x_i(t)) \tag{13}$$

Here $i = 1, 2, \ldots, n$; $j$ indexes the membership functions of each input, of which there is 1 initially; $m_{ij}$ and $\sigma_{ij}$ are the mean and standard deviation of the $j$th membership function for input $x_i(t)$; and $c$ denotes the membership function connected to the $k$th rule, with $c \in j$ $(j = 1, 2, \ldots, l)$, where $l$ is the maximum number of membership functions. If an adaptive threshold on $B_{ij}(x_i(t))$ is used, membership functions and rules can be added and combined automatically, which gives the network a self-organizing ability to deal with different features of the inputs.

The outputs of the neurons $\mu$ and $\sigma$ in the stochastic layer are given by Eq. (14) and Eq. (15), respectively.

$$\mu = \frac{\sum_{k} \lambda_k w_{\mu k}}{\sum_{k} \lambda_k} \tag{14}$$
$$\sigma = \frac{\sum_{k} \lambda_k w_{\sigma k}}{\sum_{k} \lambda_k} \tag{15}$$

Here $w_{\mu k}$ and $w_{\sigma k}$ are the connections between $\mu$, $\sigma$ and the rules, and $\mu$, $\sigma$ are the mean and standard deviation of the stochastic function $\pi(\hat{y}(t+1), \mathbf{w}, \mathbf{x}(t))$ given by Eq. (11). The output of the system is obtained by generating a random value according to this probability function.
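A forward pass through a fixed SOFNN structure, following Eq. (12)-(15), might look like the sketch below. It is illustrative only: the function names, the representation of a rule as a tuple of membership-function indices (one per input), and all numeric values are assumptions, and the self-organizing step that adds membership functions and rules is not modelled.

```python
import math
import random


def membership(x_i, m, s):
    """Eq. (12): Gaussian membership value B_ij(x_i)."""
    return math.exp(-(x_i - m) ** 2 / (2.0 * s ** 2))


def sofnn_forward(x, means, sigmas, rules, w_mu, w_sigma):
    """Eq. (13)-(15): rule fitness lambda_k, then policy mean and deviation."""
    b = [[membership(xi, m, s) for m, s in zip(means[i], sigmas[i])]
         for i, xi in enumerate(x)]
    # rule[i] is the index c of the membership function that rule k uses for input i.
    lam = [math.prod(b[i][c] for i, c in enumerate(rule)) for rule in rules]
    total = sum(lam)
    mu = sum(l * w for l, w in zip(lam, w_mu)) / total
    sigma = sum(l * w for l, w in zip(lam, w_sigma)) / total
    return lam, mu, sigma


# Toy usage: 2 inputs, 2 membership functions each, 3 rules (all values made up).
x = [0.3, 0.7]
means = [[0.2, 0.6], [0.5, 0.9]]
sigmas = [[0.3, 0.4], [0.25, 0.35]]
rules = [(0, 0), (1, 0), (1, 1)]
w_mu = [0.2, 0.5, 0.8]
w_sigma = [0.1, 0.2, 0.15]
lam, mu, sigma = sofnn_forward(x, means, sigmas, rules, w_mu, w_sigma)
y_hat = random.Random(2).gauss(mu, sigma)  # system output drawn from Eq. (11)
print(lam, mu, sigma, y_hat)
```

Since $\mu$ and $\sigma$ are fitness-weighted averages, they always lie between the smallest and largest rule weights, so positive $w_{\sigma k}$ keep the deviation positive.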

3. SGA of RL

3.1. Algorithm of SGA

An RL algorithm, Stochastic Gradient Ascent (SGA), was proposed by Kimura and Kobayashi (Kimura et al., 1996; Kimura & Kobayashi, 1998) to deal with POMDPs and continuous action spaces. Their experiments showed that the SGA learning algorithm succeeded on cart-pole control and maze problems. In time series forecasting, the output of the predictor can be regarded as an action of the agent, and the prediction error can be used as reward or punishment from the environment, so SGA can be used to train a neural forecasting system by renewing the internal variable vector of the NN (Kuremoto et al., 2003, 2005).

The SGA algorithm is given below.

Step 1. Observe an input $\mathbf{x}(t)$ from the training data of the time series.

Step 2. Predict a future value $\hat{y}(t+1)$ according to the probability $\pi(\hat{y}(t+1), \mathbf{w}, \mathbf{x}(t))$.

Step 3. Receive the immediate reward $r_t$ by calculating the prediction error.

$$r_t = \begin{cases} r & \text{if } |\hat{y}(t+1) - y(t+1)| \le \varepsilon \\ -r & \text{if } |\hat{y}(t+1) - y(t+1)| > \varepsilon \end{cases} \tag{16}$$
Here $r$ and $\varepsilon$ are evaluation constants greater than or equal to zero.

Step 4. Calculate the characteristic eligibility $e_i(t)$ and the eligibility trace $\bar{D}_i(t)$.

$$e_i(t) = \frac{\partial}{\partial w_i} \ln\left\{ \pi(\hat{y}(t+1), \mathbf{w}, \mathbf{x}(t)) \right\} \tag{17}$$
$$\bar{D}_i(t) = e_i(t) + \gamma\, \bar{D}_i(t-1) \tag{18}$$

Here $\gamma$ ($0 \le \gamma < 1$) is a discount factor and $w_i$ denotes the $i$th internal variable vector.

Step 5. Calculate $\Delta w_i(t)$ by Eq. (19).

$$\Delta w_i(t) = (r_t - b)\, \bar{D}_i(t) \tag{19}$$

Here $b$ denotes the reinforcement baseline.

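Step 6. Improve the policy by renewing its internal variable $\mathbf{w}$ by Eq. (20).

$$\mathbf{w} \leftarrow \mathbf{w} + \alpha_s\, \Delta \mathbf{w}(t) \tag{20}$$

Here $\Delta \mathbf{w}(t) = (\Delta w_1(t), \Delta w_2(t), \ldots, \Delta w_i(t), \ldots)$ denotes the modification of the synaptic weights and other internal variables of the forecasting system, and $\alpha_s$ is a positive learning rate.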

Step 7. For next time step t+1, return to step 1.

The characteristic eligibility $e_i(t)$, shown in Eq. (17), expresses how the policy function changes with the internal variable vector of the system (Williams, 1992). In effect, the algorithm combines the reward or punishment with this eligibility in step 4 and step 5 to modify the stochastic policy by renewing its internal variables. Training finishes when the prediction error on the sample data has converged sufficiently.
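Steps 3 to 6 of the algorithm above can be sketched independently of any particular network, with the characteristic eligibilities of Eq. (17) passed in as a plain list. This is a minimal sketch under stated assumptions: the function names and toy numbers are hypothetical, an error exactly equal to ε is treated as inside the tolerance, and the baseline b defaults to 0.0, a value the text does not specify.

```python
def reward_signal(y_hat, y, r, eps):
    """Eq. (16): +r inside the tolerance eps, -r outside it."""
    return r if abs(y_hat - y) <= eps else -r


def sga_step(weights, traces, eligibilities, reward, gamma, alpha_s, baseline=0.0):
    """Steps 4 to 6 for one time step; returns the new weights and traces."""
    # Eq. (18): accumulate the discounted eligibility trace.
    new_traces = [e + gamma * d for e, d in zip(eligibilities, traces)]
    # Eq. (19) and Eq. (20): reward-weighted ascent on the internal variables.
    new_weights = [w + alpha_s * (reward - baseline) * d
                   for w, d in zip(weights, new_traces)]
    return new_weights, new_traces


# Toy usage with made-up numbers (two internal variables).
reward = reward_signal(y_hat=0.52, y=0.50, r=1.0, eps=0.1)
weights, traces = sga_step(weights=[0.5, -0.2], traces=[0.0, 1.0],
                           eligibilities=[1.0, 2.0], reward=reward,
                           gamma=0.9, alpha_s=0.1)
print(reward, weights, traces)
```

In a full training loop the returned traces would be fed back into the next call, so that rewards received at later steps also credit earlier parameter directions, discounted by γ.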

3.2. SGA for MLP

For the MLP forecasting system described in section 2.2 (Fig. 4), the characteristic eligibilities $e_i(t)$ of Eq. (21)-(23) can be derived from Eq. (8)-(11) for the internal variables $w_{\mu k}$, $w_{\sigma k}$, $w_{ki}$, respectively.

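$$e_{w_{\mu k}} = \frac{\partial}{\partial w_{\mu k}} \ln(\pi) = \frac{\partial \{\ln(\pi)\}}{\partial \mu} \frac{\partial \mu}{\partial w_{\mu k}} = \beta_1 R_k\, \mu (1 - \mu)\, \frac{\hat{y}(t+1) - \mu}{\sigma^2} \tag{21}$$
$$e_{w_{\sigma k}} = \frac{\partial}{\partial w_{\sigma k}} \ln(\pi) = \frac{\partial \{\ln(\pi)\}}{\partial \sigma} \frac{\partial \sigma}{\partial w_{\sigma k}} = \beta_2 R_k\, \sigma (1 - \sigma)\, \frac{1}{\sigma} \left( \frac{(\hat{y}(t+1) - \mu)^2}{\sigma^2} - 1 \right) \tag{22}$$
$$e_{w_{ki}} = \frac{\partial}{\partial w_{ki}} \ln(\pi) = \frac{\partial \{\ln(\pi)\}}{\partial \mu} \frac{\partial \mu}{\partial R_k} \frac{\partial R_k}{\partial w_{ki}} + \frac{\partial \{\ln(\pi)\}}{\partial \sigma} \frac{\partial \sigma}{\partial R_k} \frac{\partial R_k}{\partial w_{ki}} = \beta_3\, x_i(t)\, (1 - R_k) \left( w_{\mu k}\, e_{w_{\mu k}} + w_{\sigma k}\, e_{w_{\sigma k}} \right) \tag{23}$$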

The initial values of $w_{\mu k}$, $w_{\sigma k}$, $w_{ki}$ are random numbers in (0, 1) at the first iteration of training. The gradient constants $\beta_1$, $\beta_2$, $\beta_3$ and the reward parameters $r$, $\varepsilon$ of Eq. (16) are set empirically.
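The eligibilities of Eq. (21)-(23) can be checked numerically against a finite-difference derivative of ln π. The sketch below is an illustration with hypothetical names and made-up weights; Eq. (22) is evaluated as the chain-rule product of ∂ln π/∂σ from Eq. (11) and ∂σ/∂w_σk = β2 R_k σ(1 − σ) from Eq. (9), and the gradient constants are deliberately mild so that the sigmoids do not saturate.

```python
import math


def sigmoid(u, beta):
    return 1.0 / (1.0 + math.exp(-beta * u))


def forward(x, w_ki, w_mu, w_sigma, betas):
    """Eq. (10), (8), (9): hidden outputs R_k, mean mu, deviation sigma."""
    beta1, beta2, beta3 = betas
    r = [sigmoid(sum(w * xi for w, xi in zip(row, x)), beta3) for row in w_ki]
    mu = sigmoid(sum(rk * w for rk, w in zip(r, w_mu)), beta1)
    sigma = sigmoid(sum(rk * w for rk, w in zip(r, w_sigma)), beta2)
    return r, mu, sigma


def log_policy(y_hat, mu, sigma):
    """Natural logarithm of the Gaussian policy of Eq. (11)."""
    return -math.log(math.sqrt(2.0 * math.pi) * sigma) - (y_hat - mu) ** 2 / (2.0 * sigma ** 2)


def eligibilities(x, y_hat, w_ki, w_mu, w_sigma, betas):
    """Eq. (21)-(23): derivatives of ln(pi) with respect to every weight."""
    beta1, beta2, beta3 = betas
    r, mu, sigma = forward(x, w_ki, w_mu, w_sigma, betas)
    d_mu = (y_hat - mu) / sigma ** 2                             # d ln(pi) / d mu
    d_sigma = ((y_hat - mu) ** 2 / sigma ** 2 - 1.0) / sigma     # d ln(pi) / d sigma
    e_mu = [beta1 * rk * mu * (1.0 - mu) * d_mu for rk in r]
    e_sigma = [beta2 * rk * sigma * (1.0 - sigma) * d_sigma for rk in r]
    e_ki = [[beta3 * xi * (1.0 - rk) * (w_mu[k] * e_mu[k] + w_sigma[k] * e_sigma[k])
             for xi in x] for k, rk in enumerate(r)]
    return e_mu, e_sigma, e_ki


# Toy check with made-up values: 3 inputs, 2 hidden units.
x = [0.2, 0.5, 0.8]
w_ki = [[0.3, 0.7, 0.1], [0.9, 0.2, 0.4]]
w_mu = [0.6, 0.3]
w_sigma = [0.5, 0.8]
betas = (2.0, 1.5, 1.0)
y_hat = 0.55
e_mu, e_sigma, e_ki = eligibilities(x, y_hat, w_ki, w_mu, w_sigma, betas)


def log_pi_shifted(h):
    """ln(pi) with w_ki[1][2] shifted by h, used for a finite-difference check."""
    shifted = [row[:] for row in w_ki]
    shifted[1][2] += h
    _, mu, sigma = forward(x, shifted, w_mu, w_sigma, betas)
    return log_policy(y_hat, mu, sigma)


numeric = (log_pi_shifted(1e-6) - log_pi_shifted(-1e-6)) / 2e-6
print(e_ki[1][2], numeric)  # the two values should agree closely
```

A check of this kind is a cheap way to catch sign or index slips before the eligibilities are used in the SGA update.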

3.3. SGA for SOFNN

For the SOFNN forecasting system described in section 2.3 (Fig. 5), the characteristic eligibilities $e_i(t)$ of Eq. (24)-(27) can be derived from Eq. (11)-(15) for the internal variables $w_{\mu k}$, $w_{\sigma k}$, $m_{ij}$, $\sigma_{ij}$, respectively.

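$$e_{w_{\mu k}} = \frac{\partial}{\partial w_{\mu k}} \ln(\pi) = \frac{\partial \{\ln(\pi)\}}{\partial \mu} \frac{\partial \mu}{\partial w_{\mu k}} = \frac{\hat{y}(t+1) - \mu}{\sigma^2}\, \frac{\lambda_k}{\sum_{k} \lambda_k} \tag{24}$$
$$e_{w_{\sigma k}} = \frac{\partial}{\partial w_{\sigma k}} \ln(\pi) = \frac{\partial \{\ln(\pi)\}}{\partial \sigma} \frac{\partial \sigma}{\partial w_{\sigma k}} = \frac{1}{\sigma} \left( \frac{(\hat{y}(t+1) - \mu)^2}{\sigma^2} - 1 \right) \frac{\lambda_k}{\sum_{k} \lambda_k} \tag{25}$$
$$e_{m_{ij}} = \frac{\partial}{\partial m_{ij}} \ln(\pi) = \sum_{k} \left\{ \left( \frac{\partial \{\ln(\pi)\}}{\partial \mu} \frac{\partial \mu}{\partial \lambda_k} + \frac{\partial \{\ln(\pi)\}}{\partial \sigma} \frac{\partial \sigma}{\partial \lambda_k} \right) \frac{\partial \lambda_k}{\partial B_{ij}} \right\} \frac{\partial B_{ij}}{\partial m_{ij}} = \sum_{k} \left\{ \left( \frac{\hat{y}(t+1) - \mu}{\sigma^2}\, \frac{w_{\mu k} - \mu}{\sum_{k'} \lambda_{k'}} + \frac{1}{\sigma} \left( \frac{(\hat{y}(t+1) - \mu)^2}{\sigma^2} - 1 \right) \frac{w_{\sigma k} - \sigma}{\sum_{k'} \lambda_{k'}} \right) \frac{\lambda_k}{B_{ij}} \right\} \frac{x_i(t) - m_{ij}}{\sigma_{ij}^2}\, B_{ij} \tag{26}$$
$$e_{\sigma_{ij}} = \frac{\partial}{\partial \sigma_{ij}} \ln(\pi) = \sum_{k} \left\{ \left( \frac{\partial \{\ln(\pi)\}}{\partial \mu} \frac{\partial \mu}{\partial \lambda_k} + \frac{\partial \{\ln(\pi)\}}{\partial \sigma} \frac{\partial \sigma}{\partial \lambda_k} \right) \frac{\partial \lambda_k}{\partial B_{ij}} \right\} \frac{\partial B_{ij}}{\partial \sigma_{ij}} = \sum_{k} \left\{ \left( \frac{\hat{y}(t+1) - \mu}{\sigma^2}\, \frac{w_{\mu k} - \mu}{\sum_{k'} \lambda_{k'}} + \frac{1}{\sigma} \left( \frac{(\hat{y}(t+1) - \mu)^2}{\sigma^2} - 1 \right) \frac{w_{\sigma k} - \sigma}{\sum_{k'} \lambda_{k'}} \right) \frac{\lambda_k}{B_{ij}} \right\} \frac{(x_i(t) - m_{ij})^2}{\sigma_{ij}^3}\, B_{ij} \tag{27}$$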

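Here the membership function $B_{ij}$ is described by Eq. (12) and the fuzzy inference $\lambda_k$ by Eq. (13); the outer sums over $k$ in Eq. (26) and Eq. (27) run over the rules connected to the $j$th membership function of input $x_i(t)$. The initial values of $w_{\mu k}$, $w_{\sigma k}$, $m_{ij}$, $\sigma_{ij}$ are random numbers in (0, 1) at the first iteration of training. The reward $r$ and the evaluation error threshold $\varepsilon$ of Eq. (16) are set empirically.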
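The SOFNN eligibilities of Eq. (24)-(27) can be verified in the same way, by comparing them with finite differences of ln π. This is a sketch with hypothetical names and made-up parameter values on a fixed structure; a rule is represented as a tuple of membership-function indices (an assumption), and the factor $B_{ij}$ is cancelled analytically so that no division by a small membership value occurs.

```python
import math


def forward(x, means, sigmas, rules, w_mu, w_sigma):
    """Eq. (12)-(15) for a fixed structure; rule[i] indexes input i's membership function."""
    b = [[math.exp(-(xi - m) ** 2 / (2.0 * s * s)) for m, s in zip(means[i], sigmas[i])]
         for i, xi in enumerate(x)]
    lam = [math.prod(b[i][c] for i, c in enumerate(rule)) for rule in rules]
    total = sum(lam)
    mu = sum(l * w for l, w in zip(lam, w_mu)) / total
    sigma = sum(l * w for l, w in zip(lam, w_sigma)) / total
    return lam, total, mu, sigma


def log_policy(y_hat, mu, sigma):
    """Natural logarithm of the Gaussian policy of Eq. (11)."""
    return -math.log(math.sqrt(2.0 * math.pi) * sigma) - (y_hat - mu) ** 2 / (2.0 * sigma ** 2)


def eligibilities(x, y_hat, means, sigmas, rules, w_mu, w_sigma):
    """Eq. (24)-(27): derivatives of ln(pi) for weights, means and deviations."""
    lam, total, mu, sigma = forward(x, means, sigmas, rules, w_mu, w_sigma)
    d_mu = (y_hat - mu) / sigma ** 2
    d_sigma = ((y_hat - mu) ** 2 / sigma ** 2 - 1.0) / sigma
    e_w_mu = [d_mu * l / total for l in lam]
    e_w_sigma = [d_sigma * l / total for l in lam]
    e_m = [[0.0] * len(row) for row in means]
    e_s = [[0.0] * len(row) for row in sigmas]
    for k, rule in enumerate(rules):
        common = (d_mu * (w_mu[k] - mu) + d_sigma * (w_sigma[k] - sigma)) * lam[k] / total
        for i, j in enumerate(rule):
            # (lambda_k / B_ij) * dB_ij/dm_ij simplifies to lambda_k * (x_i - m_ij) / sigma_ij^2.
            diff = x[i] - means[i][j]
            e_m[i][j] += common * diff / sigmas[i][j] ** 2
            e_s[i][j] += common * diff ** 2 / sigmas[i][j] ** 3
    return e_w_mu, e_w_sigma, e_m, e_s


def shifted(table, i, j, h):
    """Copy of a nested list with entry [i][j] moved by h."""
    copy = [row[:] for row in table]
    copy[i][j] += h
    return copy


# Toy check: 2 inputs, 2 membership functions each, 3 rules (all values made up).
x, y_hat = [0.3, 0.7], 0.45
means = [[0.2, 0.6], [0.5, 0.9]]
sigmas = [[0.3, 0.4], [0.25, 0.35]]
rules = [(0, 0), (1, 0), (1, 1)]
w_mu, w_sigma = [0.2, 0.5, 0.8], [0.1, 0.2, 0.15]
e_w_mu, e_w_sigma, e_m, e_s = eligibilities(x, y_hat, means, sigmas, rules, w_mu, w_sigma)


def log_pi(means_, sigmas_):
    _, _, mu, sigma = forward(x, means_, sigmas_, rules, w_mu, w_sigma)
    return log_policy(y_hat, mu, sigma)


h = 1e-6
numeric_m = (log_pi(shifted(means, 0, 1, h), sigmas) - log_pi(shifted(means, 0, 1, -h), sigmas)) / (2 * h)
numeric_s = (log_pi(means, shifted(sigmas, 1, 0, h)) - log_pi(means, shifted(sigmas, 1, 0, -h))) / (2 * h)
print(e_m[0][1], numeric_m)
print(e_s[1][0], numeric_s)
```

The two parameters checked here are each shared by two rules, so the comparison also exercises the sum over rules in Eq. (26) and Eq. (27).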

4. Experiments

A chaotic time series generated by the Lorenz equations was used as the benchmark for three forecasting experiments: MLP using BP, MLP using SGA, and SOFNN using SGA. Prediction precision was evaluated by the mean square error (MSE) between the forecasted values and the time series data.

4.1. Lorenz chaos

A butterfly-like attractor generated by three ordinary differential equations (Eq. (28)) is famous from the early stage of the study of chaotic phenomena (Lorenz, 1963).

$$\begin{cases} \dot{o}(t) = \delta\, p(t) - \delta\, o(t) \\ \dot{p}(t) = -o(t)\, q(t) + \varphi\, o(t) - p(t) \\ \dot{q}(t) = o(t)\, p(t) - \phi\, q(t) \end{cases} \tag{28}$$

Here $\delta$, $\varphi$, $\phi$ are constants. In the forecasting experiments, the chaotic time series was obtained from the component $o(t)$ of the discretized system of Eq. (29), where $\Delta t = 0.005$, $\delta = 16.0$, $\varphi = 45.92$, $\phi = 4.0$.

$$\begin{cases} o(t+1) = o(t) + \Delta t\, \delta\, (p(t) - o(t)) \\ p(t+1) = p(t) - \Delta t\, (o(t)\, q(t) - \varphi\, o(t) + p(t)) \\ q(t+1) = q(t) + \Delta t\, (o(t)\, p(t) - \phi\, q(t)) \end{cases} \tag{29}$$

The training sample consisted of 1,000 data points, and the following 500 data points served as unknown data for evaluating the accuracy of short-term (i.e. one-step-ahead) prediction.
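The data generation just described can be sketched as follows. This is an illustration rather than the authors' procedure: the initial state (1.0, 1.0, 1.0) and the min-max rescaling based on the training range are assumptions, since the text gives neither, and the names are hypothetical. The parameter `varphi` multiplies o(t) in the p-equation and `phi` multiplies q(t) in the q-equation.

```python
def lorenz_series(steps, dt=0.005, delta=16.0, varphi=45.92, phi=4.0,
                  state=(1.0, 1.0, 1.0)):
    """Iterate Eq. (29) and return the o(t) component as a list."""
    o, p, q = state
    series = []
    for _ in range(steps):
        series.append(o)
        # All three right-hand sides use the values from the current step.
        o, p, q = (o + dt * delta * (p - o),
                   p - dt * (o * q - varphi * o + p),
                   q + dt * (o * p - phi * q))
    return series


def rescale(values, low, high):
    """Map values linearly so that low -> 0 and high -> 1."""
    return [(v - low) / (high - low) for v in values]


raw = lorenz_series(1500)
train_raw, test_raw = raw[:1000], raw[1000:]
low, high = min(train_raw), max(train_raw)
train = rescale(train_raw, low, high)
test = rescale(test_raw, low, high)  # may fall slightly outside [0, 1]
print(len(train), len(test))
```

Scaling the test segment with the training range avoids using information from the unknown data, at the cost of test values that can leave the unit interval.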

4.2. Experiment of MLP using BP

Constructing a good MLP architecture for nonlinear prediction is both important and difficult. An experimental study (Oliveira et al., 2000) reported different prediction results for the Lorenz time series with the architecture n : 2n : n : 1, where n denotes the embedding dimension; the cases n = 2, 3, 4 were investigated for predictions over different terms, including long-term prediction.

Figure 6.

Prediction results after 2,000 iterations of training by MLP using BP.

Figure 7.

Prediction error (MSE) in training iteration of MLP using BP.

For short-term prediction here, a three-layer MLP with the 3 : 6 : 1 structure shown in Fig. 3 was trained by BP, with time delay $\tau = 1$ used for embedding the input space. The gradient constant of the sigmoid function was $\beta = 1.0$, the discount constant $\alpha = 1.0$, and the learning rate $\eta = 0.01$, and the finish condition of training was set to $E(W) < 5.0 \times 10^{-4}$. The prediction results after 2,000 training iterations are shown in Fig. 6, and the change of prediction error over the training iterations is shown in Fig. 7. The one-step-ahead prediction results are shown in Fig. 8. The MSE of one-step-ahead forecasting over the 500 steps by MLP using BP was 0.0129.

Figure 8.

One-step ahead forecasting results by MLP using BP.

4.3. Experiment of MLP using SGA

A four-layer MLP forecasting system with the 3 : 60 : 2 : 1 structure shown in Fig. 4 was trained by SGA, with time delay $\tau = 1$ used for embedding the input space. The gradient constants of the sigmoid functions were $\beta_1 = 8.0$, $\beta_2 = 18.0$, $\beta_3 = 10.0$, the discount constant was $\gamma = 0.9$, and the learning rates were $\alpha_{w_{ki}} = \alpha_{w_{\sigma k}} = 2.0 \times 10^{-6}$, $\alpha_{w_{\mu k}} = 2.0 \times 10^{-5}$. The reward was set by Eq. (30), and training was stopped after 30,000 iterations, by which point E(W) was observed to have converged. The prediction results after 0, 5,000, and 30,000 training iterations are shown in Fig. 9, Fig. 10 and Fig. 11, respectively. The change of prediction error during training is shown in Fig. 12. The one-step-ahead prediction results are shown in Fig. 13. The MSE of one-step-ahead forecasting over the 500 steps by MLP using SGA was 0.0112, a forecasting accuracy 13.2% higher than that of MLP using BP.

Figure 9.

Prediction results before iteration by MLP using SGA.

Figure 10.

Prediction results after 5,000 iterations of training by MLP using SGA.

Figure 11.

Prediction results after 30,000 iterations of training by MLP using SGA.

Figure 12.

Prediction error (MSE) in training iteration of MLP using SGA.

$$r_t = \begin{cases} 4.0 \times 10^{-4} & \text{if } |\hat{y}(t+1) - y(t+1)| \le 0.1 \\ -4.0 \times 10^{-4} & \text{if } |\hat{y}(t+1) - y(t+1)| > 0.1 \end{cases} \tag{30}$$

Figure 13.

One-step ahead forecasting results by MLP using SGA.

4.4. Experiment of SOFNN using SGA

A five-layer SOFNN forecasting system with the structure shown in Fig. 5 was trained by SGA, with time delay $\tau = 2$ used in 3-, 4-, or 5-dimensional embedding input spaces. The initial values of the weights $w_{\mu k}$ were random numbers in (0.0, 1.0), with $w_{\sigma k} = 0.5$, $m_{ij} = 0.0$, $\sigma_{ij} = 15.0$; the discount was $\gamma = 0.9$ and the learning rates were $\alpha_{m_{ij}} = \alpha_{\sigma_{ij}} = \alpha_{w_{\sigma k}} = 3.0 \times 10^{-6}$, $\alpha_{w_{\mu k}} = 2.0 \times 10^{-3}$. The reward $r$ was set by Eq. (31), and training was again stopped after 30,000 iterations, by which point E(W) was observed to have converged. The prediction results after training are shown in Fig. 14, where the number of input neurons was 4 and the data scale of the results was converted into (0.0, 1.0). The change of prediction error during training is shown in Fig. 15. The one-step-ahead prediction results are shown in Fig. 16. The MSE of one-step-ahead forecasting over the 500 steps by SOFNN using SGA was 0.00048, a forecasting accuracy 96.3% and 95.7% higher than that of MLP using BP and MLP using SGA, respectively.

$$r_t = \begin{cases} 1.5 & \text{if } |\hat{y}(t+1) - y(t+1)| \le 1.5 \\ -1.5 & \text{if } |\hat{y}(t+1) - y(t+1)| > 1.5 \end{cases} \tag{31}$$
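The accuracy comparisons between the three systems can be reproduced from the reported MSE values alone. The short sketch below assumes that an accuracy improvement means the relative reduction of the MSE, which is consistent with the 13.2% figure quoted for MLP using SGA; the function name is hypothetical.

```python
def relative_reduction(reference_mse, new_mse):
    """Percentage by which new_mse is lower than reference_mse."""
    return 100.0 * (reference_mse - new_mse) / reference_mse


# MSE values reported for the 500-step one-step-ahead forecasts.
mse_mlp_bp, mse_mlp_sga, mse_sofnn_sga = 0.0129, 0.0112, 0.00048

print(round(relative_reduction(mse_mlp_bp, mse_mlp_sga), 1))    # MLP-SGA vs MLP-BP
print(round(relative_reduction(mse_mlp_bp, mse_sofnn_sga), 1))  # SOFNN-SGA vs MLP-BP
print(round(relative_reduction(mse_mlp_sga, mse_sofnn_sga), 1)) # SOFNN-SGA vs MLP-SGA
```

Under this reading, the larger of the two SOFNN percentages belongs to the comparison with MLP using BP, since that system has the larger reference error.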

Figure 14.

Prediction results after 30,000 iterations of training by SOFNN using SGA.

Figure 15.

Prediction error (MSE) in training iteration of SOFNN using SGA.

Figure 16.

One-step ahead forecasting results by SOFNN using SGA.

Figure 17.

The number of membership function neurons of SOFNN using SGA increased in training experiment.

Figure 18.

The number of rules of SOFNN using SGA increased in training experiment.

One advanced feature of SOFNN is its data-driven structure building. The numbers of membership function neurons and rules increased with the samples (1,000 steps in the training experiment) and the iterations (30,000 in the training experiment), as can be confirmed in Fig. 17 and Fig. 18. When training finished, the numbers of membership function neurons for the 4 input neurons were 44, 44, 44, and 45, respectively, and the number of rules was 143.

5. Conclusion

Though RL has been developed as one of the most important methods of machine learning, it is still seldom adopted in forecasting theory and prediction systems. Two kinds of neural forecasting systems using SGA learning were described in this chapter, and the training and short-term forecasting experiments showed that they perform well compared with the conventional NN prediction method. Although MLP with SGA and SOFNN with SGA required more training iterations than MLP with BP, the computation time of each was no more than a few minutes on a computer with a 3.0 GHz CPU.

A problem of these RL forecasting systems is that the value of the reward in the SGA algorithm strongly influences learning convergence, so the optimum reward must be searched for experimentally for each time series. Another problem, specific to SOFNN with SGA, is how to tune the initial value of the deviation parameter of the membership functions and the threshold, which were also adjusted by observing the prediction error in training experiments. In fact, when SOFNN with SGA was applied to the neural forecasting competition "NN3", where 11 time series were used as the benchmark, it did not perform sufficiently well in long-term prediction compared with other methods (Kuremoto et al., 2007; Crone & Nikolopoulos, 2007). All these problems remain to be resolved, and RL forecasting systems are expected to develop considerably in the future.

Acknowledgments

We would like to thank Mr. Yamamoto A. and Mr. Teramori N. for their early experimental work. A part of this study was supported by MEXT-KAKENHI (15700161) and JSPS-KAKENHI (18500230).

References

  1. Box, G. E. P. & Jenkins, G. (1970). Time series analysis: Forecasting and control. Holden-Day, ISBN 0816211043, San Francisco
  2. Casdagli, M. (1989). Nonlinear prediction of chaotic time series. Physica D: Nonlinear Phenomena, 35, 335-356
  3. Crone, S. & Nikolopoulos, K. (2007). Results of the NN3 neural network forecasting competition. The 27th International Symposium on Forecasting, Program, 129
  4. Engle, R. F. (1982). Autoregressive conditional heteroscedasticity with estimates of the variance of U. K. inflation. Econometrica, 50, 987-1008
  5. Kimura, H., Yamamura, M. & Kobayashi, S. (1996). Reinforcement learning in partially observable Markov decision process: A stochastic gradient ascent (in Japanese). Journal of Japanese Society for Artificial Intelligence, 761-768
  6. Kimura, H. & Kobayashi, S. (1998). Reinforcement learning for continuous action using stochastic gradient ascent. Intelligent Autonomous Systems, 288-295
  7. Kodogiannis, V. & Lolis, A. (2002). Forecasting financial time series using neural network and fuzzy system-based techniques. Neural Computing & Applications, 11, 90-102
  8. Kuremoto, T., Obayashi, M., Yamamoto, A. & Kobayashi, K. (2003). Predicting chaotic time series by reinforcement learning. Proceedings of the 2nd International Conference on Computational Intelligence, Robotics and Autonomous Systems (CIRAS’03), Singapore
  9. Kuremoto, T., Obayashi, M. & Kobayashi, K. (2005). Nonlinear prediction by reinforcement learning. In: Lecture Notes in Computer Science, 3644, 1085-1094, Springer, ISSN 0302-9743, 1611-3349 (Online), Berlin
  10. Kuremoto, T., Obayashi, M. & Kobayashi, K. (2007). Forecasting time series by SOFNN with reinforcement learning. The 27th International Symposium on Forecasting, Program, 99
  11. Lendasse, A., Oja, E., Simula, O. & Verleysen, M. (2007). Time series prediction competition: The CATS benchmark. Neurocomputing, 70, 2325-2329
  12. Leung, H., Lo, T. & Wang, S. (2001). Prediction of noisy chaotic time series using an optimal radial basis function. IEEE Transactions on Neural Networks, 12, 1163-1172
  13. Lorenz, E. N. (1963). Deterministic nonperiodic flow. Journal of the Atmospheric Sciences, 20, 130-141
  14. May, R. M. (1976). Simple mathematical models with very complicated dynamics. Nature, 261, 459-467
  15. Oliveira, K. A., Vannucci, A. & Silva, E. C. (2000). Using artificial neural networks to forecast chaotic time series. Physica A, 284, 393-404
  16. Rumelhart, D. E., Hinton, G. E. & Williams, R. J. (1986). Learning representations by back-propagating errors. Nature, 323, 533-536
  17. Sutton, R. S. & Barto, A. G. (1998). Reinforcement learning: An introduction. The MIT Press, ISBN 0-262-19398-1
  18. Takens, F. (1981). Detecting strange attractors in turbulence. Lecture Notes in Mathematics, 898, 366-381, Springer-Verlag, Berlin
  19. Williams, R. J. (1992). Simple statistical gradient-following algorithms for connectionist reinforcement learning. Machine Learning, 8, 229-256
  20. Zhang, G. P. (2003). Time series forecasting using a hybrid ARIMA and neural network model. Neurocomputing, 50, 159-175
