In this chapter, we evaluate the forecasting performance of model combination and forecast combination for the dynamic factor model (DFM) and artificial neural networks (ANNs). For the model combination, factors extracted from a large dataset are used as additional inputs to the ANN model, producing the factor-augmented artificial neural network (FAANN). Linear and nonlinear forecast combining methods are used to combine the DFM and ANN forecasts. The results of the best combining method are compared with the forecasts of the FAANN model. The models are applied to forecast three time series variables using a large South African monthly dataset. The out-of-sample root-mean-square error (RMSE) results show that the FAANN model yields a substantial improvement over the individual and best combined forecasts from the DFM and ANN forecasting models and over the autoregressive (AR) benchmark model. Further, the Diebold-Mariano test results confirm the superiority of the FAANN model's forecasting performance over the AR benchmark model and the combined forecasts.
- artificial neural network
- dynamic factor model
- factor-augmented artificial neural network model
- forecast combination
1. Introduction
The prediction of economic or financial variables from related independent variables can be done either with a super model that contains all the available independent variables or with the forecast combination methodology. It is generally accepted in the econometrics literature that a forecast obtained by integrating all the information in one step is better than a combination of forecasts from individual models. For example, it has been argued that “The best forecast is obtained by combining information sets, not forecasts from information sets. If both models are known, one should combine the information that goes into the models, not the forecasts that come out of the models.” The authors of Refs. [13, 23, 25] expressed similar opinions. Investigators in this field thus appear to lean toward combining information in one model.
The main questions that arise in researchers’ minds are “To combine or not to combine” and “how to combine.” In this chapter, we are concerned with the question of “combining forecasts from different models or combining information in one model.” This is an area that has been discussed by many researchers but not in detail (see [9, 11, 12, 29, 35, 40]).
Huang  state that “the common belief that combination of information is better than combination of forecasts might be based on the in-sample analysis.” On the contrary, from out-of-sample analysis, they found that combination of forecasts performs better than combination of information. Many articles account for the out-of-sample success of combination of forecasts over combination of information by pointing out various disadvantages that combination of information may have. For example, (a) in many forecasting situations, particularly in real time, combination of information by pooling all information sets is either impossible or too expensive (see [12, 13, 42]); (b) in a data-rich environment where many closely related input variables are at hand, the super combination-of-information model may suffer from the exclusion problem; and (c) in the absence of linearity and simple dynamics, a model built by combining information is more likely to be misspecified. We believe that these points can be addressed through careful selection of the model used to estimate the combined information. In our case, we use artificial neural networks to overcome the nonlinearity that can be inherent in the series, while the factor model is used to tame the dimensionality problem by summarizing a large dataset in a small number of factors.
The seminal work of  opened the door to examining forecast combination in different fields of economics and finance. Consequently, a new strand of forecasting research combines the forecasts generated by individual models using different combination techniques. This allows the final forecast to draw on the strengths of the individual forecasting techniques in a way that no single method can. Empirically, forecast combinations have been used successfully in diverse areas such as forecasting gross national product, currency market volatility, inflation, money supply, stock prices, interest rates, meteorological data, city populations, and outcomes of football games.
Factor models were introduced in macroeconomics and finance by [22, 36]. The literature on large factor models starts with [19, 37]. Further theoretical advances were made by, among others, [4, 5, 20]. Following the successful performance of DFMs in forecasting, models augmented with factors were introduced. For example, Bernanke et al. proposed the factor-augmented vector autoregressive (FAVAR) model, which merges a factor model with a vector autoregressive component. A factor-augmented vector autoregressive moving average (VARMA) model was suggested by Dufour and Pelletier. The factor-augmented error correction model (FECM) was introduced by Banerjee and Marcellino; Ng and Stevanovic proposed a factor-augmented autoregressive distributed lag (FADL) framework for analyzing the dynamic effects of common and idiosyncratic shocks. Babikir and Mwambi introduced a factor-augmented artificial neural network (FAANN) that showed improved forecasts compared to the DFM and AR models.
Artificial neural networks (ANNs), for their part, have become one of the most widely used forecasting methods and have been applied extensively in different fields. Several features make ANNs attractive for forecasting work. First, ANNs are universal function approximators. Second, ANNs are a data-driven, self-adaptive approach in that few a priori assumptions need to be made about the model for the problem under study; thus, ANN modeling differs from classical model-based approaches. Third, an ANN is a nonlinear model, in contrast to conventional time series forecasting models, which assume linearity of the series under consideration; real-world systems have been shown to be often nonlinear. These advantages have attracted attention to ANNs in time series forecasting, where they have become a competitive alternative to traditional methods, and the literature in this area is vast. Hybrid approaches, or combined models, represent the most important developments in ANNs over the last decade. Several hybrid models of ANNs with other forecasting models have been introduced recently and have successfully improved forecasting performance. These include the integration of the generalized linear autoregression (GLAR) model with ANNs to obtain accurate forecasts for the foreign exchange market; a hybrid model called SARIMABP, which combines the seasonal autoregressive integrated moving average (SARIMA) model and the back-propagation neural network model to predict seasonal time series data; a hybrid of ANN and ARIMA models for forecasting; and a hybrid model in which factors are used as inputs to the ANN model, which produced more accurate forecasts than the ANN and the DFM.
In this chapter, using the artificial neural network framework and the factor model, we show for both in-sample and out-of-sample forecasting that combination of forecasts (of the dynamic factor model and artificial neural networks) can be outperformed by combination of models, or information, in which the factors are used as additional input variables to the artificial neural network.
To the best of our knowledge, evaluating the forecasting performance of the combination of information or models (factors and ANN, that is, the FAANN) against combinations of the ANN and DFM forecasts using different linear and nonlinear combination methods is new; this is the first such attempt in general and for South Africa in particular. The empirical results show sizable gains in the forecasting ability of the FAANN compared to the standard ANN, the DFM, and their forecast combinations. In other words, combination of models outperforms combination of forecasts, suggesting that combination of information can be better than combination of forecasts.
The remainder of the chapter is organized as follows: Section 2 briefly describes the DFM, ANN, and FAANN forecasting models and the combination techniques; Section 3 introduces the data; Section 4 presents the results obtained from the forecasting models and their combinations; finally, Section 5 gives a concise conclusion and some suggestions for future research.
2. Individual forecasting models and combination methods
In this section, we briefly introduce the notation, formulation, and estimation methods of the forecasting models; we also introduce and discuss the various combining methods.
2.1. Individual forecasting models
2.1.1. The dynamic factor model and the estimation of factors
This subsection describes how the DFM is used to extract common components from a large set of variables; these common components are then used to forecast the variables of interest.
Suppose that we observe a vector $X_t = (x_{1t}, \ldots, x_{Nt})'$ of N stationary time series at times t = 1,…, T, where each series is assumed to have zero mean. The factor model assumes that most of the variation in the dataset can be explained by a small number $q \ll N$ of factors collected in the vector $f_t$. The dynamic factor model representation can be expressed as follows:

$$X_t = \chi_t + \xi_t = \lambda(L) f_t + \xi_t, \qquad (1)$$

where $\chi_t$ is the common component driven by the factors $f_t$ and $\xi_t$ is the idiosyncratic component of each variable, that is, the portion of $X_t$ that cannot be explained by the common component. $\chi_t$ is a function of the $q \times 1$ vector of factors $f_t$; the operator $\lambda(L)$ is a lag polynomial with positive powers of the lag operator L, with $L f_t = f_{t-1}$. The static representation of the model can be written as

$$X_t = \Lambda F_t + \xi_t, \qquad (2)$$

where $F_t$ is a vector of static factors composed of the dynamic factors $f_t$ and all their lags. There are three different methods of estimating the factors from the data $X_t$. These include the methods developed by Stock and Watson (hereafter SW) and by Forni, Hallin, Lippi, and Reichlin (hereafter FHLR; see footnote 1). In this chapter, we employ the estimation method developed by FHLR. For more details on the estimation of the dynamic factor model, see Babikir and Mwambi. The estimated factors are then used to forecast the variables of interest. The forecasting model is specified and estimated as a linear projection of an h-step-ahead transformed variable $y_{t+h}$ onto the t-dated dynamic factors. The forecasting model follows the setup in [3, 21, 41] and takes the form

$$y_{t+h} = \alpha(L)\hat{f}_t + \phi(L) y_t + \varepsilon_{t+h}, \qquad (3)$$

where $\hat{f}_t$ represents the dynamic factors estimated using the FHLR method, while $\alpha(L)$ and $\phi(L)$ are lag polynomials whose orders are determined by the Schwarz information criterion (SIC), and $\varepsilon_{t+h}$ is an error term. The coefficient matrices for the factors and the autoregressive terms are estimated by ordinary least squares (OLS) for each forecasting horizon h. To estimate and forecast with the AR benchmark, we impose a restriction on Eq. (3) by setting $\alpha(L) = 0$.
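The two-step procedure of this subsection (extract factors, then project the h-step-ahead target on factors and own lags by OLS) can be sketched as follows. This is a minimal illustration under stated assumptions, not the chapter's implementation: the factors are obtained by static principal components as a stand-in for the FHLR estimator, lag orders are fixed by hand rather than chosen by the SIC, and the function names `extract_factors` and `direct_forecast` and the synthetic data are hypothetical.

```python
import numpy as np

def extract_factors(X, r):
    """First r principal-component factors of a T x N panel (static PCA stand-in)."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)          # standardize each series
    U, s, _ = np.linalg.svd(Z, full_matrices=False)   # Z = U diag(s) V'
    return U[:, :r] * s[:r]                           # T x r matrix of factor estimates

def direct_forecast(y, F, h, p=1):
    """OLS projection of y[t+h] on a constant, F[t], and y[t], ..., y[t-p+1].

    Returns the h-step-ahead forecast made at the last available date.
    Passing F with zero columns drops the factors (the AR benchmark case).
    """
    T = len(y)

    def regressors(t):
        lags = [y[t - j] for j in range(p)]
        return np.concatenate(([1.0], F[t], lags))

    rows = range(p - 1, T - h)                        # dates with both lags and a target
    A = np.array([regressors(t) for t in rows])
    b = np.array([y[t + h] for t in rows])
    coef, *_ = np.linalg.lstsq(A, b, rcond=None)
    return float(regressors(T - 1) @ coef)

# Synthetic demonstration: the target depends exactly on two factors dated h periods earlier.
rng = np.random.default_rng(0)
X = rng.normal(size=(120, 30))
F = extract_factors(X, r=2)
h = 3
y = rng.normal(size=120)
y[h:] = 1.0 + 2.0 * F[:-h, 0] - F[:-h, 1]
forecast = direct_forecast(y, F, h=h, p=1)
ar_forecast = direct_forecast(y, np.empty((120, 0)), h=h, p=1)
```

Because the regression is re-estimated for each horizon h, a separate set of coefficients is obtained for the 3-, 6-, and 12-month-ahead forecasts.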
2.1.2. The artificial neural network model
The ANN is one of the most popular and successful biologically inspired forecasting methods, emulating the structure of the human brain; ANNs have thus gradually gained considerable importance in forecasting, among other fields. The ANN model is one of the generalized nonlinear nonparametric models (GNLNPMs). Compared to traditional econometric models, the advantage of ANNs is that they can handle complex, nonlinear relationships without any prior assumptions about the underlying data-generating process (see Figure 1).
The properties of the ANN model make it an attractive alternative to traditional forecasting models. Most importantly, ANN models address the limitations of traditional forecasting methods, including misspecification, biased outliers, and the assumption of linearity. One of the most widely used ANN structures in time series forecasting is the multilayer perceptron (MLP). An MLP is a feedforward architecture with an input layer, one or more hidden layers, and an output layer. The network used in this chapter is a feedforward network with a linear neuron activation function. The input nodes are connected forward to all nodes in the hidden layer, and these hidden nodes are connected to the single node in the output layer, as shown in Figure 1. The inputs play the role of the independent variables in a multiple regression model and are connected to the output node, which is analogous to the dependent variable, through the hidden layer. The network model can be specified as follows:

$$n_{k,t} = \omega_{k,0} + \sum_{i=1}^{i^*} \omega_{k,i}\, x_{i,t}, \qquad (4)$$

$$N_{k,t} = G(n_{k,t}), \qquad (5)$$

$$y_t = \gamma_0 + \sum_{k=1}^{k^*} \gamma_k N_{k,t} + \sum_{i=1}^{i^*} \beta_i\, x_{i,t}, \qquad (6)$$

where the inputs $x_{i,t}$ represent the lagged values of the variable of interest and the output $y_t$ is its forecast. $\omega_{k,0}$ and $\gamma_0$ are the biases, and $\omega_{k,i}$ and $\gamma_k$ denote the weights that link the inputs to the hidden layer and the hidden layer to the output, respectively; together they connect the inputs to the output via the hidden layer. The $i^*$ independent variables are combined linearly to form $k^*$ neurons, which are transformed by the activation function and then combined linearly to produce the prediction, or output. Eqs. (4)–(6) link the inputs to the output through the hidden layer. The function $G$ is the logistic function, meaning that $G(n_{k,t}) = 1/(1 + e^{-n_{k,t}})$. The second summation in Eq. (6) shows that we also have a jump connection, or skip-layer network, that links the inputs directly to the output. The appeal of this ANN structure is that it nests the pure linear model and the nonlinear feedforward neural network. If the relationship between inputs and output is purely linear, the skip-layer coefficient set $\{\beta_i\}$ should be significant; in contrast, if the relationship is nonlinear, we expect the jump-connection coefficients to be insignificant, while the coefficient sets $\{\omega_{k,i}\}$ and $\{\gamma_k\}$ should be highly significant. If the relationship between inputs and output is mixed, we expect all coefficient sets to be significant. To select the best network in this chapter, besides the minimum error, we use the Bayesian information criterion (BIC), which is usually preferred over the other criteria because it penalizes extra parameters more severely; mathematically, the BIC is given by the following:
where is total number of parameters in the network, is the number of effective observations, is the in-sample observation, is the network misfit function, and is the space of all weights and biases in the network. The in-sample sum of squared error (SSE) is usually used to determine the function Eventually, the optimal model is the model with minimum BIC value.
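The skip-layer network described in this subsection (a logistic hidden layer plus direct linear connections from the inputs to the output) can be illustrated with a short forward pass. This is a sketch only: the parameter names `omega0`, `omega`, `gamma0`, `gamma`, and `beta` are labels chosen here for the hidden-layer biases and weights, the output bias and weights, and the jump-connection coefficients, and the numerical values are made up for illustration.

```python
import numpy as np

def logistic(n):
    """Logistic activation used in the hidden layer."""
    return 1.0 / (1.0 + np.exp(-n))

def skip_layer_forward(x, omega0, omega, gamma0, gamma, beta):
    """Forward pass of a single-hidden-layer network with jump connections.

    x      : (n_inputs,)            inputs, e.g. lagged values of the series
    omega0 : (n_hidden,)            hidden-layer biases
    omega  : (n_hidden, n_inputs)   input-to-hidden weights
    gamma0 : float                  output bias
    gamma  : (n_hidden,)            hidden-to-output weights
    beta   : (n_inputs,)            direct input-to-output (skip-layer) weights
    """
    n = omega0 + omega @ x            # linear combination entering each neuron
    N = logistic(n)                   # neuron activations
    return gamma0 + gamma @ N + beta @ x

# Illustrative values only (hypothetical, not estimates from the chapter).
x = np.array([0.2, -0.1, 0.4])
omega0 = np.zeros(2)
omega = np.array([[0.5, -0.3, 0.1],
                  [0.0, 0.2, -0.4]])
beta = np.array([0.3, 0.0, -0.2])

nonlinear_part_off = skip_layer_forward(x, omega0, omega, 0.1, np.zeros(2), beta)
at_zero_input = skip_layer_forward(np.zeros(3), omega0, omega, 0.1, np.array([1.0, 3.0]), beta)
```

Setting the hidden-to-output weights to zero leaves only the linear part, which is the sense in which the structure nests the linear model.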
2.1.3. Factor-augmented artificial neural networks (FAANN)
The FAANN model is a hybrid of the artificial neural network and the factor model that combines the information in the factors with the lagged values of the variable of interest in order to obtain more accurate forecasts. The nonlinear function uses the series, its lags, and the factors to formulate the FAANN model, which is defined as follows:
where is the nonlinear functional form determined via ANN. In the first stage, the factor model is used to extract factors from a large related dataset. In the second stage, a neural network model is used to model the nonlinear and linear relationships existing in factors and original data. Thus, based on the model structure depicted on Figure 2 ,
As previously noted, the (and are the parameters of the model that called the connection weights. As we have stated earlier, and are the numbers of input and hidden nodes, respectively, and is the error term. Figure 2 shows the FAANN model structure used.
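The second stage of the FAANN takes both the lags of the target series and the estimated factors as network inputs. A minimal sketch of how such an input matrix might be assembled is given below; the timing convention (factors dated t-1 alongside lags t-1, ..., t-p) and the function name `faann_inputs` are assumptions made for illustration, since the chapter's exact input specification is not recoverable from the text.

```python
import numpy as np

def faann_inputs(y, F, p):
    """Stack p lags of y and the factors dated t-1 into one input matrix.

    Returns (inputs, targets), where row k of `inputs` corresponds to date
    t = p + k and holds [y[t-1], ..., y[t-p], F[t-1, :]].
    """
    y = np.asarray(y, dtype=float)
    T = len(y)
    lags = np.column_stack([y[p - j: T - j] for j in range(1, p + 1)])
    factors = F[p - 1: T - 1]         # assumption: factors lagged one period
    return np.hstack([lags, factors]), y[p:]

# Tiny illustrative example with made-up numbers.
y = np.arange(10.0)
F = np.arange(20.0).reshape(10, 2)
inputs, targets = faann_inputs(y, F, p=3)
```

The resulting rows can be fed to any feedforward network; the number of input nodes equals the number of lags plus the number of factors.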
2.2. Forecast combining methods
To combine the individual forecasts produced by the DFM and ANN models, we use four combination methods: three linear methods (the mean, VACO, and discounted MSFE-based methods) and one nonlinear method (ANN). Because some of the combining methods need a holdout period to calculate the weights used to combine the individual forecasts, we use the first 24 months of the out-of-sample period as holdout observations. For all combining methods, we form combination forecasts over the post-holdout out-of-sample period. Brief details of these combining methods are given below.
2.2.1. Mean combination method
The mean serves as a convenient benchmark, as it has been shown to achieve better results than more sophisticated methods (see, for instance, [10, 21, 32]). The simple average combination has also been found to outperform single forecasts. The simple average combination method can be expressed as

$$\hat{y}^{c}_{t} = \sum_{i=1}^{m} w_i\, \hat{y}_{i,t}, \qquad w_i = \frac{1}{m},$$

where $\hat{y}^{c}_{t}$ is the combined forecast at time t, $\hat{y}_{i,t}$ is the forecast from the ith individual forecasting model, $w_i$ is the individual forecast weight for model i, and m is the number of individual models. There are different forms of weights, but in general the weights have to satisfy the condition $\sum_{i=1}^{m} w_i = 1$.
2.2.2. Variance-covariance (VACO) combination method
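The method uses the historical performance of the individual forecasts to compute the weights. According to the VACO method, the weights are determined as follows:

$$w_i = \frac{\left[\sum_{j=1}^{T} \left(y_j - \hat{y}_{i,j}\right)^2\right]^{-1}}{\sum_{k=1}^{m}\left[\sum_{j=1}^{T} \left(y_j - \hat{y}_{k,j}\right)^2\right]^{-1}}. \qquad (11)$$

Then, the combined forecast is given by

$$\hat{y}^{c}_{t} = \sum_{i=1}^{m} w_i\, \hat{y}_{i,t},$$

where $y_j$ is the jth actual value, $\hat{y}_{i,j}$ is the jth forecast value from the ith individual forecasting model, m is the number of individual models, and T is the total number of out-of-sample points. The weight in Eq. (11) has the inverse sum of squared deviations for model i as the numerator, and the denominator is the sum of these inverse contributions from all models. This guarantees that $\sum_{i=1}^{m} w_i = 1$.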
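The inverse sum-of-squared-errors weighting described above can be sketched in a few lines. The function name `vaco_weights` and the toy numbers are hypothetical; the forecasts are arranged as one row per model over a holdout window.

```python
import numpy as np

def vaco_weights(actual, forecasts):
    """Weights proportional to the inverse sum of squared forecast errors.

    actual    : (T,)    realized values over the holdout window
    forecasts : (m, T)  one row of forecasts per individual model
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    sse = ((forecasts - actual) ** 2).sum(axis=1)   # one SSE per model
    inv = 1.0 / sse
    return inv / inv.sum()                          # normalized to sum to one

# Toy example: model 1 is always off by 1, model 2 always off by 2.
actual = np.array([1.0, 2.0, 3.0])
forecasts = np.array([actual + 1.0, actual + 2.0])
w = vaco_weights(actual, forecasts)                 # SSEs are 3 and 12
combined = w @ forecasts                            # weighted combination
mean_combined = forecasts.mean(axis=0)              # equal-weight benchmark
```

In this toy case the more accurate model receives four times the weight of the less accurate one, whereas the simple mean treats them equally.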
2.2.3. Discounted mean square forecast error (DMSFE) combination method
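The DMSFE method weights recent forecasts more heavily than distant ones. The weights can be calculated as

$$w_i = \frac{\left[\sum_{j=1}^{T} \delta^{\,T-j+1}\left(y_j - \hat{y}_{i,j}\right)^2\right]^{-1}}{\sum_{k=1}^{m}\left[\sum_{j=1}^{T} \delta^{\,T-j+1}\left(y_j - \hat{y}_{k,j}\right)^2\right]^{-1}},$$

where $\delta$ is the discount factor with $0 < \delta \le 1$, $y_j$ is the jth actual value, $\hat{y}_{i,j}$ is the jth forecast value from the ith individual forecasting model, m is the number of individual models, and T is the total number of out-of-sample points. If $\delta = 1$, the DMSFE and VACO methods coincide, which means that VACO is a special case of DMSFE. As noted above, the weights sum to one.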
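A small sketch may help show how discounting shifts weight toward models that have been accurate recently. The exponent convention used here (the most recent error is discounted once, the oldest T times) is an assumption, as are the function name `dmsfe_weights` and the toy data; other conventions differ only by a factor that cancels in the normalization.

```python
import numpy as np

def dmsfe_weights(actual, forecasts, delta):
    """Inverse discounted-SSE weights; delta = 1 gives undiscounted weights.

    actual    : (T,)    realized values, oldest first
    forecasts : (m, T)  one row of forecasts per individual model
    delta     : float   discount factor in (0, 1]
    """
    actual = np.asarray(actual, dtype=float)
    forecasts = np.asarray(forecasts, dtype=float)
    T = actual.shape[0]
    discounts = delta ** np.arange(T, 0, -1)        # delta^(T-j+1) for j = 1..T
    dsse = (discounts * (forecasts - actual) ** 2).sum(axis=1)
    inv = 1.0 / dsse
    return inv / inv.sum()

# Two models with the same total squared error but different timing.
actual = np.zeros(4)
forecasts = np.array([[2.0, 0.0, 0.0, 0.0],   # model 1 erred early
                      [0.0, 0.0, 0.0, 2.0]])  # model 2 erred recently
w_undiscounted = dmsfe_weights(actual, forecasts, delta=1.0)
w_discounted = dmsfe_weights(actual, forecasts, delta=0.9)
```

With no discounting the two models are weighted equally; with a discount factor below one, the model whose error lies further in the past receives the larger weight.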
2.2.4. Artificial neural network (ANN) combination method
Linear combination methods rest on the assumption that the combined forecast is a linear function of the individual forecasts; such combinations can be inadequate if the individual forecasts are based on nonlinear methods or if the true relationship is nonlinear. For the success of the ANN as a combination method over the linear methods, see, among others, [15, 25]. Here, we use the same setup as in subsection 2.1.2, with the individual forecasts as the network inputs; the combined forecast is given by

$$\hat{y}^{c}_{t} = g\left(\hat{y}_{1,t}, \ldots, \hat{y}_{m,t}\right),$$

where $g$ denotes the network mapping of subsection 2.1.2 and $\hat{y}_{i,t}$ is the forecast from the ith individual forecasting model.
3. Data
For the FAANN and DFM models, the dataset includes 228 monthly time series (see footnote 2), of which 203 are South African series covering the financial, real, and nominal sectors and confidence indices, 2 are global variables, and 23 are series of major trading partners and global financial markets. The AR benchmark model uses only the variable of interest, namely, the deposit rate, gold mining share prices, or the long-term interest rate. Thus, besides the national variables, the chapter uses a set of global variables such as gold and crude oil prices. The data also incorporate series from the financial markets of major trading partners, namely, the United Kingdom, the United States, China, and Japan. The estimation data cover the period January 1992 through December 2006, while the period from January 2007 through December 2011 is used to assess the out-of-sample fit of the estimated models. The augmented Dickey-Fuller (ADF) test is used to determine the degree of integration of all series, and all nonstationary series are differenced. The Schwarz information criterion (SIC) is used to select the appropriate lag length such that no serial correlation is left in the stochastic error term. Finally, all series are standardized to have a mean of zero and a constant variance.
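The differencing and standardization steps described above can be sketched as follows. This is an illustration under assumptions: every column is differenced once (in practice only the series flagged as nonstationary by the ADF test would be), "constant variance" is taken to mean unit variance, and the helper names and synthetic panel are hypothetical.

```python
import numpy as np

def difference(X):
    """First difference of each column of a T x N panel (loses one row)."""
    return np.diff(X, axis=0)

def standardize(X):
    """Rescale each column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

# Synthetic panel of random walks standing in for nonstationary series.
rng = np.random.default_rng(42)
levels = rng.normal(size=(240, 5)).cumsum(axis=0)
panel = standardize(difference(levels))
```

Standardizing before factor extraction keeps series measured in large units from dominating the estimated factors.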
4. Evaluation of forecast accuracy
To evaluate the forecast accuracy of model combination or information combination, we used three datasets from South Africa, namely, deposit rate, gold mining share prices, and long-term interest rate, in order to demonstrate the in-sample and out-of-sample appropriateness and effectiveness of the combination of models or information of the DFM and ANN models.
4.1. In-sample forecast evaluation
In this subsection, we evaluate the in-sample predictive power of the combined model (the FAANN) against the other fitted models: the AR benchmark, the DFM, the ANN, and the best combined forecasts of the DFM and ANN models. To this end, the full sample from January 1992 to December 2011, a total of 240 observations for each of the three series (deposit rate, gold mining share prices, and long-term interest rate), is used to estimate the forecasting models in order to check the robustness of the in-sample results of the competing models and compare them with the AR benchmark model. In-sample forecasting is most useful for investigating the true relationship between the independent variables and the forecast of the dependent variable. Table 1 reports the root-mean-square error (RMSE; see footnote 3) of the in-sample forecasts. The FAANN model outperforms all other models. The maximum reduction in RMSE over the AR benchmark model is around 24%, while the minimum reduction is around 14%, considering all variables. Compared to the DFM, the FAANN model provides a lower RMSE, with a reduction of between 9 and 19% for all variables. Although the same factors are added to the AR and ANN models to produce the DFM and the FAANN models, the in-sample results show significant differences between the estimation methods, favoring the nonlinear method over the linear one. This is potentially due to the flexibility of ANN models and their property as universal approximators, which can be applied to different time series to obtain accurate forecasts. Compared to the standard ANN model, the FAANN model produces an RMSE that is 6–19% lower for all variables. These results indicate the importance of the factors, which summarize 228 related series in five factors and are used as inputs to the ANN to produce the FAANN model.
Regarding the in-sample forecasting performance of the forecasts of combined models or information—the FAANN model—compared to the best forecast combination of the DFM and ANN models, the FAANN model outperforms the best forecast combination with reduction in the RMSE around 0.01–13% for all variables. These results confirm the superiority of the combination of information or models when a precise estimation method is used to estimate the combined information over the combined forecasts of individual models.
Table 1. The RMSE of the in-sample forecasts.
|Variable||FAANN||DFM||ANN||AR||Best combined forecasts of DFM and ANN|
|Share prices for gold mining||1.5922||1.7782||1.7787||1.8187||1.6215|
|Long-term interest rate||0.1253||0.1537||0.1546||0.1640||0.1438|
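The percentage reductions quoted in the text follow from the RMSE values by simple arithmetic: the reduction of the FAANN relative to a reference model is 100 × (1 − RMSE_FAANN / RMSE_reference). The sketch below applies this to the two rows of Table 1 that are available here; the dictionary layout and the helper name `pct_reduction` are illustrative choices.

```python
# In-sample RMSEs copied from the two Table 1 rows available in the text.
table1 = {
    "Share prices for gold mining": {
        "FAANN": 1.5922, "DFM": 1.7782, "ANN": 1.7787, "AR": 1.8187, "Best combined": 1.6215,
    },
    "Long-term interest rate": {
        "FAANN": 0.1253, "DFM": 0.1537, "ANN": 0.1546, "AR": 0.1640, "Best combined": 0.1438,
    },
}

def pct_reduction(model_rmse, reference_rmse):
    """Percentage by which model_rmse falls below reference_rmse."""
    return 100.0 * (1.0 - model_rmse / reference_rmse)

# Reduction of the FAANN RMSE relative to every other column, per variable.
reductions = {
    variable: {ref: pct_reduction(row["FAANN"], row[ref]) for ref in row if ref != "FAANN"}
    for variable, row in table1.items()
}

for variable, row in reductions.items():
    print(variable, {ref: round(value, 1) for ref, value in row.items()})
```

For the long-term interest rate, for instance, the reduction relative to the AR benchmark works out to roughly 24%, and the reduction relative to the best combined forecast to roughly 13%.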
4.2. Out-of-sample forecast evaluation of individual models
In this subsection, we estimate the AR, DFM, and ANN models, the best combined forecasts of the DFM and ANN models, and the FAANN model, which combines the information of the factors and the ANN, for the three variables of interest (deposit rate, gold mining share prices, and long-term interest rate) over the in-sample period January 1992 to December 2006 using monthly data. We then compute out-of-sample 3-, 6-, and 12-month-ahead forecasts for the period January 2007 to December 2011. We use the iterative forecasting technique and compute the RMSE at the three forecasting horizons for the three variables across all models in order to compare the forecast accuracy of the models. The starting date of the in-sample period depends on the data availability of some important financial series. The out-of-sample period includes the financial crisis, which affected economies in general and financial sectors in particular. We therefore use this period as the out-of-sample period to show the suitability and efficiency of the combination of information (the FAANN model) in producing accurate forecasts for data that exhibit inherent nonlinearity or that fluctuated during the financial crisis. The results for each variable can be summarized as follows:
Deposit rate forecasting results: to estimate the FAANN model, MATLAB is first used to estimate the factors. Second, R, using the Broyden, Fletcher, Goldfarb, and Shanno (BFGS) algorithm, is used to find and estimate the optimum network architecture. The network with the lowest in-sample RMSE and Bayesian information criterion (BIC) is selected as the best-fitted network; it is composed of eight inputs, five neurons in the hidden layer, and one output. Table 2 reports the RMSEs of the 3-, 6-, and 12-month-ahead forecasts and their average. The benchmark for all forecast evaluations is the RMSE of the AR model forecasts. At both long and short horizons, the FAANN model outperforms all other models, followed by the DFM at the short horizons and the ANN at the long horizon. The RMSE of the FAANN model decreases as the forecast horizon increases, which agrees with earlier findings that ANNs forecast significantly better at long horizons. The results reveal that the FAANN performs better, with large reductions in RMSE of around 25–46% compared to the AR benchmark model and an average reduction of around 37%.
Gold mining share prices: the same software and algorithm used for the previous variable were used to estimate the FAANN model. The optimum network is composed of eight inputs, seven neurons in the hidden layer, and one output. Table 3 presents the RMSE results of the FAANN, the DFM, the ANN, and the AR benchmark. As expected from the in-sample results, the FAANN model stands out at both short and long horizons, with a sizable reduction in RMSE relative to the AR benchmark model of 10–18%. The average RMSE reduction over the forecast horizons is 12%. On average, the FAANN outperforms the ANN and DFM models, with reductions in RMSE of 6 and 8%, respectively.
Long-term interest rate: for estimation, the same package and algorithm used for the previous variables are applied. Thus, the optimal network in abbreviated form is . Table 4 shows that the FAANN model produces more accurate forecasts than all competing models, both at the individual forecast horizons and on average across horizons. Compared to the AR benchmark, the FAANN provides a reduction in RMSE ranging from 27 to 45%, while the average RMSE reduction is around 38%. The FAANN model stands out, followed by the ANN and the DFM, whose average reductions in RMSE relative to the AR benchmark model are 9 and 5%, respectively. Compared to the ANN and the DFM, the FAANN model reduces the RMSE by around 28 and 32%, respectively.
|Model||3 months||6 months||12 months||Average|
|Model||3 months||6 months||12 months||Average|
|Model||3 months||6 months||12 months||Average|
4.3. Out-of-sample forecast evaluation of the combined forecasts of the DFM and ANN models
Table 5 reports the results of combining the forecasts of the DFM and ANN models. We use the DFM and ANN models in particular in order to merge their advantages: the ANN model is flexible enough to account for potentially complex nonlinear relationships that are not easily captured by traditional linear models, and the DFM can accommodate a large number of variables. As in Table 2, Table 5 shows the ratio of the RMSE of a given combining method to the RMSE of the AR benchmark model. We find that the AR benchmark model performs poorly compared to all combining methods. In general, the nonlinear ANN combining method outperforms all other combining methods for all variables at all forecasting horizons; hence, it offers a more reliable method for generating forecasts of the variables of interest. The nonlinear ANN combining method provides a large reduction in RMSE of around 7–20% relative to the AR model over all forecasting horizons and variables. It also beats the best individual forecasts of the DFM and ANN models for all variables and over all forecasting horizons, with sizable reductions in RMSE of around 1–15% relative to the best individual forecasts. We note in addition that the discounted MSFE method with δ = 0.9 performs nearly as well as the best individual model for all variables and forecasting horizons. The variance-covariance (VACO) combining method is, on average, less accurate than the other combining methods over all forecasting horizons and variables. We also note that the combined forecasts are more accurate at long horizons, which we attribute to the contribution of the nonlinear model to the combination, as nonlinear models produce more accurate forecasts at long horizons.
|Combination method||h = 3||h = 6||h = 12|
|DMSFE, δ = 0.95||0.923||0.903||0.848|
|DMSFE, δ = 0.90||0.905||0.884||0.837|
|Gold mining share prices|
|DMSFE, δ = 0.95||0.945||0.941||0.937|
|DMSFE, δ = 0.90||0.945||0.942||0.936|
|Long-term interest rate|
|DMSFE, δ = 0.95||0.956||0.953||0.954|
|DMSFE, δ = 0.90||0.951||0.952||0.935|
4.4. Comparison of forecasting performance of combination of models or information and combination of forecasts
Here, we compare the forecasting performance of the combination of models, or information (the FAANN model), with the best forecast combinations of the ANN and DFM models. Table 6 presents the RMSE ratios of the FAANN model and the best forecast combination to the AR benchmark model over the out-of-sample period. Compared to the DFM, the results indicate that the FAANN model generates more accurate forecasts for all variables and at all forecast horizons. The improvement of the FAANN model over the DFM is a reduction in RMSE of between 2 and 10% for all variables and horizons. These results thus indicate the superiority of augmenting a nonlinear method with factors (FAANN) over augmenting a linear one (DFM) across the three series and three forecast horizons.
|Forecasting model||h = 3||h = 6||h = 12|
|AR (benchmark model)||0.1862||0.1949||0.2314|
|Combined forecasts of DFM and ANN||0.907||0.882||0.835|
|Gold mining share prices|
|AR (benchmark model)||1.7743||1.7924||1.8187|
|Best combined forecasts of DFM and ANN||0.921||0.929||0.911|
|Long-term interest rate|
|AR (benchmark model)||0.2052||0.2140||0.2308|
|Best combined forecasts of DFM and ANN||0.827||0.815||0.804|
To confirm the RMSE results, the test of equal forecast accuracy of Diebold and Mariano is used to evaluate the forecasts. The test statistic is given by

$$S = \frac{\bar{d}}{\sqrt{\hat{V}(\bar{d})}}, \qquad \bar{d} = \frac{1}{T}\sum_{t=1}^{T}\left(e_{1,t}^{2} - e_{2,t}^{2}\right),$$

where $\bar{d}$ is the mean difference of the squared prediction errors and $\hat{V}(\bar{d})$ is its estimated variance. Here, $e_{1,t}$ denotes the forecast errors from the FAANN model, $e_{2,t}$ denotes the forecast errors from the AR benchmark model or the best combined forecasts of DFM and ANN, and T is the number of out-of-sample forecasts. The statistic S follows a standard normal distribution asymptotically. A significantly negative value of S means that the FAANN model outperforms the other model in out-of-sample forecasting. Table 7 shows the results of the Diebold and Mariano test between the FAANN and the AR benchmark and between the FAANN and the best combined forecasts of DFM and ANN. The test results confirm that the FAANN models provide the lowest RMSEs. In summary, the FAANN models provide significantly better forecasts, at the 5% and 10% levels, than the AR model and the best combined forecasts of the DFM and ANN models.
|3 months||6 months||12 months|
FAANN vs. AR
FAANN vs. best combined forecast from DFM and ANN
|Share prices for gold mining|
FAANN vs. AR
FAANN vs. best combined forecast from DFM and ANN
|Long-term interest rate|
FAANN vs. AR
FAANN vs. best combined forecast from DFM and ANN
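The Diebold and Mariano comparison used above can be sketched in code as follows. This is a minimal version under stated assumptions, not the chapter's implementation: the loss is squared error, the variance of the mean loss differential is estimated with autocovariances up to lag h − 1 (a common choice for h-step-ahead forecasts, assumed here), and the function name and the synthetic errors are hypothetical.

```python
import math
import numpy as np

def diebold_mariano(e1, e2, h=1):
    """Diebold-Mariano statistic for squared-error loss.

    e1, e2 : forecast errors of the two models over the same dates
    h      : forecast horizon; autocovariances up to lag h-1 enter the variance
    Returns (statistic, one-sided p-value for 'model 1 is more accurate').
    """
    d = np.asarray(e1, dtype=float) ** 2 - np.asarray(e2, dtype=float) ** 2
    T = d.shape[0]
    d_bar = d.mean()
    dc = d - d_bar
    lrv = dc @ dc / T                                # lag-0 autocovariance
    for k in range(1, h):
        lrv += 2.0 * (dc[k:] @ dc[:-k]) / T          # add lag-k autocovariances
    if lrv <= 0.0:
        raise ValueError("non-positive variance estimate; try a different lag choice")
    stat = d_bar / math.sqrt(lrv / T)
    p_value = 0.5 * (1.0 + math.erf(stat / math.sqrt(2.0)))   # P(Z <= stat)
    return stat, p_value

# Synthetic errors: model 1 is clearly more accurate than the benchmark.
rng = np.random.default_rng(1)
e_model = 0.5 * rng.normal(size=200)
e_bench = 2.0 * rng.normal(size=200)
stat, p_value = diebold_mariano(e_model, e_bench, h=1)
reverse_stat, _ = diebold_mariano(e_bench, e_model, h=1)
```

A negative statistic with a small one-sided p-value indicates that the first model's squared errors are significantly smaller, matching the sign convention described in the text.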
5. Conclusion
In this chapter, we evaluated the forecasting performance of model combination and forecast combination for the ANN and DFM models. In the model combination, we merged the factors extracted from a large dataset (228 series in our case) with the ANN, which produces the FAANN model. For the forecast combination, we used different linear and nonlinear combination methods to combine the individual forecasts of the DFM and ANN models. Using January 1992 to December 2006 as the in-sample period and January 2007 to December 2011 as the out-of-sample period, we compared the forecast performance of the FAANN with that of the DFM, the ANN, and the AR benchmark model at 3-, 6-, and 12-month-ahead forecast horizons for three variables, namely, the deposit rate, gold mining share prices, and the long-term interest rate. Using both the RMSE and the Diebold and Mariano test as comparison criteria, the study provides evidence that FAANN models best fit the three variables considered over the 3-, 6-, and 12-month-ahead forecast horizons.
Tables 2–4 show the ability of the model combination (the FAANN model) to produce accurate forecasts that outperform the DFM, the ANN, and their best forecast combination. These results are consistent with the in-sample forecast performance in Table 1. The FAANN model outperforms the AR benchmark model with a large reduction in RMSE of around 25–46% considering all variables and forecast horizons. Compared to the DFM and ANN models, the FAANN model produces more accurate forecasts, with decreases in RMSE of around 6–43% and 5–40%, respectively. We attribute the superiority of the FAANN to the flexibility of the ANN in accounting for potentially complex nonlinear relationships that are not easily captured by linear models and to the contribution of the factors to the model. On the other hand, the ANN and the DFM outperform the AR benchmark with reductions in RMSE of around 1–17% and 2–10%, respectively, for all variables and across all forecast horizons. Table 6 compares the forecasting performance of the combined models (the FAANN) with the best forecast combination of the DFM and ANN models. The results indicate that the combined models, or information, produce forecasts that are better than the best combined forecasts of the DFM and ANN models. In other words, the nonlinear model that uses a large dataset of economic and financial variables in addition to the lags of the variable of interest improves forecasting performance over the models that are estimated separately (the DFM and the ANN). We also observe that the FAANN forecast error decreases as the forecast horizon increases.
- For further technical details on this type of factor models, see .
- The data sources are the South Africa Reserve Bank, ABSA Bank, Statistics South Africa, National Association of Automobile Manufacturers of South Africa (NAAMSA), South African Revenue Service (SARS), Quantec, and World Bank.
- The RMSE statistic can be defined as $\sqrt{\frac{1}{N}\sum\left(Y_{t+n} - {}_{t}\hat{Y}_{t+n}\right)^{2}}$, where $Y_{t+n}$ denotes the actual value of a specific variable in period t + n and ${}_{t}\hat{Y}_{t+n}$ is the forecast made in period t for t + n.