## 1. Introduction

Many papers deal with the departures from normality of asset return distributions. It is well known that the distributions of stock returns exhibit negative skewness and excess kurtosis; see, among others, [2, 9, 14, 15]. Excess kurtosis (related to the fourth moment of the distribution) makes extreme observations more likely than in the normal case, which means that the market assigns a higher probability to extreme observations than a normal distribution would. Negative skewness (related to the third moment of the distribution) accentuates the left-hand side of the distribution, which means that the market assigns a higher probability to decreases in asset prices than to increases.

The generalized autoregressive conditional heteroscedasticity (GARCH) models, introduced by Engle [5] and Bollerslev [1], allow for time-varying volatility but not for time-varying skewness or kurtosis. Different GARCH models have been developed in the literature to capture dependencies in higher-order moments, starting with Hansen [7], who proposed a skew-Student distribution to account for both time-varying excess kurtosis and skewness. Significant evidence of time-varying skewness was found in [9]. Others [11, 12] found significant time variation in both skewness and kurtosis, while [3, 15, 16] found little evidence of either. With regard to the frequency of observation, Jondeau and Rockinger [11] found time-varying skewness and kurtosis in daily but not weekly data, while others, including [2, 7, 9], found evidence of time-varying skewness and kurtosis in weekly and even monthly data. For daily data, [4, 12, 18] found evidence of time-varying skewness and kurtosis. This chapter employs the GARCH(1,1) model because it has performed well when compared with a large number of volatility models; for more details, see Hansen and Lunde [8].

This paper contributes to the literature on volatility modeling in two respects. First, we jointly estimate time-varying volatility, skewness, and kurtosis, assuming a Johnson's SU distribution for the error term. The method is applied to two different sets of daily returns: stock indices and exchange rates. Second, a new alternative scheme is introduced to generate the sequence of forecasts.

The rest of the paper is organized as follows. Following this introduction, Section 2 presents the empirical results regarding the estimation of the model. Section 3 compares the models. In Section 4, the new forecasting scheme is presented, while Section 5 gives concluding remarks.

## 2. Empirical results and methodology

### 2.1. Data and preliminary findings

The time series data used for modeling volatility in this paper consist of two sets of financial data. The first set includes daily returns of five stock indices: NASDAQ100 (US), DAX30 (Germany), the iShares MSCI South Africa index (EZA), the Shanghai Stock Exchange composite index (SSE), and the iShares MSCI Canada index (EWC).
The second data set includes daily returns of five exchange rate series: British Pound (USD/GBP), Australian Dollar (USD/AUD), Italian Lira (USD/ITL), South African Rand (USD/ZAR), and Brazilian Real (USD/BRL).
The two data sets include daily closing prices from August 6, 2001, through December 10, 2013, for all stock indices and from July 1, 2005, to September 17, 2013, for all exchange rate series, with a total of 3001 observations for each data set. The estimation process for the two sets of data was run using 2001 observations as in-sample, while the remaining 1000 observations were used for the out-of-sample forecast. Based on the empirical evidence, it is common to assume that the logarithmic return series *r*_{t} = 100 × [ln(*p*_{t}) − ln(*p*_{t−1})] (where *p*_{t} and *p*_{t−1} are the prices on the current day and the previous day, respectively) is weakly stationary. **Table 1** reports the descriptive statistics for all return series. It shows that all series exhibit excess kurtosis (leptokurtosis) and skewness, which indicates a departure from normality. The Jarque-Bera (JB) statistics show that the null hypothesis of normality is strongly rejected for all daily returns of the stock and exchange rate series.

| Assets | N | Mean | S.D. | Skewness | Kurtosis | Jarque-Bera |
|---|---|---|---|---|---|---|
| **Stock indices** | | | | | | |
| NASDAQ100 | 2000 | 0.011 | 1.789 | 0.084 | 7.139 | 1429.85^{*} |
| DAX30 | 2000 | 0.032 | 1.795 | 0.053 | 6.473 | 1929.78^{*} |
| SSE | 2000 | 0.048 | 1.764 | −0.078 | 6.929 | 1292.92^{*} |
| EZA | 2000 | 0.076 | 2.403 | −0.354 | 14.436 | 10968.85^{*} |
| EWC | 2000 | 0.049 | 1.673 | −0.473 | 9.327 | 3420.18^{*} |
| **Exchange rates** | | | | | | |
| USD/GBP | 2000 | 0.007 | 0.485 | 0.658 | 11.419 | 6066.76^{*} |
| USD/AUD | 2000 | −0.013 | 0.702 | 0.481 | 14.254 | 10659.08^{*} |
| USD/ITL | 2000 | −0.004 | 0.467 | −0.197 | 8.185 | 2260.57^{*} |
| USD/ZAR | 2000 | 0.001 | 0.877 | 1.010 | 17.404 | 17672.41^{*} |
| USD/BRL | 2000 | −0.016 | 0.961 | 0.441 | 10.048 | 4215.97^{*} |
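The log-return transformation and the descriptive statistics reported in **Table 1** can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `log_returns` and `describe` are hypothetical, the moments use the population (divide-by-*n*) convention, and the Jarque-Bera statistic is taken in its textbook form JB = *n*/6 · [*S*² + (*K* − 3)²/4], with *K* the raw (not excess) kurtosis.

```python
import math

def log_returns(prices):
    """Percentage log returns r_t = 100 * [ln(p_t) - ln(p_{t-1})]."""
    return [100.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices[:-1], prices[1:])]

def describe(returns):
    """Mean, S.D., skewness, kurtosis and Jarque-Bera statistic.

    Assumption: population moments (divide by n) are used throughout.
    """
    n = len(returns)
    mean = sum(returns) / n
    m2 = sum((r - mean) ** 2 for r in returns) / n  # second central moment
    m3 = sum((r - mean) ** 3 for r in returns) / n  # third central moment
    m4 = sum((r - mean) ** 4 for r in returns) / n  # fourth central moment
    skew = m3 / m2 ** 1.5
    kurt = m4 / m2 ** 2                             # raw kurtosis (normal = 3)
    jb = n / 6.0 * (skew ** 2 + (kurt - 3.0) ** 2 / 4.0)
    return {"n": n, "mean": mean, "sd": math.sqrt(m2),
            "skewness": skew, "kurtosis": kurt, "jarque_bera": jb}
```

Under the null hypothesis of normality, the JB statistic is asymptotically chi-square distributed with two degrees of freedom, so values in the thousands, as in **Table 1**, imply a strong rejection.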

### 2.2. Methodology

Preliminary results in the preceding section provided evidence of a significant deviation from normality and of pronounced leptokurtosis in all daily return series. This suggests specifying GARCH models that capture these characteristics. These models involve two distinct specifications, one for the conditional mean and the other for the conditional variance. For the models employed in this paper, the mean equation for all stock return series is an AR(1) model with a constant, and for all exchange rate return series, it is an MA(1) model without a constant. After estimating the mean equation, the next step was to check whether there is substantial evidence of heteroscedasticity in the daily returns of the stock and exchange rate series. **Table 2** provides the Ljung-Box statistics of order 20 for the squared residuals *ε*_{t}^{2}, where *ε*_{t} is the error term from the mean equation. The results show that the Ljung-Box statistics on the squared residuals provide strong evidence of ARCH effects in the return series.

Note. For Ljung-Box statistics, the *p*-values are reported in parentheses.
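The Ljung-Box statistic applied to the squared residuals can be sketched as below. This is an illustrative implementation of the standard formula *Q* = *n*(*n* + 2) Σ_{k=1}^{m} ρ̂_{k}² / (*n* − *k*), where ρ̂_{k} is the lag-*k* sample autocorrelation; the function name `ljung_box` is hypothetical and the code is not taken from the original study.

```python
def ljung_box(x, lags=20):
    """Ljung-Box Q statistic for the series x using the first `lags` autocorrelations."""
    n = len(x)
    mean = sum(x) / n
    dev = [v - mean for v in x]
    denom = sum(d * d for d in dev)          # n times the sample variance
    q = 0.0
    for k in range(1, lags + 1):
        # lag-k sample autocorrelation
        rho = sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        q += rho * rho / (n - k)
    return n * (n + 2) * q

# To test for ARCH effects, the statistic is applied to squared residuals,
# e.g. ljung_box([e * e for e in residuals], lags=20), where `residuals`
# would be the errors from the fitted mean equation.
```

Under the null hypothesis of no autocorrelation, *Q* is asymptotically chi-square distributed with degrees of freedom equal to the number of lags (here 20), so a large value for the squared residuals points to ARCH effects.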

#### 2.2.1. Distributional assumptions

To complete the basic GARCH specification, an assumption about the conditional distribution of the error term *ε*_{t} is required. The expectation is that the excess kurtosis and skewness displayed by the residuals of conditional heteroscedastic models will be reduced when a more appropriate distribution is used. This study uses Johnson's SU distribution. This distribution has two shape parameters that allow for a wide range of skewness and kurtosis levels of the type anticipated, and it has been used for financial returns data [4, 18]. Johnson's SU distribution was derived by Johnson [10] through the transformation of a normal variable. Letting *z* ~ *N*(0, 1) be a standard normal variable, the random variable *y* defined by the transformation

$$ z = \gamma + \delta \sinh^{-1}\left(\frac{y - \xi}{\lambda}\right) \tag{1} $$

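where *sinh*^{−1} is the inverse hyperbolic sine function, is a Johnson's SU variable. The form of the density of Johnson's SU distribution used for the estimation procedure is that due to Yan [18]:

$$ f(y) = \frac{\delta}{\lambda \sqrt{1 + \left(\frac{y - \xi}{\lambda}\right)^{2}}} \, \varphi\left(\gamma + \delta \sinh^{-1}\left(\frac{y - \xi}{\lambda}\right)\right) \tag{2} $$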

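where *y* ∈ *R*, *φ* is the density function of *N*(0, 1), *ξ* and *λ* > 0 are location and scale parameters, respectively, while *γ* and *δ* > 0 can be interpreted as skewness and kurtosis parameters, respectively. These parameters are not themselves the moments of the distribution. The first four moments of the distribution (the mean, variance, third central moment, and fourth central moment, respectively) according to Yan [18] are as follows:

$$ \mathrm{E}(y) = \xi - \lambda\, \omega^{1/2} \sinh\Omega \tag{3} $$

$$ \mathrm{Var}(y) = \frac{\lambda^{2}}{2} (\omega - 1)\left(\omega \cosh 2\Omega + 1\right) \tag{4} $$

$$ \mu_{3} = -\frac{\lambda^{3}}{4}\, \omega^{1/2} (\omega - 1)^{2} \left[\omega(\omega + 2)\sinh 3\Omega + 3\sinh\Omega\right] \tag{5} $$

$$ \mu_{4} = \frac{\lambda^{4}}{8} (\omega - 1)^{2} \left[\omega^{2}\left(\omega^{4} + 2\omega^{3} + 3\omega^{2} - 3\right)\cosh 4\Omega + 4\omega^{2}(\omega + 2)\cosh 2\Omega + 3(2\omega + 1)\right] \tag{6} $$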

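The quantities Ω and ω in the moment formulas are Ω = *γ*/*δ* and ω = exp(*δ*^{−2}). The skewness and kurtosis are jointly determined by the two shape parameters *γ* and *δ*. Setting *ξ* = 0 and *λ* = 1 gives the standard form of Johnson's SU distribution, but the mean and the variance are then not 0 and 1, respectively. Innovations with zero mean and unit variance are obtained by setting the parameters in the following manner:

$$ \xi = \lambda\, \omega^{1/2} \sinh\Omega \tag{7} $$

$$ \lambda = \left[\tfrac{1}{2} (\omega - 1)\left(\omega \cosh 2\Omega + 1\right)\right]^{-1/2} \tag{8} $$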
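As a numerical check of the standardization, the sketch below computes the mean and variance of a Johnson's SU variable and the location and scale values that give zero mean and unit variance. It assumes the transformation *z* = *γ* + *δ* sinh⁻¹((*y* − *ξ*)/*λ*) and the standard closed-form mean and variance of that parameterization, which are taken here to agree with those of Yan [18]; the function names are hypothetical.

```python
import math

def jsu_mean_var(gamma, delta, xi=0.0, lam=1.0):
    """Mean and variance of Johnson's SU with z = gamma + delta*asinh((y - xi)/lam)."""
    omega = math.exp(delta ** -2)
    Omega = gamma / delta
    mean = xi - lam * math.sqrt(omega) * math.sinh(Omega)
    var = 0.5 * lam ** 2 * (omega - 1.0) * (omega * math.cosh(2.0 * Omega) + 1.0)
    return mean, var

def standardizing_params(gamma, delta):
    """Location xi and scale lam that give zero mean and unit variance."""
    omega = math.exp(delta ** -2)
    Omega = gamma / delta
    lam = 1.0 / math.sqrt(0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * Omega) + 1.0))
    xi = lam * math.sqrt(omega) * math.sinh(Omega)
    return xi, lam

def jsu_pdf(y, gamma, delta, xi, lam):
    """Johnson's SU density evaluated at y."""
    u = (y - xi) / lam
    z = gamma + delta * math.asinh(u)
    normal_pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    return delta / (lam * math.sqrt(1.0 + u * u)) * normal_pdf
```

For example, `standardizing_params(0.5, 1.5)` returns the pair (*ξ*, *λ*) for which `jsu_mean_var(0.5, 1.5, xi, lam)` evaluates to a mean of 0 and a variance of 1 up to floating-point error, while the shape parameters *γ* and *δ* remain free to control skewness and kurtosis.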

#### 2.2.2. Maximum likelihood

In the presence of heteroscedasticity (autoregressive conditional heteroscedasticity (ARCH) effects) in the residuals of the daily returns of the stock and exchange rate series, ordinary least squares (OLS) estimation is not efficient, and the estimated covariance matrix of the parameters is biased, which invalidates the *t*-statistics. Therefore, ARCH-type models are not estimated by simple techniques such as OLS; the method of maximum likelihood is employed instead. For the formal exposition of the approach, the joint likelihood of the realizations, given the conditional variance *h*_{t}, is:

The log likelihood function is:

The parameter values are selected so that the log likelihood function is maximized, using a numerical search algorithm.
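To make the estimation step concrete, the following sketch evaluates the log likelihood of a GARCH(1,1) model with standardized Johnson's SU innovations and constant shape parameters. It is a simplified illustration under stated assumptions, not the authors' implementation: the residuals `eps` from the mean equation are taken as given, the variance recursion is started at the sample variance, and the function names are hypothetical.

```python
import math

def jsu_logpdf_std(z, gamma, delta):
    """Log density of a zero-mean, unit-variance Johnson's SU variable."""
    omega = math.exp(delta ** -2)
    Omega = gamma / delta
    # Scale and location that standardize the distribution
    lam = 1.0 / math.sqrt(0.5 * (omega - 1.0) * (omega * math.cosh(2.0 * Omega) + 1.0))
    xi = lam * math.sqrt(omega) * math.sinh(Omega)
    u = (z - xi) / lam
    w = gamma + delta * math.asinh(u)
    return (math.log(delta) - math.log(lam) - 0.5 * math.log1p(u * u)
            - 0.5 * math.log(2.0 * math.pi) - 0.5 * w * w)

def garch_jsu_loglik(params, eps):
    """Log likelihood of GARCH(1,1) residuals with Johnson's SU innovations.

    params = (b0, b1, b2, gamma, delta); eps = residuals of the mean equation.
    """
    b0, b1, b2, gamma, delta = params
    h = sum(e * e for e in eps) / len(eps)    # assumption: start at sample variance
    loglik = 0.0
    for e in eps:
        z = e / math.sqrt(h)                  # standardized innovation
        # density of eps_t is f(z_t) / sqrt(h_t)
        loglik += jsu_logpdf_std(z, gamma, delta) - 0.5 * math.log(h)
        h = b0 + b1 * e * e + b2 * h          # next-period conditional variance
    return loglik
```

In practice, the negative of this function would be passed to a numerical optimizer, with constraints such as *b*_{0} > 0, *b*_{1}, *b*_{2} ≥ 0, and *δ* > 0 imposed on the parameters.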

#### 2.2.3. Model estimation with time-varying volatility, skewness, and kurtosis

As shown in Section 2.2, when the residuals were examined for heteroscedasticity, the Ljung-Box test provided strong evidence of ARCH effects in the residual series, which suggests proceeding with modeling the return volatility using the GARCH methodology. Two specifications are estimated in this study: the standard GARCH(1,1) model with constant shape parameters, and a version in which dynamics are imposed on both shape parameters to obtain autoregressive conditional density (ARCD) models. The latter allows for time-varying skewness and kurtosis. A Johnson's SU distribution is assumed for the error term in both cases. Before presenting the estimation results obtained with the stock return series and the exchange rate return series, the four nested models to be estimated are summarized as follows:

For stock return series:

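Mean equation

$$ r_{t} = \mu + \phi\, r_{t-1} + \varepsilon_{t} \tag{11} $$

Variance equation (GARCH)

$$ h_{t} = b_{0} + b_{1} \varepsilon_{t-1}^{2} + b_{2} h_{t-1} \tag{12} $$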

Skewness equation

Kurtosis equation

For all stock return series, the study uses the GARCH(1,1) model with a specification for the shape parameters (*γ*_{t}, *δ*_{t}) similar to that of Hansen [7], except that it employs the standardized innovation *z*_{t−1} instead of the nonstandardized *ε*_{t−1}, as in Eqs. (13) and (14).

For exchange rate return series:

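Mean equation

$$ r_{t} = \varepsilon_{t} + \theta\, \varepsilon_{t-1} \tag{15} $$

Variance equation (GARCH)

$$ h_{t} = b_{0} + b_{1} \varepsilon_{t-1}^{2} + b_{2} h_{t-1} \tag{16} $$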

Skewness equation

Kurtosis equation

For the exchange rate return series, a specification for the shape parameters (*γ*_{t}, *δ*_{t}) similar to that of [11] is used, except that it utilizes the standardized innovation *z*_{t−1} instead of the nonstandardized *ε*_{t−1}, as in Eqs. (17) and (18). It also uses the absolute standardized shocks for the shape parameter in Eq. (18), following Ghalanos [6]. Estimation proceeds in two steps. First, the two standard models for the conditional variance are estimated: the AR(1)-GARCH(1,1) model (Eqs. (11) and (12)) for the stock return series and the MA(1)-GARCH(1,1) model (Eqs. (15) and (16)) for the exchange rate return series. Second, the generalizations of the standard GARCH models that allow for time-varying skewness and kurtosis (GARCHSK), given by Eqs. (11)–(14) for the stock return series and Eqs. (15)–(18) for the exchange rate return series, are estimated.

The results for the stock return series are presented in **Tables 3** and **4** for the standard GARCH and GARCHSK models, respectively. As expected, the results indicate a strong and significant presence of conditional variance, since the coefficient of lagged conditional variance (*b*_{2}) is high, positive, and significant. Volatility is found to be persistent, since the coefficient of lagged volatility (*b*_{1}) is positive and significant, indicating that high conditional variance is followed by high conditional variance. The sum of the two estimated coefficients (*b*_{1} + *b*_{2}) is very close to one, implying that large changes in stock returns tend to be followed by large changes, and small changes tend to be followed by small changes. This confirms that volatility clustering is present in the stock return series. For the skewness and kurtosis equations, days with high conditional skewness and kurtosis are followed by days with high conditional skewness and kurtosis for all stock return series, except for DAX30 in the kurtosis case, since the coefficients for lagged skewness (*c*_{3}) and lagged kurtosis (*d*_{3}) are positive and significant. In summary, there is a significant presence of conditional skewness and kurtosis for all stock return series, since at least one of the coefficients associated with the standardized shocks, the squared standardized shocks, or the lagged skewness and kurtosis is found to be significant.

| Parameters | | NASDAQ100 | DAX30 | SSE | EZA | EWC |
|---|---|---|---|---|---|---|
| Mean equation | μ | 0.0536^{*} | 0.0940^{*} | 0.0207 | 0.1535^{*} | 0.0976^{*} |
| | φ | −0.0578^{*} | −0.0813^{*} | 0.0025 | −0.0534^{*} | −0.0461^{*} |
| Variance equation | b_{0} | 0.0082 | 0.0128^{*} | 0.0284^{*} | 0.0596^{*} | 0.0202^{*} |
| | b_{1} | 0.0499^{*} | 0.0646^{*} | 0.0756 | 0.1011^{*} | 0.0619^{*} |
| | b_{2} | 0.9468^{*} | 0.9311^{*} | 0.9225^{*} | 0.8894^{*} | 0.9285^{*} |
| Log-likelihood | | −3589.94 | −3588.5 | −3651.1 | −4178.55 | −3308.61 |
| AIC | | 3.5969 | 3.5955 | 3.6580 | 4.1855 | 4.1445 |
| **ARCH-LM test for heteroscedasticity** | | | | | | |
| Statistic (T × R^{2}) | | 6.596 | 7.775 | 0.5993 | 1.385 | 4.032 |
| Prob. chi-square (5) | | 0.2525 | 0.1691 | 0.9880 | 0.9259 | 0.5447 |

| Parameters | | NASDAQ100 | DAX30 | SSE | EZA | EWC |
|---|---|---|---|---|---|---|
| Mean equation | μ | 0.0155 | 0.0816^{*} | 0.0555 | 0.1312^{*} | 0.0851^{*} |
| | φ | −0.0567^{*} | −0.0947^{*} | −0.0154 | −0.0512^{*} | −0.0540^{*} |
| Variance equation | b_{0} | 0.0104^{*} | 0.0167^{*} | 0.0506^{*} | 0.0620^{*} | 0.0250^{*} |
| | b_{1} | 0.0578^{*} | 0.0717^{*} | 0.1009^{*} | 0.0931^{*} | 0.0762^{*} |
| | b_{2} | 0.9436^{*} | 0.9239^{*} | 0.8997^{*} | 0.8998^{*} | 0.9183^{*} |
| Skewness equation | c_{0} | −0.0038^{*} | 0.0035^{*} | 0.0015^{*} | −0.0261^{*} | −0.0256^{*} |
| | c_{1} | 0.00002 | −0.0083^{*} | −0.0054^{*} | 0.0838^{*} | 0.0163 |
| | c_{2} | 0.00355^{*} | −0.0037^{*} | −0.0017^{*} | 0.0004 | 0.0192^{*} |
| | c_{3} | 0.9939^{*} | 1.0000^{*} | 0.9898^{*} | 0.8661^{*} | 0.9165^{*} |
| Kurtosis equation | d_{0} | 0.0001 | 0.7193^{*} | 0.9625^{*} | 0.2245^{*} | 0.4362 |
| | d_{1} | 0.9869^{*} | 0.3126^{*} | 0.2684^{*} | 0.4848^{*} | 0.5166^{*} |
| | d_{2} | 0.0799 | 0.2929^{*} | 0.0591 | 0.0000 | 0.2638^{*} |
| | d_{3} | 0.8459^{*} | 0.0019 | 0.5469^{*} | 0.8143^{*} | 0.4358^{*} |
| Log-likelihood | | −3559.79 | −3578.15 | −3620.83 | −3294.5 | −3406.96 |
| AIC | | 3.5728 | 3.5911 | 3.6338 | 4.1344 | 3.4200 |
| **ARCH-LM test for heteroscedasticity** | | | | | | |
| Statistic (T × R^{2}) | | 6.942 | 6.604 | 1.678 | 0.7606 | 5.393 |
| Prob. chi-square (5) | | 0.2250 | 0.2518 | 0.8917 | 0.9795 | 0.3698 |

The results for the five exchange rates are presented in **Tables 5** and **6** for the GARCH and GARCHSK models, respectively. As expected, the results are similar to those for the stock return series, i.e., they also indicate a strong and significant presence of conditional variance. Volatility is found to be persistent, and volatility clustering is also observed in the exchange rate return series. A significant presence of conditional skewness and kurtosis is confirmed for all exchange rate return series, since at least one of the coefficients associated with the standardized shocks (either negative or positive) or with the lagged skewness and kurtosis is found to be significant.

| Parameters | | USD/GBP | USD/AUD | USD/ITL | USD/ZAR | USD/BRL |
|---|---|---|---|---|---|---|
| Mean equation | θ | 0.28470^{*} | 0.1886^{*} | 0.2495^{*} | 0.2619^{*} | 0.0945^{*} |
| Variance equation | b_{0} | 0.0009^{*} | 0.0015^{*} | 0.0006 | 0.0165^{*} | 0.0114 |
| | b_{1} | 0.0384^{*} | 0.0485^{*} | 0.0331^{*} | 0.0553^{*} | 0.1041 |
| | b_{2} | 0.9579^{*} | 0.9505^{*} | 0.9658^{*} | 0.9175^{*} | 0.8948^{*} |
| Log-likelihood | | −907.732 | −1528.337 | −922.161 | −2257.187 | −2159.827 |
| AIC | | 0.9137 | 1.5343 | 0.9282 | 2.2632 | 2.1658 |
| **ARCH-LM test for heteroscedasticity** | | | | | | |
| Statistic (T × R^{2}) | | 5.169 | 2.900 | 4.019 | 9.646 | 28.35 |
| Prob. chi-square (5) | | 0.0754^{**} | 0.7155 | 0.1340^{**} | 0.0859 | 0.0016 |

| Parameters | | USD/GBP | USD/AUD | USD/ITL | USD/ZAR | USD/BRL |
|---|---|---|---|---|---|---|
| Mean equation | θ | 0.2978^{*} | 0.2111^{*} | 0.2626^{*} | 0.2590^{*} | 0.0978^{*} |
| Variance equation | b_{0} | 0.0009 | 0.0016 | 0.0006 | 0.0139^{*} | 0.0086^{*} |
| | b_{1} | 0.0502^{*} | 0.0597^{*} | 0.0425^{*} | 0.0760^{*} | 0.2626^{*} |
| | b_{2} | 0.9489^{*} | 0.9449^{*} | 0.9582^{*} | 0.9119^{*} | 0.8348^{*} |
| Skewness equation | c_{0} | −0.0306 | 0.0368^{*} | −0.0189 | 0.0168^{*} | −0.0047 |
| | c_{1} | 0.0237 | 0.0610^{*} | 0.0195 | 0.0589^{*} | −0.0051 |
| | c_{2} | 0.0808^{*} | 0.0036 | 0.0658^{*} | 0.0058 | 0.0150^{*} |
| | c_{3} | 0.0000 | 0.4814 | 0.0000 | 0.9018^{*} | 0.8807^{*} |
| Kurtosis equation | d_{0} | 0.2075 | 0.2939^{*} | 0.2128 | 0.4497 | 0.0405 |
| | d_{1} | 0.4029^{*} | 0.5678^{*} | 0.3459^{*} | 1.0000^{*} | 1.0000^{*} |
| | d_{2} | 0.0050 | 0.0000 | 0.0235 | 0.0000 | 0.0000 |
| | d_{3} | 0.8217^{*} | 0.7851^{*} | 0.8364^{*} | 0.5342^{*} | 0.9077^{*} |
| Log-likelihood | | −895.695 | −1516.323 | −910.919 | −2227.667 | −2135.46 |
| AIC | | 0.9077 | 1.5283 | 0.9229 | 2.2397 | 2.1475 |
| **ARCH-LM test for heteroscedasticity** | | | | | | |
| Statistic (T × R^{2}) | | 4.299 | 2.4075 | 3.308 | 8.659 | 9.116 |
| Prob. chi-square (5) | | 0.1165 | 0.7904 | 0.1912^{**} | 0.1235 | 0.1045 |

Finally, it is worth noting from the bottom of **Tables 3**–**6** that the value of the Akaike information criterion (AIC) decreases when moving from the simpler model (standard GARCH) to the more complicated one (GARCHSK) for all return series. Therefore, for all return series analyzed, the GARCHSK specification appears to be the more appropriate one according to the AIC. Note that the ARCH-LM test statistics show no remaining ARCH effects at the 5% level, with the exception of the standard GARCH model for USD/BRL (*p* = 0.0016 in **Table 5**). This indicates that the variance equations are, in general, well specified and adequate.
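The AIC comparison can be illustrated with a short sketch. The text does not state the AIC formula; the reported values appear consistent with the per-observation form AIC = (−2 log *L* + 2*k*)/*T* with *T* = 2000, where *k* = 7 for the AR(1)-GARCH(1,1) model with two constant shape parameters and *k* = 13 for the corresponding GARCHSK model. Both the formula and the parameter counts are inferences from the tables, and the function name is hypothetical.

```python
def aic_per_observation(loglik, n_params, n_obs):
    """Per-observation AIC: (-2*logL + 2*k) / T (assumed form, see lead-in)."""
    return (-2.0 * loglik + 2.0 * n_params) / n_obs

# NASDAQ100, using the log-likelihoods reported in Tables 3 and 4
aic_garch = aic_per_observation(-3589.94, n_params=7, n_obs=2000)
aic_garchsk = aic_per_observation(-3559.79, n_params=13, n_obs=2000)
print(round(aic_garch, 4), round(aic_garchsk, 4))
```

A lower value indicates a better trade-off between fit and the number of estimated parameters, so the GARCHSK model is preferred when its gain in log likelihood outweighs the penalty for the six additional parameters.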

## 3. Comparison of models

One way to start comparing the models is to compute the likelihood ratio (LR) test. The LR test statistic is used to compare the standard GARCH model (restricted model) with the GARCHSK model (unrestricted model), where a Johnson's SU distribution is assumed for the standardized error *z*_{t} in both specifications. The results are contained in **Table 7**. The value of the LR statistic is quite large for all return series. This means that the GARCHSK model performs better than the standard GARCH model with constant shape parameters.

| Series | LogL (GARCH) | LogL (GARCHSK) | LR |
|---|---|---|---|
| **Stocks** | | | |
| NASDAQ100 | −3589.94 | −3559.79 | 60.3^{*} |
| DAX30 | −3588.5 | −3578.15 | 20.7^{*} |
| SSE | −3651.1 | −3620.83 | 60.54^{*} |
| EZA | −3308.61 | −3294.5 | 28.22^{*} |
| EWC | −3415.2 | −3406.96 | 16.48^{*} |
| **Exchange rates** | | | |
| USD/GBP | −907.732 | −895.695 | 24.07^{*} |
| USD/AUD | −1528.337 | −1516.323 | 24.03^{*} |
| USD/ITL | −922.161 | −910.919 | 22.48^{*} |
| USD/ZAR | −2257.187 | −2227.667 | 59.04^{*} |
| USD/BRL | −2159.827 | −2135.46 | 48.73^{*} |
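The LR statistics in **Table 7** follow from LR = 2 (log *L*_{U} − log *L*_{R}), where *U* and *R* denote the unrestricted (GARCHSK) and restricted (GARCH) models. The sketch below computes the statistic and a chi-square *p*-value. The choice of six degrees of freedom is an assumption: it counts the six coefficients that the time-varying skewness and kurtosis equations add to the two constant shape parameters, and the text does not state it. The function names are hypothetical.

```python
import math

def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic 2 * (logL_U - logL_R)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)

def chi2_sf_even(x, df):
    """Chi-square survival function P(X > x) for an even number of degrees of freedom.

    Uses the closed form exp(-x/2) * sum_{k < df/2} (x/2)^k / k!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total

# NASDAQ100 row of Table 7
lr = lr_statistic(-3589.94, -3559.79)
p_value = chi2_sf_even(lr, df=6)   # df = 6 is an assumption, see lead-in
```

A *p*-value close to zero rejects the restriction of constant shape parameters in favor of the GARCHSK specification.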

## 4. A new forecast scheme

In the literature, three alternative ways of generating the sequence of forecasts are suggested, namely the recursive, rolling, and fixed schemes; see [13]. In this paper, the estimation sample of the models for all return series is based on *R* = 2000 observations, while the last *P* = 1000 observations are used for the out-of-sample forecast. Only one-step-ahead forecasts are considered, with each of the three alternative schemes used to generate a sequence of *P* one-step-ahead forecasts. For the estimation sample size *R*, the study considers five different values of *P* for the three alternative schemes, namely *P* = 200, 400, 600, 800, and 1000.
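The difference between the three schemes lies in which observations are used to estimate the parameters behind each one-step-ahead forecast. The sketch below encodes the usual definitions: an expanding window for the recursive scheme, a moving window of fixed length *R* for the rolling scheme, and a single estimation on the first *R* observations for the fixed scheme. These definitions are assumed to match those of [13], and the function name is hypothetical.

```python
def estimation_window(scheme, R, i):
    """Half-open index range [start, stop) of the observations used to estimate
    the model for the i-th one-step-ahead forecast (i = 0, ..., P - 1).

    The forecast itself targets the observation with index R + i.
    """
    if scheme == "recursive":   # expanding window: all data up to the forecast origin
        return (0, R + i)
    if scheme == "rolling":     # moving window of constant length R
        return (i, R + i)
    if scheme == "fixed":       # parameters estimated once on the first R observations
        return (0, R)
    raise ValueError(f"unknown scheme: {scheme}")

# Windows behind the sixth forecast (i = 5) with R = 2000
for scheme in ("recursive", "rolling", "fixed"):
    print(scheme, estimation_window(scheme, 2000, 5))
```

In the fixed scheme the parameters are not re-estimated, but each forecast is still conditioned on the data available at its forecast origin.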

In this section, a new alternative scheme is introduced that generates the sequence of forecasts by computing a weighted average of the forecasts from the three alternative methods above. The weights used are the reciprocals of the mean square forecasting errors (MSE) of the methods. The rationale is that a method with a large MSE (i.e., less reliability) should be given a smaller weight. The suggested name for the new method is “weighted average scheme.” The four alternative forecasting schemes are applied using the GARCHSK models estimated in Section 2.2.3 for the stock and exchange rate return series, and the results are shown in **Table 8**.
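A minimal sketch of the weighted average scheme is given below. It assumes that the reciprocal-MSE weights are normalized to sum to one, which the text does not state explicitly, and the function names are hypothetical.

```python
def inverse_mse_weights(mses):
    """Weights proportional to 1/MSE, normalized to sum to one (assumption)."""
    inverse = [1.0 / m for m in mses]
    total = sum(inverse)
    return [v / total for v in inverse]

def weighted_average_forecast(forecasts, mses):
    """Combine the recursive, rolling and fixed forecasts for one period."""
    weights = inverse_mse_weights(mses)
    return sum(w * f for w, f in zip(weights, forecasts))

# Hypothetical numbers for illustration only (not taken from Table 8)
combined = weighted_average_forecast(forecasts=[1.0, 2.0, 3.0], mses=[1.0, 2.0, 4.0])
```

When the three MSE values are nearly equal, the weights are close to 1/3 each, and the combined forecast is close to the simple average of the three forecasts.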

**Table 8** presents the averages of the mean square forecasting errors over all out-of-sample sizes (*P* = 200, 400, 600, 800, 1000) for the recursive, rolling, fixed, and weighted average schemes for all daily returns of the stock and exchange rate series. The results show that the average mean square forecasting errors of the four forecasting methods differ only in the second or third decimal place for all return series. Although the weighted method is clearly superior to the recursive and fixed methods, it fails to beat the rolling method, which outperforms the other three methods on these data. The modest performance of the weighted method relative to the rolling method is possibly due to the small differences in the mean square errors of the unweighted methods. We expect it to perform better in cases where the three methods differ markedly with respect to their mean square errors.

## 5. Conclusions

This chapter proposes a GARCH-type model that allows for time-varying volatility, skewness, and kurtosis, assuming a Johnson's SU distribution for the error term. The models are estimated using daily returns of five stock indices and five exchange rate series. The results indicate a significant presence of conditional volatility, skewness, and kurtosis. Moreover, specifications allowing for time-varying skewness and kurtosis outperform specifications with constant third and fourth moments. In addition, a weighted average forecasting scheme is introduced that generates the sequence of forecasts by computing a weighted average of the three alternative methods, namely the recursive, rolling, and fixed schemes. The results show that the weighted average scheme is not clearly superior to the other three methods. Further work will consider linear and nonlinear combining methods and different forecasting horizons to forecast stock and exchange rate return series.