 Open access peer-reviewed chapter

# Modeling Nonlinear Vector Time Series Data

Written By

Jiancheng Jiang and Sha Yu

Submitted: 16 April 2017 Reviewed: 05 September 2017 Published: 20 December 2017

DOI: 10.5772/intechopen.70825

From the Edited Volume

## Time Series Analysis and Applications

Edited by Nawaz Mohamudally


## Abstract

In this chapter, we review nonlinear models for vector time series data and develop new nonparametric estimation and inference for them. Vector time series data exist widely in practice. In financial markets, multiple time series are usually correlated. When analyzing several interdependent time series, in general one should consider them as a single vector time series fitted by multivariate models, which provides a useful tool for modeling interdependencies among multiple time series and for simultaneously analyzing feedback and Granger causality effects. Since nonlinear features are widely observed in time series, we consider nonlinear methodology for modeling nonlinear vector time series data, which allows flexibility in the model structure and avoids the curse of dimensionality.

### Keywords

• cointegration
• VAR
• multivariate threshold autoregressive model
• nonparametric smoothing
• generalized likelihood ratio

## 1. Introduction

Multiple time series are of considerable interest in an array of domains, such as finance, economics, engineering and so on. The data are collected in time order and consist of several related variables of interest, for instance, the data of stock price indexes and the status data of important instruments such as shuttles. It is of much practical significance to model this kind of data well. Moreover, a lot of commonly seen multiple time series are correlated, which makes it reasonable to regard them as a single vector and to fit them using multivariate models. Multivariate models perform well in exploring the interdependencies among multiple time series and capturing the dynamic structure.

Plenty of contributions have been made in the field of parametric models for multivariate time series. For instance, Sims proposed vector autoregressive (VAR) models in 1980, Engle and Kroner considered multivariate generalized autoregressive conditional heteroscedastic (GARCH) models in 1995, and Tsay developed multivariate threshold models in 1998. Compared to parametric models, nonparametric models require fewer assumptions about the model structure and are more flexible. Since nonlinearity is also widespread in time series, nonparametric models are a natural choice for multiple time series. However, little progress has been made in this direction, partly because of the complexity of nonparametric smoothing and the curse of dimensionality. To address these difficulties, Jiang proposed the multivariate functional-coefficient model in 2014, which provides a useful tool for modeling vector time series data.

In this chapter, we first review some vector time series models, next extend them to include an error-correction term by incorporating cointegration among integrated variables, then develop a single index model for choosing the smoothing variable and a variable selection method for the multivariate functional-coefficient models, and finally study multivariate time-varying coefficient models and related hypothesis testing problems.

The remainder of this chapter is organized as follows. In Section 2 we review vector autoregressive (VAR) models. In Section 3 we consider multivariate functional-coefficient regression models and their extensions, where a model selection rule is also proposed. In Section 4 we introduce multivariate time-varying coefficient models and propose a generalized likelihood ratio test. In Section 5 we conclude and discuss some research topics that remain open.

## 2. Review of VAR models

The vector autoregressive model is a generalization of the univariate autoregressive model for forecasting a vector of time series. This model was pioneered by Sims, and it has acquired a prominent role in analyzing macroeconomic time series. Prior to 1980, large-scale statistical dynamic simultaneous equations models (DSEMs), which often contained dozens or even hundreds of equations, were widely used in empirical macroeconomics. As the economic environment grew more complicated, the traditional simultaneous models grew with it. Sims believed that since these models do not dichotomize variables into “endogenous” and “exogenous,” the exclusion restrictions used to identify the simultaneous equations models make little sense. Thus, he advocated the vector autoregressive (VAR) model to describe the interrelationships among a set of macroeconomic variables. In a VAR model, each variable is a linear function of its own past lags and the past lags of the other variables. Sims demonstrated that VARs provide a flexible and tractable framework for analyzing economic time series. While relying little on economic theory, VAR models have proven efficient in capturing the dynamics of multivariate systems as well as in forecasting. Specifically, a vector autoregressive model of order $p$ [VAR($p$)] has the following general form:

$$
y_t = c + A_1 y_{t-1} + \cdots + A_p y_{t-p} + e_t, \tag{1}
$$

where $y_t = (y_{1t}, \dots, y_{Kt})^{\top}$ is a set of $K$ time series variables, $c$ is a $K \times 1$ vector of constants, the $A_i$ are $K \times K$ coefficient matrices, and the $e_t$ are error terms. Usually, the $e_t$ are assumed to be zero-mean independent white noise with a time-invariant, positive-definite covariance matrix $\Sigma$. For example, a VAR(1) model with two time series components can be written as:

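$$
\begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix}
=
\begin{pmatrix} c_1 \\ c_2 \end{pmatrix}
+
\begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}
\begin{pmatrix} y_{1,t-1} \\ y_{2,t-1} \end{pmatrix}
+
\begin{pmatrix} e_{1,t} \\ e_{2,t} \end{pmatrix}
$$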

or the equation set

$$
\begin{aligned}
y_{1t} &= c_1 + A_{11} y_{1,t-1} + A_{12} y_{2,t-1} + e_{1,t}, \\
y_{2t} &= c_2 + A_{21} y_{1,t-1} + A_{22} y_{2,t-1} + e_{2,t}.
\end{aligned}
$$

Using lag-operator L, Eq. (1) can be written as the following form:

$$
y_t = c + \left( A_1 L + A_2 L^2 + \cdots + A_p L^p \right) y_t + e_t. \tag{2}
$$

Let $A(z) = I - A_1 z - A_2 z^2 - \cdots - A_p z^p$, where $z$ is a complex number. Then the VAR process is stable if

$$
\det A(z) \neq 0 \quad \text{for } |z| \le 1. \tag{3}
$$

In other words, the determinant of the matrix polynomial has no roots in and on the complex unit circle. If the stability conditions are satisfied and the process can be extended to the infinite past, then the VAR process is stationary.
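As an illustration, condition (3) can be checked numerically through the standard equivalence with the companion matrix: the VAR(p) process is stable exactly when all eigenvalues of its companion matrix lie strictly inside the unit circle. The sketch below assumes NumPy; the function names `companion_matrix` and `is_stable` and the example coefficients are hypothetical, chosen only for illustration.

```python
import numpy as np

def companion_matrix(coef_mats):
    """Stack VAR(p) coefficient matrices A_1..A_p into the (Kp x Kp) companion matrix."""
    p = len(coef_mats)
    k = coef_mats[0].shape[0]
    top = np.hstack(coef_mats)                      # [A_1, ..., A_p]
    if p == 1:
        return top
    # Identity block shifts the lagged values down by one position.
    bottom = np.hstack([np.eye(k * (p - 1)), np.zeros((k * (p - 1), k))])
    return np.vstack([top, bottom])

def is_stable(coef_mats):
    """True if every companion eigenvalue has modulus < 1 (equivalent to Eq. (3))."""
    eigvals = np.linalg.eigvals(companion_matrix(coef_mats))
    return bool(np.max(np.abs(eigvals)) < 1.0)

# Illustrative coefficients (not taken from the chapter).
A1 = np.array([[0.5, 0.1], [0.2, 0.3]])
stable_var1 = is_stable([A1])
unstable_var1 = is_stable([np.array([[1.1, 0.0], [0.0, 0.5]])])
stable_var2 = is_stable([0.5 * np.eye(2), 0.3 * np.eye(2)])
```

Working with eigenvalues avoids root-finding on the determinant polynomial and extends directly to any lag order.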

For model (1), since the right-hand side consists only of predetermined variables and the error terms are assumed to be independent white noise with time-invariant covariance, each equation can be estimated by ordinary least squares (OLS). Zellner proved that, in this setting, the OLS estimator coincides with the generalized least squares (GLS) estimator.
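A minimal sketch of this equation-by-equation OLS fit is given below, assuming NumPy. The helper names `simulate_var1` and `fit_var_ols`, the standard normal errors, and the true coefficients are illustrative assumptions rather than part of the chapter.

```python
import numpy as np

def simulate_var1(c, A, n, rng, burn=200):
    """Simulate y_t = c + A y_{t-1} + e_t with standard normal errors (assumed)."""
    k = len(c)
    y = np.zeros((n + burn, k))
    for t in range(1, n + burn):
        y[t] = c + A @ y[t - 1] + rng.standard_normal(k)
    return y[burn:]                                  # drop the burn-in period

def fit_var_ols(y, p):
    """OLS fit of a VAR(p); lstsq solves all K equations column by column."""
    n, k = y.shape
    lags = [y[p - i:n - i] for i in range(1, p + 1)]  # y_{t-1}, ..., y_{t-p}
    X = np.hstack([np.ones((n - p, 1))] + lags)       # rows: (1, y_{t-1}', ..., y_{t-p}')
    B, *_ = np.linalg.lstsq(X, y[p:], rcond=None)     # shape (1 + kp, k)
    c_hat = B[0]
    A_hat = [B[1 + i * k:1 + (i + 1) * k].T for i in range(p)]
    return c_hat, A_hat

rng = np.random.default_rng(0)
c_true = np.array([0.1, -0.2])
A_true = np.array([[0.5, 0.1], [0.2, 0.3]])
y = simulate_var1(c_true, A_true, n=5000, rng=rng)
c_hat, A_hat = fit_var_ols(y, p=1)
```

Because every equation shares the same regressors, solving the least squares problem jointly or one equation at a time gives the same coefficients.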

The celebrated model (1) is easy to fit, and its autoregressive structure allows one to study feedback effects and Granger causality. However, model (1) uses only the lagged values of $y_t$ for forecasting and ignores the effects of other potentially important variables. In addition, the coefficients remain constant over time, which may be at odds with real situations in which the dynamic relationship among different time series evolves with time.

## 3. Multivariate functional-coefficient regression models and extensions

We briefly reviewed VAR models in the previous section. This parametric method has been significantly developed and widely applied to econometric dynamics as well as other domains. An alternative for modeling vector time series is the nonparametric method, which requires far fewer assumptions on the model structure and may shed light on later parametric fitting. To illustrate the basic idea of this approach, let us begin with the multivariate threshold autoregressive model.

### 3.1. Multivariate threshold autoregressive model

The multivariate threshold autoregressive model is a generalization of the univariate threshold autoregressive model. The idea is to partition the range of a one-dimensional threshold variable into $s$ regimes and impose an AR model with exogenous variables in each regime. Consider a $k$-dimensional time series $y_t = (y_{1t}, \dots, y_{kt})^{\top}$ and a $v$-dimensional exogenous variable $x_t = (x_{1t}, \dots, x_{vt})^{\top}$, for $t = 1, \dots, n$. Let $-\infty = r_0 < r_1 < \cdots < r_s = \infty$. The multivariate threshold model with threshold variable $z_t$ and delay $d$ has the following form:

$$
y_t = c_j + \sum_{i=1}^{p} \varphi_i^{(j)} y_{t-i} + \sum_{i=1}^{q} \beta_i^{(j)} x_{t-i} + \varepsilon_t^{(j)} \quad \text{if } r_{j-1} < z_{t-d} \le r_j, \quad j = 1, \dots, s, \tag{4}
$$

where $p$ and $q$ are nonnegative integers and $\varepsilon_t^{(j)} = \Sigma_j^{1/2} a_t$, with $\Sigma_j^{1/2}$ being a positive-definite matrix and $a_t$ a sequence of serially uncorrelated random vectors with mean zero and covariance matrix $I_k$. The threshold variable $z_t$ is assumed to be stationary and to have a continuous distribution.

Model (4) is piecewise linear in the threshold space of $z_{t-d}$, but it is nonlinear when $s > 1$. This model has proven to be useful in practice. Nevertheless, one assumption embedded in it limits its practical use: the coefficients are assumed to be constant within each regime of the threshold space of $z_{t-d}$. This assumption is questionable, since economic conditions tend to change slowly over time and the coefficient functions may vary smoothly. Motivated by this, Jiang proposed the multivariate functional-coefficient model, in which the coefficients are functions of the threshold variable $z_{t-d}$ instead of constants.
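To make the regime structure of model (4) concrete, the following sketch fits a two-regime threshold VAR(1) by running OLS separately within each regime. It assumes NumPy, known thresholds, no exogenous variables, and an independently simulated threshold variable; the function name `fit_threshold_var1` and all numerical values are hypothetical. Estimating the thresholds themselves is not covered here.

```python
import numpy as np

def fit_threshold_var1(y, z, d, thresholds):
    """Regime-wise OLS for a threshold VAR(1) with known thresholds r_1 < ... < r_{s-1}."""
    n, k = y.shape
    start = max(1, d)
    edges = np.concatenate(([-np.inf], np.asarray(thresholds, dtype=float), [np.inf]))
    t_idx = np.arange(start, n)
    # Regime j (0-based) holds the times with edges[j] < z_{t-d} <= edges[j+1].
    regime = np.searchsorted(edges, z[t_idx - d], side="left") - 1
    fits = []
    for j in range(len(edges) - 1):
        rows = t_idx[regime == j]
        X = np.hstack([np.ones((len(rows), 1)), y[rows - 1]])
        B, *_ = np.linalg.lstsq(X, y[rows], rcond=None)
        fits.append((B[0], B[1:].T))                 # (intercept c_j, AR matrix)
    return fits

# Simulated example with threshold 0 and delay d = 1 (illustrative values).
rng = np.random.default_rng(1)
n = 4000
z = rng.standard_normal(n)
A_low = np.array([[0.5, 0.0], [0.1, 0.3]])
A_high = np.array([[-0.4, 0.2], [0.0, 0.6]])
y = np.zeros((n, 2))
for t in range(1, n):
    A = A_low if z[t - 1] <= 0.0 else A_high
    y[t] = A @ y[t - 1] + 0.5 * rng.standard_normal(2)

fits = fit_threshold_var1(y, z, d=1, thresholds=[0.0])
```

Within each regime the fit is an ordinary VAR regression, which is why the model is piecewise linear yet nonlinear overall.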

### 3.2. Multivariate functional-coefficient models

The multivariate functional-coefficient model has the following form:

$$
y_t = c(z_{t-d}) + \sum_{i=1}^{p} \varphi_i(z_{t-d})\, y_{t-i} + \sum_{i=1}^{q} \beta_i(z_{t-d})\, x_{t-i} + \varepsilon_t, \tag{5}
$$

where $c(\cdot)$ is a $k \times 1$ functional vector, $\varphi_i(\cdot)$ are $k \times k$ functional matrices, and $\beta_i(\cdot)$ are $k \times v$ functional matrices. The innovation satisfies $\varepsilon_t = \sigma_t a_t$, where $\sigma_t$ is a positive-definite matrix and $a_t$ is as in Eq. (4). Assume that $\sigma_t$ is measurable with respect to the $\sigma$-field generated by the historical information $\mathcal{F}_{t-1} = \{(w_j, z_{j-d}) : j \le t\}$, where $w_j = (x_{j-1}, \dots, x_{j-q}, y_{j-1}, \dots, y_{j-p})$. For model (5), we are interested in estimating the regression part. Once it is estimated, one may consider making simultaneous inference about parameters and using the residuals to study the structure of the volatility matrix. This model is a generalization of vector autoregressive models, threshold models and functional-coefficient models. Even for one-dimensional settings with $k = 1$, model (5) includes important predictive regression models in econometrics, such as the linear predictive models with nonstationary predictors and functional-coefficient models for nonstationary time series data. Model (5) can also be used to investigate Granger causality and the feedback effect in engineering and finance [18, 19].

For model (5), a weighted local least squares estimation method is available. Let $X_t = \mathrm{vec}(1, y_{t-1}, \dots, y_{t-p}, x_{t-1}, \dots, x_{t-q})$ and $\Phi(z) = (c(z), \varphi_1(z), \dots, \varphi_p(z), \beta_1(z), \dots, \beta_q(z))$. Then model (5) becomes

$$
y_t = \Phi(z_{t-d})\, X_t + \varepsilon_t, \tag{6}
$$

where $\Phi(\cdot)$ is a $k \times m$ matrix-valued function and $X_t$ is an $m \times 1$ vector with $m = 1 + pk + qv$. For any $z_{t-d}$ in the neighborhood of $z$, by the Taylor expansion, we have

$$
\Phi(z_{t-d}) \approx \Phi(z) + \Phi'(z)(z_{t-d} - z) \equiv A + B\,(z_{t-d} - z).
$$

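Let $S$ and $V$ be $2 \times 2$ matrices whose $(i, j)$th elements are $\mu_{i+j-2} = \int u^{i+j-2} K(u)\,du$ and $\nu_{i+j-2} = \int u^{i+j-2} K^2(u)\,du$, respectively, and let $s = (\mu_2, \mu_3)$. Given any invertible working variance matrix $\sigma_t^2$ of $\sigma_t^{*2}$, the estimator $(\tilde{A}, \tilde{B})$ is obtained by minimizing

$$
\sum_{t=s+1}^{n} \left\| \sigma_t^{-1} \left\{ y_t - A X_t - B X_t (z_{t-d} - z) \right\} \right\|^2 K_{h_n}(z_{t-d} - z),
$$

where $\| \cdot \|$ denotes the Euclidean norm, $s = \max(p, d, q)$, and $K_{h_n}(x) = h_n^{-1} K(x/h_n)$ for a kernel function $K(\cdot)$ with bandwidth $h_n$ controlling the amount of smoothing. Let $K_{h_n}^{(i)}(z_{t-d} - z) = h_n^{-i} (z_{t-d} - z)^{i} K_{h_n}(z_{t-d} - z)$ and $\tilde{S}_{ni} = \sum_{t=s+1}^{n} (X_t X_t^{\top}) \otimes \sigma_t^{-2}\, K_{h_n}^{(i)}(z_{t-d} - z)$ for $i = 0, 1, 2$. Then the weighted estimators $(\tilde{A}, \tilde{B})$ admit the closed form:

$$
\begin{pmatrix} \mathrm{vec}(\tilde{A}) \\ \mathrm{vec}(h_n \tilde{B}) \end{pmatrix}
=
\begin{pmatrix} \tilde{S}_{n0} & \tilde{S}_{n1} \\ \tilde{S}_{n1} & \tilde{S}_{n2} \end{pmatrix}^{-1}
\begin{pmatrix}
\sum_{t=s+1}^{n} X_t \otimes (\sigma_t^{-2} y_t)\, K_{h_n}(z_{t-d} - z) \\
\sum_{t=s+1}^{n} X_t \otimes (\sigma_t^{-2} y_t)\, K_{h_n}^{(1)}(z_{t-d} - z)
\end{pmatrix}.
$$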

Under certain conditions, the weighted estimators are asymptotically normal.
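The local linear estimator for model (6) can be sketched as a kernel-weighted least squares fit. The example below assumes NumPy, an identity working variance matrix (so the weights reduce to kernel weights only), the Epanechnikov kernel, and an exogenous uniformly distributed smoothing variable; `local_linear_fc`, `phi_true`, and the simulated coefficients are hypothetical names and values used for illustration.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def local_linear_fc(Y, X, z, z0, h):
    """Local linear fit of y_t = Phi(z) X_t at z = z0 (working variance = identity).

    Returns (A, B), each k x m: A estimates Phi(z0) and B estimates Phi'(z0).
    """
    w = epanechnikov((z - z0) / h) / h               # K_h(z - z0)
    sw = np.sqrt(w)[:, None]
    D = np.hstack([X, X * (z - z0)[:, None]])        # regressors for A and B
    coef, *_ = np.linalg.lstsq(D * sw, Y * sw, rcond=None)
    m = X.shape[1]
    return coef[:m].T, coef[m:].T

def phi_true(z):
    """Illustrative coefficient function: columns are (c(z), phi_1(z))."""
    return np.array([[np.sin(z), 0.5 * np.cos(z), 0.1],
                     [0.0, 0.2, 0.4 * z - 0.3]])

rng = np.random.default_rng(2)
n = 4000
z = rng.uniform(-1.0, 1.0, size=n)
y = np.zeros((n, 2))
for t in range(1, n):
    X_t = np.concatenate(([1.0], y[t - 1]))          # X_t = (1, y_{t-1})
    y[t] = phi_true(z[t - 1]) @ X_t + 0.5 * rng.standard_normal(2)

Y = y[1:]
X = np.hstack([np.ones((n - 1, 1)), y[:-1]])
z_lag = z[:-1]                                       # z_{t-d} with d = 1
A_hat, B_hat = local_linear_fc(Y, X, z_lag, z0=0.0, h=0.3)
```

Evaluating `local_linear_fc` over a grid of `z0` values traces out the estimated coefficient functions; a non-identity working variance would enter by pre-multiplying both sides by the inverse of the working volatility matrix.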

Recall that, in model (5), $\sigma_t$ is a positive-definite matrix measurable with respect to the sigma-algebra generated by historical information. If $\sigma_t$ has a parametric structure, for example generalized autoregressive conditional heteroscedastic (GARCH) errors, then exploiting it helps to improve the efficiency of the weighted estimation. Our intuition is that, if a parametric structure of $\sigma_t$ is correctly specified, then the weighted estimation mimics the oracle estimation in which $\sigma_t$ is known. This intuition can be verified theoretically, since $\sigma_t$ can then be estimated at the rate $\sqrt{n}$, which is faster than the rate attainable for the regression function in model (5).

### 3.3. Extension of multivariate functional-coefficient models

Because many economic factors are not stationary, classic regression analysis, which requires stationarity, is severely limited. Cointegration analysis has become a formidable toolkit for analyzing nonstationary economic time series. The concept of cointegration goes back to Granger and initiated a veritable research boom. Engle and Granger proposed the well-known Engle-Granger test to examine whether there is a cointegrating relationship among a set of first-order integrated variables.

Motivated by Granger and by Engle and Granger, Jiang proposed an error-correction version of model (5) by incorporating the cointegrating relationship of first-order integrated variables. This allows us to cope with the nonstationarity of vector time series and to improve the accuracy of forecasting.

Let $s_t$ denote a $k \times 1$ vector of first-order integrated variables and let $y_t = s_t - s_{t-1}$. Assume that there is a cointegrating relationship for $s_t$; that is, there exists a unique $k \times r$ ($0 < r < k$) deterministic matrix $\theta$ of rank $r$ and a stationary process $u_t$ such that $\theta^{\top} s_t = u_t$. Then an error-correction form of model (5) is

$$
y_t = c(z_{t-d}) + \gamma(z_{t-d})\, u_{t-1} + \sum_{i=1}^{p} \varphi_i(z_{t-d})\, y_{t-i} + \sum_{i=1}^{q} \beta_i(z_{t-d})\, x_{t-i} + \varepsilon_t, \tag{7}
$$

where $\gamma(z_{t-d})$ is a $k \times r$ coefficient matrix. This model reduces to the error-correction form of the Granger representation theorem if the coefficient functions are constant and there are no exogenous variables.

Due to the widespread presence of cointegrated variables in finance and economics, model (7) should improve the practicability of model (5). However, model (7) requires specification of the variable $z_t$. This can be relaxed by using the idea of single index models. Recall that model (5) can be represented in the succinct form (6). Similar operations can be applied to model (7). Now set $z_t = \gamma^{\top} X_t$ and let the data decide the value of $\gamma$. Then model (7) can be extended as

$$
y_t = \Phi(\gamma^{\top} X_{t-d})\, X_t + \varepsilon_t, \tag{8}
$$

where $\gamma$ is a directional vector whose first nonzero entry is positive. Model (8) is more flexible than model (7), and the key step is to estimate $\gamma$. We introduce the profile least squares method to estimate model (8). The estimation procedure consists of several steps:

Step 1. Given an initial value of $\gamma$, one obtains the weighted estimator $\hat{\Phi}_{\gamma}$ of the coefficient function in the same way as for model (6).

Step 2. Find the value $\hat{\gamma}$ that minimizes

$$
\sum_{t=s+1}^{n} \left\| y_t - \hat{\Phi}_{\gamma}(\gamma^{\top} X_{t-d})\, X_t \right\|^2. \tag{9}
$$

Step 3. Update the value of $\gamma$ to $\hat{\gamma}$, and repeat Steps 1 and 2 until convergence. The coefficient function $\Phi(\cdot)$ is estimated by $\hat{\Phi}_{\hat{\gamma}}$.

It can be shown that $\hat{\Phi}_{\hat{\gamma}}$ shares the same asymptotic normality as the oracle weighted estimator, that is, the estimator that knows the true value of $\gamma$, since $\hat{\gamma}$ is $\sqrt{n}$-consistent.

### 3.4. Variable selection of multivariate functional-coefficient models

In this section, we consider variable selection of model (6). Increasing the lags p and q will necessarily reduce the sum of squared errors. However, doing so will increase the burden of coefficient estimation and may also lead to overfitting. Hence, for the multivariate functional-coefficient model, order selection is of much importance.

Two widely used model selection criteria are the Akaike Information Criterion (AIC) and the Bayesian Information Criterion (BIC). However, these stepwise methods impose a heavy computational burden and make it difficult to establish asymptotics for the estimation of the selected models. The problems become more severe for high-dimensional data. Various regularization methods have been proposed to deal with these problems. Among them, a popular approach called LASSO, proposed by Tibshirani, performs variable selection and parameter estimation simultaneously. For univariate varying-coefficient regression models with i.i.d. data, Wang and Xia developed a shrinkage estimation method by combining the idea of group LASSO with kernel smoothing. In the following we develop a shrinkage estimation method for the multivariate functional-coefficient model (6):

$$
y_t = \Phi(z_{t-d})\, X_t + \varepsilon_t,
$$

where the functional-coefficient matrix $\Phi(z) = (c(z), \varphi_1(z), \dots, \varphi_p(z), \beta_1(z), \dots, \beta_q(z))$. Since each column of $\Phi(\cdot)$ corresponds to the effect of a component of $X_t$, for variable selection of $X_t$ we should penalize each column of $\Phi(\cdot)$ as a whole. This leads to minimizing

$$
Q_{\lambda}(\Phi) = \sum_{i=s+1}^{n} \sum_{t=s+1}^{n} \left\| y_t - \Phi(z_{i-d})\, X_t \right\|^2 K_{h}(z_{t-d} - z_{i-d}) + \sum_{j=1}^{p+q+1} \lambda_j \left\| \Phi_j \right\|, \tag{10}
$$

where $\Phi_j = (\Phi_j(z_{s+1-d}), \dots, \Phi_j(z_{n-d}))$ with $\Phi_j(\cdot)$ being the $j$th column of $\Phi(\cdot)$, the $\lambda_j$ are tuning parameters, and for any matrix $A$ we use $\|A\|$ to denote its Hilbert-Schmidt norm. It would be interesting to establish model selection consistency and the oracle property of this shrinkage estimation.
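To illustrate how the objective in Eq. (10) combines a kernel-weighted loss with a column-wise group penalty, the sketch below simply evaluates it for a candidate set of coefficient matrices. It assumes NumPy and the Epanechnikov kernel, places one penalty on each column of the coefficient matrix, and does not perform the minimization; the name `penalized_objective` and all inputs are hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def penalized_objective(Y, X, z, Phi, h, lam):
    """Evaluate the kernel-weighted loss plus group penalty.

    Y: n x k responses, X: n x m regressors, z: n values of the smoothing variable,
    Phi: n x k x m candidate coefficient matrices (one per z_i), lam: m penalties.
    """
    loss = 0.0
    for i in range(len(z)):
        w = epanechnikov((z - z[i]) / h) / h          # K_h(z_t - z_i) for all t
        resid = Y - X @ Phi[i].T                      # y_t - Phi(z_i) X_t
        loss += float(np.sum(w * np.sum(resid**2, axis=1)))
    # Hilbert-Schmidt (Frobenius) norm of column j stacked over all z_i.
    col_norms = np.sqrt(np.sum(Phi**2, axis=(0, 1)))
    return loss + float(np.dot(lam, col_norms))

rng = np.random.default_rng(3)
n, k, m = 30, 2, 3
Y = rng.standard_normal((n, k))
X = rng.standard_normal((n, m))
z = rng.uniform(size=n)
Phi = np.ones((n, k, m))
Phi[:, :, 2] = 0.0                                    # third regressor dropped
lam = np.array([1.0, 2.0, 5.0])
q_pen = penalized_objective(Y, X, z, Phi, h=0.2, lam=lam)
q_free = penalized_objective(Y, X, z, Phi, h=0.2, lam=np.zeros(m))
```

A column that is identically zero contributes nothing to the penalty, which is the mechanism by which the corresponding regressor is removed from the model.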

## 4. Multivariate time-varying coefficient models

Parallel to the functional-coefficient model (5), it is natural to consider its alternative with time-varying coefficients:

$$
y_t = c(t/T) + \sum_{i=1}^{p} \varphi_i(t/T)\, y_{t-i} + \sum_{i=1}^{q} \beta_i(t/T)\, x_{t-i} + \varepsilon_t, \quad t = 1, \dots, T, \tag{11}
$$

where $y_t$ is a $k \times 1$ vector, $x_t$ is a $v \times 1$ vector, $c(\cdot)$ is a $k \times 1$ vector, $\varphi_i(\cdot)$ are $k \times k$ smooth matrices and $\beta_i(\cdot)$ are $k \times v$ smooth matrices. The innovation satisfies the same conditions as in model (5). As time evolves, economic conditions tend to change slowly and smoothly. Model (11) reflects this smooth change by allowing the coefficients to be smooth functions of time.

Let

$$
\Phi(t/T) = \left( c(t/T), \varphi_1(t/T), \dots, \varphi_p(t/T), \beta_1(t/T), \dots, \beta_q(t/T) \right).
$$

Using similar arguments to model (6), we can rewrite model (11) as

$$
y_t = \Phi(t/T)\, X_t + \varepsilon_t, \quad t = 1, \dots, T, \tag{12}
$$

where $\Phi(\cdot)$ is a $k \times m$ matrix and $X_t$ is the same as in model (6). By the Taylor expansion, for any $t$ in the neighborhood of $t_0 \in (0, T)$, we have

$$
\Phi(t/T) \approx \Phi(t_0/T) + \Phi'(t_0/T)\,(t - t_0)/T \equiv P + Q\,(t - t_0)/T.
$$

Running the local linear smoother for model (12), we minimize

$$
\sum_{t=s+1}^{T} \left\| y_t - P X_t - Q X_t (t - t_0)/T \right\|^2 K_h(t - t_0) \tag{13}
$$

over $P$ and $Q$, where $s = \max(p, q)$ and $K_h(x) = h^{-1} K(x/(hT))$. Then it is straightforward to obtain an explicit form of the minimizer $(\hat{P}, \hat{Q})$ for the above optimization problem,

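$$
\begin{pmatrix} \mathrm{vec}(\hat{P}) \\ \mathrm{vec}(h \hat{Q}) \end{pmatrix}
=
\begin{pmatrix} S_{T0} & S_{T1} \\ S_{T1} & S_{T2} \end{pmatrix}^{-1}
\begin{pmatrix}
\sum_{t=s+1}^{T} (X_t \otimes I_k)\, y_t\, K_h(t - t_0) \\
\sum_{t=s+1}^{T} (X_t \otimes I_k)\, y_t\, K_h^{(1)}(t - t_0)
\end{pmatrix}, \tag{14}
$$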

where $S_{Ti} = \sum_{t=s+1}^{T} (X_t X_t^{\top}) \otimes I_k\, K_h^{(i)}(t - t_0)$ and $K_h^{(i)}(t - t_0) = (Th)^{-i} (t - t_0)^{i} K_h(t - t_0)$, for $i = 0, 1, 2$.
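The closed form can be checked numerically against a direct weighted least squares fit. The sketch below assumes NumPy and the Epanechnikov kernel, and builds the Kronecker-structured normal equations of Eq. (14) explicitly; the function name `tv_local_linear_closed_form` and the simulated time-varying VAR(1) are illustrative assumptions.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def tv_local_linear_closed_form(Y, X, t0, h):
    """Solve the Kronecker-form normal equations for (vec P, vec hQ) at time t0."""
    T, k = Y.shape
    m = X.shape[1]
    u = (np.arange(1, T + 1) - t0) / (T * h)         # (t - t0) / (Th)
    w = epanechnikov(u) / h                          # K_h(t - t0)
    S = [np.zeros((k * m, k * m)) for _ in range(3)]
    R = [np.zeros(k * m) for _ in range(2)]
    for j in np.flatnonzero(w):
        XX = np.kron(np.outer(X[j], X[j]), np.eye(k))  # (X_t X_t') kron I_k
        Xy = np.kron(X[j], Y[j])                       # (X_t kron I_k) y_t
        for i in range(3):
            S[i] += XX * u[j] ** i * w[j]
        for i in range(2):
            R[i] += Xy * u[j] ** i * w[j]
    lhs = np.block([[S[0], S[1]], [S[1], S[2]]])
    sol = np.linalg.solve(lhs, np.concatenate(R))
    P = sol[:k * m].reshape((k, m), order="F")       # undo column-stacking vec
    hQ = sol[k * m:].reshape((k, m), order="F")
    return P, hQ

rng = np.random.default_rng(5)
T, t0, h = 200, 100, 0.2
y = np.zeros((T + 1, 2))
for t in range(1, T + 1):
    y[t] = (0.6 - 0.4 * t / T) * y[t - 1] + rng.standard_normal(2)
Y = y[1:]
X = np.hstack([np.ones((T, 1)), y[:-1]])
P_hat, hQ_hat = tv_local_linear_closed_form(Y, X, t0, h)

# Cross-check: the same minimizer from a direct weighted least squares solve.
u = (np.arange(1, T + 1) - t0) / (T * h)
sw = np.sqrt(epanechnikov(u) / h)[:, None]
coef, *_ = np.linalg.lstsq(np.hstack([X, X * u[:, None]]) * sw, Y * sw, rcond=None)
P_ls, hQ_ls = coef[:3].T, coef[3:].T
```

In practice the direct least squares route is cheaper; the Kronecker form is mainly useful for deriving the asymptotic variance.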

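Define $M = E[(X_t X_t^{\top}) \otimes I_k]$ and $N = E[(X_t X_t^{\top}) \otimes \sigma_t^{2}]$. Let $\mu_i = \int u^{i} K(u)\,du$, $\nu_i = \int u^{i} K^{2}(u)\,du$,

$$
U = \begin{pmatrix} \mu_0 & \mu_1 \\ \mu_1 & \mu_2 \end{pmatrix}, \qquad
V = \begin{pmatrix} \nu_0 & \nu_1 \\ \nu_1 & \nu_2 \end{pmatrix}.
$$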

Using similar arguments to those for model (6), we can show that this estimator is asymptotically normal with mean zero and variance $\Sigma$, where $\Sigma = (U^{-1} V U^{-1}) \otimes (M^{-1} N M^{-1})$.
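The kernel-dependent factor of this variance depends only on the kernel moments and can be computed once. The following sketch, assuming NumPy and the Epanechnikov kernel (the chapter does not fix a kernel), evaluates the moments by a simple Riemann sum; the function name `kernel_sandwich` is hypothetical.

```python
import numpy as np

def epanechnikov(u):
    """Epanechnikov kernel, supported on [-1, 1]."""
    return 0.75 * np.clip(1.0 - u**2, 0.0, None)

def kernel_sandwich(kernel, lo=-1.0, hi=1.0, n_grid=200001):
    """Return U, V and U^{-1} V U^{-1} built from the kernel moments mu_i and nu_i."""
    u = np.linspace(lo, hi, n_grid)
    du = u[1] - u[0]
    k = kernel(u)
    mu = [np.sum(u**i * k) * du for i in range(3)]       # integrals of u^i K(u)
    nu = [np.sum(u**i * k**2) * du for i in range(3)]    # integrals of u^i K(u)^2
    U = np.array([[mu[0], mu[1]], [mu[1], mu[2]]])
    V = np.array([[nu[0], nu[1]], [nu[1], nu[2]]])
    U_inv = np.linalg.inv(U)
    return U, V, U_inv @ V @ U_inv

U, V, kernel_factor = kernel_sandwich(epanechnikov)
```

For a symmetric kernel the odd moments vanish, so the factor is diagonal; its first entry governs the variance of the coefficient estimate and its second that of the scaled derivative estimate. The full variance would follow from `np.kron(kernel_factor, ...)` with an estimate of the second factor.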

### 4.1. Generalized likelihood ratio tests

The multivariate time-varying coefficient regression model is a flexible and powerful tool for estimating dynamic changes in the coefficients. After fitting a given dataset, some important questions arise, for example, whether the coefficient functions are actually constant or of some particular form. This leads to statistical hypothesis testing. To answer these questions, we develop generalized likelihood ratio statistics for the corresponding testing problems about the coefficient functions.

For the multivariate time-varying coefficient model (12), assume that $\Sigma_0^{-1/2} \varepsilon_t$ has mean zero and covariance matrix $I_k$, with $\Sigma_0$ being a symmetric positive-definite constant matrix.

Consider the following hypothesis testing problem

$$
H_0 : \Phi(t/T) = \Theta_0(t/T) \quad \text{versus} \quad H_a : \Phi(t/T) \neq \Theta_0(t/T), \tag{15}
$$

where $\Theta_0(t/T)$ is some known constant matrix $\Phi_0$ or a set of functional matrices. Let $\hat{\Phi}(t/T)$ denote the nonparametric estimator of $\Phi$, and let $\hat{\Phi}_0(t/T)$ denote the true or estimated value of the coefficients under the null hypothesis. Following Fan et al. and Fan and Jiang, we define a generalized likelihood ratio statistic for testing problem (15):

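$$
\lambda_T(H_0) = \frac{T}{2} \log \frac{\mathrm{RSS}_0}{\mathrm{RSS}_a}, \tag{16}
$$

where $\mathrm{RSS}_0 = \sum_{t=1}^{T} \left( y_t - \hat{\Phi}_0(t/T) X_t \right)^{\top} \Sigma^{-1} \left( y_t - \hat{\Phi}_0(t/T) X_t \right)$ and $\mathrm{RSS}_a = \sum_{t=1}^{T} \left( y_t - \hat{\Phi}(t/T) X_t \right)^{\top} \Sigma^{-1} \left( y_t - \hat{\Phi}(t/T) X_t \right)$, with $\Sigma$ being a known constant covariance matrix from a working model. It is meaningful to study the asymptotic distributions of the test statistic under the null and the alternatives.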

In the following example, we consider the case when $\Theta_0(\cdot)$ is a known constant. For any $u = t/T \in (0, 1)$, if we rewrite the matrix $\Phi(u)$ as a vector, $\Delta(u) \equiv \mathrm{vec}(\Phi_1(u), \dots, \Phi_m(u))$, and denote $\Delta_0(u) \equiv \mathrm{vec}(\Phi_{01}(u), \dots, \Phi_{0m}(u))$, then the power of the test is evaluated against alternatives:

$$
H_a : \Delta(u) = \Delta_0(u) + \frac{1}{\sqrt{Th}}\, G(u), \tag{17}
$$

where $G(u) = (g_1(u), \dots, g_m(u))^{\top}$ is a vector of functions.

Example 1. To investigate the performance of the proposed generalized likelihood ratio test, 600 replications for each of sample sizes T = 200, T = 400 and T = 800 from the multivariate time-varying coefficient model were generated:

$$
y_t = \Phi(t/T)\, X_t + \varepsilon_t, \quad t = 1, \dots, T,
$$

where $k = 2$, $v = p = q = 1$, $\Delta = \mathrm{vec}(0.5, 0.0074, 0.08, 0.65, 0.25, 0.75)^{\top}$. We set the initial values $x_1 = 0$ and $y_1 = (0.15, 0.2)$. Accordingly, $X_t = \mathrm{vec}(y_{1,t-1}, y_{2,t-1}, x_{t-1})$ for $t = 2, \dots, T$. Three distributions of the error term are considered: bivariate normal, bivariate log-normal, and bivariate $t(5)$, each with variance matrix $\Sigma = \begin{pmatrix} 1 & 0.5 \\ 0.5 & 1 \end{pmatrix}$. According to alternative (17), the power of the test is evaluated for a sequence of alternatives indexed by $\theta$:

$$
H_{\theta} : \Delta_{\theta} = (0.5, 0.0075, 0.08, 0.65, 0.25, 0.75)^{\top} + \frac{\theta}{\sqrt{Th}}\, G(t/T), \tag{18}
$$

where $G(t/T) = \left( \sin(2\pi/T),\ 0.09\cos(\pi t/T),\ 0.16\sin(3\pi/T),\ 0.8\sin(2\pi/T),\ 0.3\sin(t/T),\ \cos(1.5\pi t/T) \right)^{\top}$ and $\theta = 0, 0.2, 0.4, 0.6, 0.8, 1$. The power function is estimated by the relative rejection frequency of $H_0$ in the above replicates.

The significance level is set to 5%, and the critical values in the simulations are calculated with the conditional bootstrap method for each given value of $\theta$. The details of this method are as follows:

Step 1. Compute the estimators of the coefficient $\hat{\Phi}(t/T)$ under both the null and the alternative by setting the bandwidth to its estimated optimal value $\hat{h}_{\mathrm{opt}}$.

Step 2. Compute the test statistic $\lambda_T(H_0)$ and the residuals $\{e_t\}$ from the alternative model.

Step 3. For each given $X_t$, draw a bootstrap residual $e_t^{*}$ from the centered empirical distribution of $e_t$ and compute $y_t^{*} = \hat{\Phi}(t/T) X_t + e_t^{*}$. This forms a conditional bootstrap sample $\{(X_t, y_t^{*})\}_{t=1}^{T}$.

Step 4. Compute the test statistic $\lambda_T(H_0)$ using the bootstrap sample constructed in Step 3.

Step 5. Repeat Step 3 and Step 4 to get a sample of the test statistic $\lambda_T(H_0)$. The critical value at significance level $\alpha$ is calculated as the $100(1 - \alpha)$th percentile of the sample.
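The steps above can be sketched as follows. This is a rough illustration under several assumptions that are not taken from the chapter: NumPy, the Epanechnikov kernel, a fixed bandwidth instead of an estimated optimal one, an identity working covariance, the statistic in the usual form (T/2) log(RSS0/RSSa), and a null hypothesis of constant coefficients estimated by OLS. In addition, this sketch generates the bootstrap responses from the null fit, a common way to mimic the null distribution, whereas Step 3 as written uses the fitted time-varying coefficients. All function names are hypothetical.

```python
import numpy as np

def tv_fitted(Y, X, h):
    """Fitted values from a local linear fit of y_t = Phi(t/T) X_t at every t."""
    T, m = X.shape
    u = np.arange(1, T + 1) / T
    fitted = np.empty_like(Y)
    for i in range(T):
        du = u - u[i]
        sw = np.sqrt(0.75 * np.clip(1.0 - (du / h) ** 2, 0.0, None))[:, None]
        D = np.hstack([X, X * du[:, None]])
        coef, *_ = np.linalg.lstsq(D * sw, Y * sw, rcond=None)
        fitted[i] = X[i] @ coef[:m]
    return fitted

def glr_stat(Y, X, h):
    """Return the GLR statistic, the null fit, and the alternative residuals."""
    B0, *_ = np.linalg.lstsq(X, Y, rcond=None)       # constant coefficients (null)
    fit0 = X @ B0
    resid_a = Y - tv_fitted(Y, X, h)
    rss0 = np.sum((Y - fit0) ** 2)
    rssa = np.sum(resid_a ** 2)
    return 0.5 * len(Y) * np.log(rss0 / rssa), fit0, resid_a

def bootstrap_glr_test(Y, X, h, n_boot, alpha, rng):
    """Conditional bootstrap: X is held fixed, residuals are resampled."""
    lam, fit0, resid = glr_stat(Y, X, h)
    resid = resid - resid.mean(axis=0)               # centered residuals
    lam_star = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, len(Y), size=len(Y))
        lam_star[b] = glr_stat(fit0 + resid[idx], X, h)[0]
    crit = float(np.quantile(lam_star, 1.0 - alpha))
    p_value = float(np.mean(lam_star >= lam))
    return float(lam), crit, p_value

# Data simulated under the null: a VAR(1) with constant coefficients.
rng = np.random.default_rng(4)
T = 150
A = np.array([[0.5, 0.1], [0.2, 0.3]])
y = np.zeros((T + 1, 2))
for t in range(1, T + 1):
    y[t] = A @ y[t - 1] + rng.standard_normal(2)
Y = y[1:]
X = np.hstack([np.ones((T, 1)), y[:-1]])
lam, crit, p_value = bootstrap_glr_test(Y, X, h=0.3, n_boot=40, alpha=0.05, rng=rng)
```

The null hypothesis is rejected when `lam` exceeds `crit`; a realistic study would use many more bootstrap replications than the 40 used here to keep the example fast.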

Figure 1 displays the power curves in different scenarios. The power curves look like half of an inverted normal density. All the curves rise monotonically from a height equal to the significance level of 5% until they reach a maximum height of around 90%. It is evident from Figure 1 that the test is powerful for all three distributions of the error terms. Moreover, the test becomes more powerful as the sample size increases. These results indicate that the proposed test keeps its size and is powerful in distinguishing between the null and the alternative.

Figure 1. The power curves for Example 1. Significance level is 5%.

## 5. Conclusions

In this chapter, we have reviewed some parametric and nonparametric methods for modeling nonlinear vector time series data, which include the VAR model, the multivariate threshold autoregressive model, and the multivariate functional-coefficient regression model. These models have great significance in econometric and statistical theory and application. Based on the weighted local least squares estimation, we have proposed a variable selection method for the functional-coefficient model. This model selection procedure is applicable to the proposed multivariate single index models and multivariate time-varying coefficient models. We have also extended the generalized likelihood ratio test to the time-varying coefficient model and demonstrated its performance through simulation. The proposed methodology is useful for modeling the nonlinear dynamic structures inherent in financial data. However, many problems remain unsolved for our procedure, such as the limiting theory for the proposed methodology. Future work includes, but is not limited to, extending our models to nonstationary settings and exploring their performance in different applications.