
"Kalman Filters - Theory for Advanced Applications", book edited by Ginalber Luiz de Oliveira Serra, ISBN 978-953-51-3828-0, Print ISBN 978-953-51-3827-3, Published: February 21, 2018 under CC BY 3.0 license. © The Author(s).

Chapter 9

Efficient Matrix-Free Ensemble Kalman Filter Implementations: Accounting for Localization

By Elias David Niño Ruiz, Rolando Beltrán Arrieta and Alfonso Manuel Mancilla Herrera
DOI: 10.5772/intechopen.72465



This chapter discusses efficient and practical matrix-free implementations of the ensemble Kalman filter (EnKF) that account for localization during the assimilation of observations. In the EnKF context, an ensemble of model realizations is utilized in order to estimate the moments of its underlying error distribution. Since ensemble members come at a high computational cost (owing to current operational model resolutions), ensemble sizes are constrained to the order of hundreds, while model dimensions typically range in the order of millions. This induces spurious correlations in estimates of prior error correlations when these are approximated via the ensemble covariance matrix. Localization methods are commonly utilized in order to counteract this effect. The EnKF implementations discussed in this chapter are based on a modified Cholesky decomposition, and different flavours of Cholesky-based filters are presented. Furthermore, the computational effort of all formulations is linear with regard to model resolution. Experimental tests are performed making use of the Lorenz 96 model. The results reveal that, in terms of root-mean-square errors, all formulations perform equivalently.

Keywords: ensemble Kalman filter, modified Cholesky decomposition, sampling methods

1. Introduction

The ensemble Kalman filter (EnKF) is a sequential Monte Carlo method for parameter and state estimation in highly nonlinear models. The popularity of the EnKF owes to its simple formulation and relatively easy implementation. In the EnKF, an ensemble of model realizations is utilized in order to, among other things, estimate the moments of the prior error distribution (forecast distribution). During this estimation and the assimilation of observations, many challenges arise in practice. Two of them are addressed in this chapter: the first is the proper estimation of background error correlations, which has a high impact on the quality of the analysis corrections, while the second is the proper estimation of the posterior ensemble when the observation operator is nonlinear. In all cases, background error correlations are estimated based on the Bickel and Levina estimator [1].

2. Preliminaries

We want to estimate the state of a dynamical system $\mathbf{x} \in \mathbb{R}^{n\times 1}$, which (approximately) evolves according to some (imperfect) numerical model operator $\mathbf{x}_k = \mathcal{M}_{t_{k-1} \rightarrow t_k}\left(\mathbf{x}_{k-1}\right)$, for instance, a model which mimics the behavior of the atmosphere and/or the ocean, where $n$ is the model dimension (typically associated with the model resolution). The estimation process is based on two sources of information: a prior estimate of $\mathbf{x}$, $\mathbf{x}^b = \mathbf{x} + \boldsymbol{\xi}$ with $\boldsymbol{\xi} \sim \mathcal{N}\left(\mathbf{0}_n, \mathbf{B}\right)$, and a noisy observation $\mathbf{y} = \mathcal{H}\left(\mathbf{x}\right) + \boldsymbol{\varepsilon}$ with $\boldsymbol{\varepsilon} \sim \mathcal{N}\left(\mathbf{0}_m, \mathbf{R}\right)$, where $m$ is the number of observed model components, $\mathbf{R} \in \mathbb{R}^{m\times m}$ is the estimated data error covariance matrix, $\mathcal{H}: \mathbb{R}^{n\times 1} \rightarrow \mathbb{R}^{m\times 1}$ is the (nonlinear) observation operator that maps the information from the model domain to the observation space, and $\mathbf{0}_n$ is the $n$-dimensional vector whose components are all zeros. In practice, $m \ll n$ or $m < n$. The assimilation process can be performed by using sequential data assimilation methods, and one of the most widely used methods is the ensemble Kalman filter, which is detailed in the following paragraphs.
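As an illustration of this setup, the following minimal sketch draws a synthetic background state and a synthetic observation under the stated error assumptions; the function name, the random seed, and the use of square-root factors of $\mathbf{B}$ and $\mathbf{R}$ are our own illustrative choices, not part of the chapter.

```python
import numpy as np

rng = np.random.default_rng(0)

def synthetic_background_and_obs(x_true, H, B_sqrt, R_sqrt):
    """Illustrative only: draw x^b = x + xi with xi ~ N(0, B), and
    y = H(x) + eps with eps ~ N(0, R), given square roots of B and R."""
    n = x_true.size
    m = R_sqrt.shape[0]
    x_b = x_true + B_sqrt @ rng.standard_normal(n)    # prior (background) estimate
    y = H(x_true) + R_sqrt @ rng.standard_normal(m)   # noisy observation
    return x_b, y
```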

2.1. The ensemble Kalman filter

In the ensemble Kalman filter (EnKF) [2], an ensemble of model realizations

$$\mathbf{X}^b = \left[\mathbf{x}^{b[1]},\,\mathbf{x}^{b[2]},\,\ldots,\,\mathbf{x}^{b[N]}\right] \in \mathbb{R}^{n\times N},$$

is utilized in order to estimate the moments of the background error distribution,

$$\mathbf{x} \sim \mathcal{N}\left(\mathbf{x}^b,\,\mathbf{B}\right),$$

via the empirical moments of the ensemble (1), and therefore,

$$\mathbf{x}^b \approx \bar{\mathbf{x}}^b = \frac{1}{N}\sum_{i=1}^{N}\mathbf{x}^{b[i]} \in \mathbb{R}^{n\times 1},$$

$$\mathbf{B} \approx \mathbf{P}^b = \frac{1}{N-1}\,\Delta\mathbf{X}^b\left(\Delta\mathbf{X}^b\right)^T \in \mathbb{R}^{n\times n},$$

where $N$ is the ensemble size, $\mathbf{x}^{b[i]} \in \mathbb{R}^{n\times 1}$ is the $i$th ensemble member, for $1 \le i \le N$, $\mathbf{x}^b \in \mathbb{R}^{n\times 1}$ is well known as the background state, $\mathbf{B} \in \mathbb{R}^{n\times n}$ stands for the background error covariance matrix, $\bar{\mathbf{x}}^b$ is the ensemble mean, and $\mathbf{P}^b$ is the ensemble covariance matrix. Likewise, the matrix of member deviations $\Delta\mathbf{X}^b \in \mathbb{R}^{n\times N}$ reads,

$$\Delta\mathbf{X}^b = \mathbf{X}^b - \bar{\mathbf{x}}^b\,\mathbf{1}_N^T.$$
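A minimal sketch of how these empirical moments are computed from an $n \times N$ ensemble follows; the helper name is ours, not the chapter's.

```python
import numpy as np

def ensemble_moments(Xb):
    """Empirical background moments of an n x N ensemble Xb."""
    N = Xb.shape[1]
    xb_mean = Xb.mean(axis=1, keepdims=True)     # ensemble mean, n x 1
    dXb = Xb - xb_mean                           # member deviations, n x N
    Pb = (dXb @ dXb.T) / (N - 1)                 # ensemble covariance, n x n
    return xb_mean, dXb, Pb
```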
A variety of EnKF implementations have been proposed in the literature. Most of them rely on basic assumptions about the moments of two probability distributions: the background (prior) and the analysis (posterior) distributions. In most EnKF formulations, normality assumptions are made for both error distributions.

In general, depending on how the assimilation process is performed, an EnKF implementation falls in one of two categories: stochastic filters [3] or deterministic filters [4]. In the context of stochastic filters, for instance, the assimilation of observations can be performed by using any of the next (equivalent) formulations,

$$\mathbf{X}^a = \mathbf{X}^b + \mathbf{A}\,\mathbf{H}^T\mathbf{R}^{-1}\,\Delta\mathbf{Y} \in \mathbb{R}^{n\times N},$$

$$\mathbf{X}^a = \mathbf{X}^b + \mathbf{P}^b\mathbf{H}^T\left[\mathbf{H}\mathbf{P}^b\mathbf{H}^T + \mathbf{R}\right]^{-1}\Delta\mathbf{Y} \in \mathbb{R}^{n\times N},$$

where the analysis covariance matrix $\mathbf{A} \in \mathbb{R}^{n\times n}$ is given by $\mathbf{A} = \left[\left(\mathbf{P}^b\right)^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H}\right]^{-1}$, the matrix of innovations (on the perturbed observations) is denoted by $\Delta\mathbf{Y} = \mathbf{Y}^s - \mathbf{H}\,\mathbf{X}^b \in \mathbb{R}^{m\times N}$, and the matrix of perturbed observations reads $\mathbf{Y}^s = \mathbf{y}\,\mathbf{1}_N^T + \mathbf{E} \in \mathbb{R}^{m\times N}$, where the columns of $\mathbf{E} \in \mathbb{R}^{m\times N}$ are formed by samples from a normal distribution with moments $\mathcal{N}\left(\mathbf{0}_m, \mathbf{R}\right)$. In practice, one of the most utilized formulations is given by (8).
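The following sketch implements one perturbed-observations (stochastic) EnKF step of the kind just described, assuming a linear observation operator given as an $m \times n$ matrix; the names and the dense linear algebra are our own simplifications.

```python
import numpy as np

def enkf_stochastic_update(Xb, H, R, y, rng):
    """Stochastic EnKF analysis with perturbed observations (dense sketch)."""
    N = Xb.shape[1]
    m = y.size
    xb_mean = Xb.mean(axis=1, keepdims=True)
    dXb = Xb - xb_mean
    Pb = (dXb @ dXb.T) / (N - 1)                      # ensemble covariance
    E = np.linalg.cholesky(R) @ rng.standard_normal((m, N))
    Ys = y.reshape(-1, 1) + E                         # perturbed observations
    dY = Ys - H @ Xb                                  # innovations
    S = H @ Pb @ H.T + R                              # innovation covariance
    return Xb + Pb @ H.T @ np.linalg.solve(S, dY)     # analysis ensemble
```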

2.2. Cholesky and modified Cholesky decomposition

The forms of the Cholesky and the modified Cholesky decomposition were discussed by Golub and Van Loan in [5]. If $\mathbf{A} \in \mathbb{R}^{n\times n}$ is symmetric positive definite, then there exists a unique lower triangular $\mathbf{L} \in \mathbb{R}^{n\times n}$ with positive diagonal entries such that $\mathbf{A} = \mathbf{L}\mathbf{L}^T$. If all the leading principal submatrices of $\mathbf{A} \in \mathbb{R}^{n\times n}$ are non-singular, then there exist unique unit lower triangular matrices $\mathbf{L}$ and $\mathbf{M}$ and a unique diagonal matrix $\mathbf{D} = \operatorname{diag}\left(d_1, d_2, \ldots, d_n\right)$ such that $\mathbf{A} = \mathbf{L}\mathbf{D}\mathbf{M}^T$. If $\mathbf{A}$ is symmetric, then $\mathbf{L} = \mathbf{M}$ and $\mathbf{A} = \mathbf{L}\mathbf{D}\mathbf{L}^T$. Golub and Van Loan provide proofs of those decompositions, as well as various ways to compute them. We can use the Cholesky factor of a covariance matrix to quickly solve linear systems that involve that covariance matrix. Note that if we compute the Cholesky factorization $\mathbf{A} = \mathbf{L}\mathbf{L}^T$ and solve the triangular systems $\mathbf{L}\mathbf{y} = \mathbf{b}$ and $\mathbf{L}^T\mathbf{x} = \mathbf{y}$, then

$$\mathbf{x} = \mathbf{L}^{-T}\mathbf{L}^{-1}\,\mathbf{b} = \left(\mathbf{L}\mathbf{L}^T\right)^{-1}\mathbf{b} = \mathbf{A}^{-1}\mathbf{b}.$$
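For instance, assuming scipy is available, a covariance-matrix system can be solved with two triangular solves as just described; this is a sketch and the function name is ours.

```python
from scipy.linalg import cholesky, solve_triangular

def solve_spd(A, b):
    """Solve A x = b for symmetric positive definite A via its Cholesky factor."""
    L = cholesky(A, lower=True)                    # A = L L^T
    y = solve_triangular(L, b, lower=True)         # forward solve:  L y = b
    return solve_triangular(L.T, y, lower=False)   # backward solve: L^T x = y
```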
For a symmetric matrix $\mathbf{A} \in \mathbb{R}^{n\times n}$, the modified Cholesky decomposition involves finding a non-negative diagonal matrix $\mathbf{E}$ such that $\mathbf{A} + \mathbf{E}$ is positive definite. In particular, $\mathbf{E} = \mathbf{0}$ whenever $\mathbf{A}$ is positive definite; otherwise, $\mathbf{E}$ should be sufficiently small. This allows applying the usual factorization to $\mathbf{A} + \mathbf{E}$, i.e., finding matrices $\mathbf{L}, \mathbf{D} \in \mathbb{R}^{n\times n}$ such that $\mathbf{A} + \mathbf{E} = \mathbf{L}\mathbf{D}\mathbf{L}^T$.

Recall $\Delta\mathbf{X}^b = \mathbf{X}^b - \bar{\mathbf{x}}^b\,\mathbf{1}_N^T \in \mathbb{R}^{n\times N}$. Thus, $\Delta\mathbf{x}^{b[i]} \sim \mathcal{N}\left(\mathbf{0}_n, \mathbf{B}\right)$, for $1 \le i \le N$. Let $\mathbf{x}^{[i]} \in \mathbb{R}^{N\times 1}$ be the vector holding the $i$th row across all columns of $\Delta\mathbf{X}^b$, for $1 \le i \le n$. Bickel and Levina in [1] discussed a modified Cholesky decomposition for the estimation of the precision covariance matrix, and this allows, in our context, the estimation of $\mathbf{B}^{-1}$ by fitting regressions of the form:

$$\mathbf{x}^{[i]} = \sum_{j=1}^{i-1} \beta_{i,j}\,\mathbf{x}^{[j]} + \boldsymbol{\varepsilon}_i, \quad 2 \le i \le n,$$

then, $\left\{\mathbf{L}\right\}_{i,j} = -\beta_{i,j}$, for $1 \le j < i \le n$, while the diagonal entries of $\mathbf{D}$ hold the empirical variances of the residuals $\boldsymbol{\varepsilon}_i$.
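A sketch of this estimator, assuming a one-dimensional localization with a given radius of influence (radius $\ge 1$) and ordinary least squares for the regressions, might look as follows; all names are ours rather than the chapter's.

```python
import numpy as np

def modified_cholesky_precision(dXb, radius):
    """Bickel-Levina-type estimate of B^{-1} = L^T D^{-1} L from the n x N
    deviation matrix dXb; each row is regressed on its predecessors within
    the localization radius (radius >= 1)."""
    n, N = dXb.shape
    L = np.eye(n)
    d = np.empty(n)
    d[0] = np.var(dXb[0], ddof=1)
    for i in range(1, n):
        j0 = max(0, i - radius)                          # local predecessors only
        Z = dXb[j0:i].T                                  # N x (i - j0) regressors
        beta, *_ = np.linalg.lstsq(Z, dXb[i], rcond=None)
        L[i, j0:i] = -beta                               # off-diagonal entries of L
        resid = dXb[i] - Z @ beta
        d[i] = np.var(resid, ddof=1)                     # diagonal of D
    return L.T @ np.diag(1.0 / d) @ L                    # estimate of B^{-1}
```

In practice, only the sparse factors $\mathbf{L}$ and $\mathbf{D}$ would be stored rather than the dense product returned here.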

3. Filters based on modified Cholesky decomposition

3.1. Ensemble Kalman filter based on modified Cholesky

In the ensemble Kalman filter based on a modified Cholesky decomposition (EnKF-MC) [6], the analysis ensemble is estimated by using the following equations,

$$\mathbf{X}^a = \mathbf{X}^b + \widehat{\mathbf{A}}\,\mathbf{H}^T\mathbf{R}^{-1}\,\Delta\mathbf{Y} \in \mathbb{R}^{n\times N},$$

$$\widehat{\mathbf{A}} = \left[\widehat{\mathbf{B}}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H}\right]^{-1} \in \mathbb{R}^{n\times n}.$$
In this context, error correlations are estimated via a modified Cholesky decomposition. This provides an estimate of the inverse background error covariance matrix of the form,

$$\widehat{\mathbf{B}}^{-1} = \mathbf{L}^T\,\mathbf{D}^{-1}\,\mathbf{L} \in \mathbb{R}^{n\times n},$$

or equivalently,

$$\widehat{\mathbf{B}} = \mathbf{L}^{-1}\,\mathbf{D}\,\mathbf{L}^{-T} \in \mathbb{R}^{n\times n},$$

where $\mathbf{L} \in \mathbb{R}^{n\times n}$ is a lower-triangular matrix whose diagonal elements are all ones, and $\mathbf{D} \in \mathbb{R}^{n\times n}$ is a diagonal matrix. Moreover, when only local effects are considered during the estimation of $\widehat{\mathbf{B}}^{-1}$, sparse estimators of the background precision matrix can be obtained. In particular, the matrix $\mathbf{L}$ can be sparse, with only a few non-zero elements per row. Typically, the number of non-zero elements depends on the radius of influence used during the estimation of background error correlations; for instance, in the one-dimensional case, the radius of influence bounds the number of non-zero elements, per row, in $\mathbf{L}$. The EnKF-MC is then obtained by plugging the estimator (3) into (6). Given the structure of the Cholesky factors, the EnKF-MC can be seen as a matrix-free implementation of the EnKF.

Recall that the precision analysis covariance matrix reads,

$$\widehat{\mathbf{A}}^{-1} = \widehat{\mathbf{B}}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H} \in \mathbb{R}^{n\times n};$$

moreover, since $\mathbf{H}^T\mathbf{R}^{-1}\mathbf{H} \in \mathbb{R}^{n\times n}$ can be written as a sum of $m$ rank-one matrices, the factors (3) can be updated in order to obtain an estimate of the inverse analysis covariance matrix. The next section discusses such approximation in detail.

3.1.1. Posterior ensemble Kalman filter stochastic (PEnKF-S)

In the stochastic posterior ensemble Kalman filter (PEnKF-S) [7, 8], we want to estimate the moments of the analysis distribution,

$$\mathbf{x} \sim \mathcal{N}\left(\mathbf{x}^a,\,\mathbf{A}\right),$$

based on the background ensemble $\mathbf{X}^b$, where $\mathbf{x}^a$ is the analysis state and $\mathbf{A} \in \mathbb{R}^{n\times n}$ is the analysis covariance matrix. Considering the estimate of the inverse background error covariance matrix in Eq. (3), the precision analysis covariance matrix in Eq. (4) can be approximated as follows:

$$\widehat{\mathbf{A}}^{-1} = \widehat{\mathbf{B}}^{-1} + \mathbf{H}^T\mathbf{R}^{-1}\mathbf{H} = \widehat{\mathbf{B}}^{-1} + \mathbf{X}\,\mathbf{X}^T,$$

where $\mathbf{X} = \mathbf{H}^T\mathbf{R}^{-1/2} \in \mathbb{R}^{n\times m}$. This matrix can be written as follows:

$$\widehat{\mathbf{A}}^{-1} = \widehat{\mathbf{B}}^{-1} + \sum_{i=1}^{m} \mathbf{x}_i\,\mathbf{x}_i^T,$$

where $\mathbf{x}_i$ denotes the $i$th column of matrix $\mathbf{X}$, for $1 \le i \le m$. Consider the sequence of factor updates,

$$\mathbf{L}_i^T\,\mathbf{D}_i^{-1}\,\mathbf{L}_i = \mathbf{L}_{i-1}^T\,\mathbf{D}_{i-1}^{-1}\,\mathbf{L}_{i-1} + \mathbf{x}_i\,\mathbf{x}_i^T = \mathbf{L}_{i-1}^T\left(\mathbf{D}_{i-1}^{-1} + \mathbf{p}_i\,\mathbf{p}_i^T\right)\mathbf{L}_{i-1},$$

where $\mathbf{L}_{i-1}^T\,\mathbf{p}_i = \mathbf{x}_i \in \mathbb{R}^{n\times 1}$, for $1 \le i \le m$, $\widehat{\mathbf{B}}^{-1} = \mathbf{L}_0^T\,\mathbf{D}_0^{-1}\,\mathbf{L}_0$, and

$$\widetilde{\mathbf{L}}_i^T\,\widetilde{\mathbf{D}}_i^{-1}\,\widetilde{\mathbf{L}}_i = \mathbf{D}_{i-1}^{-1} + \mathbf{p}_i\,\mathbf{p}_i^T.$$
We can use Dolittle’s method in order to compute the factors $\widetilde{\mathbf{D}}_i$ and $\widetilde{\mathbf{L}}_i$ in (9), and it is enough to note that,


After some math simplifications, the next equations are obtained,




for $1 \le k < n$ and $1 \le j \le k-1$. The set of Eqs. (19) and (20) can be used in order to derive an algorithm for the rank-one update of Cholesky factors; the updating process is shown in Algorithm 1.

Algorithm 1. Rank-one update of the factors $\mathbf{L}_{i-1}$ and $\mathbf{D}_{i-1}$

1. function UPD_CHOLESKY_FACTORS($\mathbf{L}_{i-1}$, $\mathbf{D}_{i-1}$, $\mathbf{x}_i$)

2.     Compute $\mathbf{p}_i$ from $\mathbf{L}_{i-1}^T\,\mathbf{p}_i = \mathbf{x}_i$

3.     for $k = n, \ldots, 1$ do

4.       Compute $\widetilde{d}_k$ via Eq. (10).

5.       Set $\widetilde{l}_{k,k} \leftarrow 1$.

6.      for $j = 1, \ldots, k-1$ do

7.         Compute $\widetilde{l}_{k,j}$ according to (11).

8.      end for

9.     end for

10.   Set $\mathbf{L}_i \leftarrow \widetilde{\mathbf{L}}_i\,\mathbf{L}_{i-1}$ and $\mathbf{D}_i \leftarrow \widetilde{\mathbf{D}}_i$.

11.   return $\mathbf{L}_i$, $\mathbf{D}_i$

12.  end function

Algorithm 1 can be used in order to update the factors of $\widehat{\mathbf{B}}^{-1}$ for all column vectors in $\mathbf{X}$; this process is detailed in Algorithm 2, and a Python-style sketch of such sequential rank-one updates is given after the listing.

Algorithm 2. Computing the factors $\mathbf{L}_m$ and $\mathbf{D}_m$ of $\widehat{\mathbf{A}}^{-1} = \mathbf{L}_m^T\,\mathbf{D}_m^{-1}\,\mathbf{L}_m$


14.     Set $\mathbf{X} \leftarrow \mathbf{H}^T\mathbf{R}^{-1/2}$.

15.     for $i = 1, \ldots, m$ do

16.        Set $\left[\mathbf{L}_i, \mathbf{D}_i\right] \leftarrow$ UPD_CHOLESKY_FACTORS($\mathbf{L}_{i-1}$, $\mathbf{D}_{i-1}$, $\mathbf{x}_i$)

17.     end for

18.     return $\mathbf{L}_m$, $\mathbf{D}_m$

19. end function
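To convey the flavor of Algorithms 1 and 2 without reproducing their exact formulas (Eqs. (10) and (11) above), the following sketch performs a classical rank-one update of an $\mathbf{L}\mathbf{D}\mathbf{L}^T$ factorization in $\mathcal{O}(n^2)$ operations (following Gill, Golub, Murray, and Saunders); the chapter works with the transposed form $\mathbf{L}^T\mathbf{D}^{-1}\mathbf{L}$ instead, so this is only an analogous illustration, and all names are ours.

```python
import numpy as np

def ldl_rank_one_update(L, d, v):
    """Return factors L_new, d_new such that
    L_new diag(d_new) L_new^T = L diag(d) L^T + v v^T,
    with L unit lower triangular and d a 1-D array of diagonal entries."""
    n = d.size
    L_new, d_new = L.copy(), d.astype(float).copy()
    w = v.astype(float).copy()
    a = 1.0
    for j in range(n):
        p = w[j]
        dj = d_new[j] + a * p * p          # updated diagonal entry
        beta = p * a / dj
        a = d_new[j] * a / dj
        d_new[j] = dj
        w[j + 1:] -= p * L_new[j + 1:, j]  # propagate the rank-one term
        L_new[j + 1:, j] += beta * w[j + 1:]
    return L_new, d_new
```

Applying one such update per column of $\mathbf{X} = \mathbf{H}^T\mathbf{R}^{-1/2}$, as Algorithm 2 does with its own factorization form, yields the factors of the posterior precision matrix.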

Once the updating process has been performed, the resulting factors provide an estimate of the inverse analysis covariance matrix,

$$\widehat{\mathbf{A}}^{-1} = \mathbf{L}_m^T\,\mathbf{D}_m^{-1}\,\mathbf{L}_m \in \mathbb{R}^{n\times n}.$$

From this matrix, the posterior mode of the distribution can be approximated as follows:

$$\bar{\mathbf{x}}^a = \bar{\mathbf{x}}^b + \widehat{\mathbf{A}}\,\mathbf{q},$$

or, equivalently, by solving the linear system

$$\mathbf{L}_m^T\,\mathbf{D}_m^{-1}\,\mathbf{L}_m\left(\bar{\mathbf{x}}^a - \bar{\mathbf{x}}^b\right) = \mathbf{q},$$

with $\mathbf{q} = \mathbf{H}^T\mathbf{R}^{-1}\left(\mathbf{y} - \mathbf{H}\,\bar{\mathbf{x}}^b\right) \in \mathbb{R}^{n\times 1}$. Note that the linear system (23) involves only lower and upper triangular matrices (and a diagonal one), and therefore, $\bar{\mathbf{x}}^a$ can be estimated without the need of matrix inversion. Once the posterior mode is computed, the analysis ensemble is built about it. Note that $\widehat{\mathbf{A}}$ reads $\widehat{\mathbf{A}} = \mathbf{L}_m^{-1}\,\mathbf{D}_m\,\mathbf{L}_m^{-T}$, and therefore, a square root of $\widehat{\mathbf{A}}$ can be approximated as follows:

$$\widehat{\mathbf{A}}^{1/2} = \mathbf{L}_m^{-1}\,\mathbf{D}_m^{1/2},$$

which can be utilized in order to build the analysis ensemble,

$$\mathbf{X}^a = \bar{\mathbf{x}}^a\,\mathbf{1}_N^T + \Delta\mathbf{X}^a \in \mathbb{R}^{n\times N},$$

where $\Delta\mathbf{X}^a \in \mathbb{R}^{n\times N}$ is given by the solution of the linear system,

$$\mathbf{L}_m\,\Delta\mathbf{X}^a = \mathbf{D}_m^{1/2}\,\mathbf{W}.$$

In addition, the columns of $\mathbf{W} \in \mathbb{R}^{n\times N}$ are formed by samples from a multivariate standard normal distribution. Again, since $\mathbf{L}_m$ is lower triangular, the solution of (26) can be obtained readily.
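Assuming the factorization $\widehat{\mathbf{A}}^{-1} = \mathbf{L}_m^T\mathbf{D}_m^{-1}\mathbf{L}_m$ written above and a linear observation matrix, a compact sketch of the posterior mode and of the analysis ensemble construction reads as follows; function and variable names are ours.

```python
import numpy as np
from scipy.linalg import solve_triangular

def penkf_s_analysis(xb_mean, Lm, d_m, H, R_inv, y, N, rng):
    """PEnKF-S analysis sketch: xb_mean, y are 1-D arrays; Lm is unit lower
    triangular; d_m holds the diagonal of D_m; H is m x n; R_inv is m x m."""
    q = H.T @ (R_inv @ (y - H @ xb_mean))                 # q = H^T R^{-1}(y - H xb)
    # Posterior mode: Lm^T D^{-1} Lm (xa - xb) = q, via two triangular solves.
    u = solve_triangular(Lm.T, q, lower=False)            # Lm^T u = q
    z = solve_triangular(Lm, d_m * u, lower=True)         # Lm z = D u
    xa_mean = xb_mean + z
    # Analysis perturbations: Lm dXa = D^{1/2} W, with W standard normal.
    W = rng.standard_normal((Lm.shape[0], N))
    dXa = solve_triangular(Lm, np.sqrt(d_m)[:, None] * W, lower=True)
    return xa_mean[:, None] + dXa                         # analysis ensemble
```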

3.1.2. Posterior ensemble Kalman filter deterministic (PEnKF-D)

The deterministic posterior ensemble Kalman filter (PEnKF-D) is a square root formulation of the PEnKF-S. The main difference between them lies in the use of synthetic data. In both methods, the matrices $\mathbf{A}^{-1}$ and $\mathbf{B}^{-1}$ are computed similarly (i.e., by using Dolittle’s method and the algorithms described in the previous section).

In the PEnKF-D, the computation of $\widehat{\mathbf{B}}^{-1}$ is performed via (13), based on the empirical moments of the ensemble. The analysis update is performed by using the perturbations


which are consistent with the dynamics of the numerical model; for instance, samples from (27) are driven by the physics and the dynamics of the numerical model. Finally, the analysis equation of the PEnKF-D reads,






Since the moments of the posterior distribution are unchanged, we expect both methods PEnKF-S and PEnKF-D to perform equivalently in terms of errors during the estimation process.

3.2. Markov Chain Monte Carlo-based filters

The EnKF definitely represents a breakthrough in the data assimilation context; perhaps its biggest appeal is that a closed-form expression for the analysis members can be obtained. Nevertheless, as can be noted, during the derivation of the analysis equations, some constraints are imposed; for instance, the observation operator is assumed linear, and therefore, the likelihood $P\left(\mathbf{y}\mid\mathbf{x}\right)$ obeys a Gaussian distribution. Of course, in practice, this assumption can easily be broken, and therefore, bias can be introduced in the posterior estimate; this is particularly significant given the application fields in which data assimilation is immersed. From a statistical point of view, more specifically, under the Bayesian framework, this situation can be summarized as follows:

  • The prediction is obtained from the posterior distribution: $P\left(\mathbf{x}\mid\mathbf{y}\right)$.

  • The information from the numerical model is encoded in the prior (background) distribution: $P_b\left(\mathbf{x}\right)$.

  • The information from the observations is incorporated through the likelihood: $P\left(\mathbf{y}\mid\mathbf{x}\right)$.

  • The posterior distribution is calculated as: $P\left(\mathbf{x}\mid\mathbf{y}\right) \propto P_b\left(\mathbf{x}\right)\,P\left(\mathbf{y}\mid\mathbf{x}\right)$.

  • If $P_b\left(\mathbf{x}\right)$ and $P\left(\mathbf{y}\mid\mathbf{x}\right)$ are Gaussian, $P\left(\mathbf{x}\mid\mathbf{y}\right)$ is Gaussian too, and the analysis is equal to the mean of $P\left(\mathbf{x}\mid\mathbf{y}\right)$.

A significant number of approaches have been proposed in the current literature in order to deal with these constraints; for instance, the particle filters (PFs) and the maximum likelihood ensemble filter (MLEF) are ensemble-based methods which can deal with nonlinear observation operators. Unfortunately, in the context of PFs, their use is questionable under realistic weather forecast scenarios since the number of particles (ensemble members) required increases exponentially with the number of components in the model state. In any case, an extended analysis of these methods exceeds the scope of this document, but their exploration is highly recommended.

Taking samples directly from the posterior distribution is a strategy that can help to remove the bias induced by wrong assumptions about the posterior distribution (i.e., normality assumptions). We do not aim at finding the mode of the posterior distribution; instead, we want a set of state vectors that allows us to create a satisfactory representation of the posterior error distribution. Then, based on these samples, it is possible to estimate moments of the posterior distribution from which the analysis ensemble can be obtained [9].

Hereafter, we construct a modification of the sequential data assimilation scheme: first, we describe how to compute the analysis members by using variants of Markov Chain Monte Carlo (MCMC) methods, and then, we include the modified Cholesky decomposition in order to estimate the background precision matrix.

An overview of the proposed method is as follows:

  1. The forecast step is unchanged; the forecast ensemble is obtained by applying the model to the ensemble of states of the previous assimilation cycle.

  2. The analysis step is modified, so that the analysis is no longer obtained by, for instance, (28); instead, we perform $k$ iterations of an algorithm from the Markov Chain Monte Carlo (MCMC) family in order to obtain samples from the posterior distribution.

In order to be more specific in the explanation, let us define Metropolis-Hastings (MH) as the selected algorithm from the MCMC family. Now let us focus on MCMC in the most intuitive way possible. Let us define $J\left(\mathbf{x}\mid\mathbf{y}\right)$ as the posterior pdf obtained from the expression $P\left(\mathbf{x}\mid\mathbf{y}\right) \propto P_b\left(\mathbf{x}\right)\,P\left(\mathbf{y}\mid\mathbf{x}\right)$. MH explores the state space in order to include model states in a so-called Markov Chain. A candidate $\mathbf{x}_c$ is always accepted when $J\left(\mathbf{x}_c\mid\mathbf{y}\right) > J\left(\mathbf{x}_{t-1}\mid\mathbf{y}\right)$, that is, when the value of $J$ for the candidate $\mathbf{x}_c$ is greater than that for the previous vector $\mathbf{x}_{t-1}$; otherwise, it is accepted with probability $\alpha$. Candidates are generated from a pre-defined proposal distribution $Q$; generally, a multivariate Gaussian distribution with mean $\mathbf{x}_{t-1}$ is chosen, and its covariance matrix is defined in order to control the step sizes. Concisely, MCMC methods proceed as follows: at the first iteration, the chain is started with the vector $\mathbf{x}^b$. At the end, the sample is obtained by extracting the last vectors of the chain, and the remaining vectors are dismissed or “burned”. In (32), we can see the expression for the acceptance probability, which relates the target pdf, $J$, with the proposal $Q$:

$$\alpha = \min\left\{1,\; \frac{J\left(\mathbf{x}_c\mid\mathbf{y}\right)\,Q\left(\mathbf{x}_{t-1}\mid\mathbf{x}_c\right)}{J\left(\mathbf{x}_{t-1}\mid\mathbf{y}\right)\,Q\left(\mathbf{x}_c\mid\mathbf{x}_{t-1}\right)}\right\}.$$
The terms that involve Q can be canceled if a symmetric distribution is chosen as the proposal one, for instance, in the Gaussian distribution case. The procedure for the calculation of the analysis applying MH is described below:

  1. Initialize the Markov Chain, $C$, assigning the background value $\mathbf{x}^b$ to $C_0$.

  2. Generate a candidate vector state $\mathbf{x}_c$ from $Q$.

  3. Obtain $U$ from a uniform $(0, 1)$ distribution.

  4. If $U \le \alpha$, then $C_{t+1} = \mathbf{x}_c$; else, $C_{t+1} = C_t$.

  5. Repeat Steps 2 through 4, for $t = 1$ until $t = k-1$.

  6. Remove the first $p$ vectors of the chain, the burned ones.

The analysis is computed over the remaining sample:

$$\bar{\mathbf{x}}^a = \frac{1}{k-p}\sum_{i=p+1}^{k} C_i.$$

A minimal sketch of this procedure is given below.
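The sketch below follows the steps above with a random-walk (symmetric Gaussian) proposal; we work with the logarithm of $J$ for numerical stability, and all names, as well as the isotropic proposal, are our own illustrative choices.

```python
import numpy as np

def metropolis_hastings(log_post, x_b, step, k, p, rng):
    """Random-walk MH: log_post(x) is log J(x|y) up to a constant, x_b starts
    the chain, 'step' scales the proposal, the first p states are burned."""
    chain = [x_b]
    log_j = log_post(x_b)
    for _ in range(k - 1):
        x_c = chain[-1] + step * rng.standard_normal(x_b.size)   # candidate
        log_j_c = log_post(x_c)
        if np.log(rng.uniform()) <= log_j_c - log_j:              # accept
            chain.append(x_c)
            log_j = log_j_c
        else:                                                     # reject
            chain.append(chain[-1])
    sample = np.array(chain[p:])
    return sample.mean(axis=0), sample        # analysis state and posterior sample
```

In the data assimilation setting, log_post would combine the background term and the observational term of the posterior.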
MCMC methods are straightforward to implement; when a sufficient number of iterations is performed, the behavior of the target density is captured by the chain [10]. This is true even for complex density functions such as those with multiple modes. However, simulation-based methods such as MCMC generally explore a discretized grid, and as the mesh is refined, a huge number of iterations is needed before a high-probability zone of the posterior distribution is reached [11]. Concretely, Hu et al. [12] studied a family of dimension-independent MCMC algorithms under the name of preconditioned Crank-Nicolson (pCN) MCMC. These methods are robust with regard to the curse of dimensionality in the statistical context. Initially, the Crank-Nicolson discretization is applied to a stochastic partial differential equation (SPDE) in order to obtain a new expression for the proposal distribution:


$$\mathbf{x}_c = \mathbf{x}_{t-1} - \frac{\delta}{2}\,\mathbf{K}\,\boldsymbol{\mathcal{L}}\left(\mathbf{x}_c + \mathbf{x}_{t-1}\right) + \sqrt{2\,\delta\,\mathbf{K}}\;\boldsymbol{\varepsilon}_0,$$

where $\boldsymbol{\mathcal{L}}$ is the precision matrix of $\mathbf{B}$, $\mathbf{K}$ is a preconditioning matrix, and $\boldsymbol{\varepsilon}_0$ is white noise. If $\mathbf{K} = \mathbf{B}$ and $\delta \in (0, 2)$ in (33), we get the pCN proposal described in (34):

$$\mathbf{x}_c = \sqrt{1-\beta^2}\;\mathbf{x}_{t-1} + \beta\,\boldsymbol{\omega},$$

where $\boldsymbol{\omega} \sim \mathcal{N}\left(\mathbf{0}_n, \mathbf{B}\right)$ and $\beta = \sqrt{8\delta}/\left(2+\delta\right)$. Of course, we use $\mathbf{P}^b$ as a computationally efficient estimate of $\mathbf{B}$. The acceptance probability is defined in (35):

$$\alpha = \min\left\{1,\; \exp\left(\Phi\left(\mathbf{x}_{t-1}\right) - \Phi\left(\mathbf{x}_c\right)\right)\right\},$$

where $\Phi\left(\mathbf{x}\right)$ denotes the negative log-likelihood (data-misfit) term.
The procedure for the calculation of the analysis applying pCN Metropolis-Hastings is described as follows:

  1. Initialize the Markov Chain, $C$, assigning the background value $\mathbf{x}^b$ to $C_0$.

  2. Generate a candidate vector state using (34): $\mathbf{x}_c = \sqrt{1-\beta^2}\;\mathbf{x}_{t-1} + \beta\,\boldsymbol{\omega}$.

  3. Obtain $U$ from a uniform $(0, 1)$ distribution.

  4. If $U \le \alpha$, then $C_{t+1} = \mathbf{x}_c$; else, $C_{t+1} = C_t$.

  5. Repeat Steps 2 through 4, for $t = 1$ until $t = k-1$.

  6. Remove the first $p$ vectors of the chain, the burned ones.

  7. The analysis is calculated over the remaining sample: $\mathbf{x}^a = \frac{1}{k-p}\sum_{i=p+1}^{k} C_i$ (a sketch of this procedure follows).
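Under the assumption that the acceptance probability (35) reduces to the likelihood ratio, as in the standard pCN formulation, a minimal sketch of this procedure reads as follows; the prior mean is taken as zero, matching the proposal in (34), and every name is illustrative.

```python
import numpy as np

def pcn_metropolis_hastings(neg_log_like, x_b, B_sqrt, beta, k, p, rng):
    """pCN-MH sketch: neg_log_like(x) is the data-misfit Phi(x), B_sqrt is any
    square root of the (estimated) background covariance, the first p chain
    states are burned."""
    n = x_b.size
    chain = [x_b]
    phi = neg_log_like(x_b)
    for _ in range(k - 1):
        omega = B_sqrt @ rng.standard_normal(n)                    # omega ~ N(0, B)
        x_c = np.sqrt(1.0 - beta**2) * chain[-1] + beta * omega    # pCN proposal
        phi_c = neg_log_like(x_c)
        if np.log(rng.uniform()) <= phi - phi_c:                   # pCN acceptance
            chain.append(x_c)
            phi = phi_c
        else:
            chain.append(chain[-1])
    sample = np.array(chain[p:])
    return sample.mean(axis=0)                                     # analysis state
```

Here, B_sqrt could be obtained, for instance, from the Cholesky or modified Cholesky factors of the estimated background covariance.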

Finally, in the data assimilation context, by using the modified Cholesky decomposition described in the earlier sections, the pCN-MH filter reads:

  1. The forecast step remains unchanged; the forecast ensemble is obtained by applying the model to the ensemble of states of the previous assimilation cycle, but the background covariance matrix is now estimated via the modified Cholesky decomposition.

  2. The update step is modified, so that the analysis is obtained by running $k$ iterations of pCN-MH in order to obtain a sample from the posterior error distribution.

Now we are ready to numerically test the methods discussed in this chapter.

4. Experimental results

4.1. Stochastic filter PEnKF-S

We assess the accuracy of the PEnKF-S and compare it against that of the LETKF implementation proposed by Hunt [13]. The numerical model is the Lorenz 96 model [11], which mimics the behavior of the atmosphere:

$$\frac{dx_j}{dt} = -x_{j-2}\,x_{j-1} + x_{j-1}\,x_{j+1} - x_j + F, \quad 1 \le j \le n,$$

with periodic boundary conditions ($x_{-1} = x_{n-1}$, $x_0 = x_n$, $x_{n+1} = x_1$), where $n$ is the number of components in the model and $F$ is an external forcing; with the parameter value $F = 8$, the model exhibits chaotic behavior.
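The Lorenz 96 right-hand side and the fourth-order Runge-Kutta step used to propagate states can be sketched as follows (a standard implementation, not the authors' code):

```python
import numpy as np

def lorenz96(x, F=8.0):
    """Right-hand side of the Lorenz 96 model with periodic boundaries."""
    return (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F

def rk4_step(x, dt, F=8.0):
    """One fourth-order Runge-Kutta step."""
    k1 = lorenz96(x, F)
    k2 = lorenz96(x + 0.5 * dt * k1, F)
    k3 = lorenz96(x + 0.5 * dt * k2, F)
    k4 = lorenz96(x + dt * k3, F)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)
```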

The experimental settings are described below:

  1. An initial random solution $\mathbf{x}_{-3}$ is propagated for a while in time by using the Lorenz 96 model and a fourth-order Runge-Kutta method in order to obtain a vector state $\mathbf{x}_{-2}$ whose physics are consistent with the dynamics of such a numerical model. This vector state serves as our reference solution.

  2. The reference solution is perturbed by using samples from a normal distribution with parameters $\mathcal{N}\left(\mathbf{0}_n, \sigma_B\,\mathbf{I}\right)$. Three different values of $\sigma_B$ are considered during the numerical experiments: $\sigma_B \in \{0.05, 0.10, 0.15\}$. This perturbed state is propagated in time in order to make it consistent with the physics and dynamics of the numerical model. From here, an initial background state $\mathbf{x}^b_{-1}$ is obtained.

  3. A similar procedure is performed in order to build a perturbed ensemble about $\mathbf{x}^b_{-1}$. The ensemble members are propagated in time, from which an ensemble of model realizations $\left\{\mathbf{x}^{b[i]}_0\right\}_{i=1}^{N}$ is obtained.

  4. The assimilation window consists of 15 equidistant observations. The frequency of observations is 0.5 time units, which represents 3.5 days in the atmosphere.

  5. The dimension of the vector state is $n = 40$. The external forcing of the numerical model is set to $F = 8$.

  6. The number of observed components is 50% of the dimension of the vector state.

  7. Three ensemble sizes are tried during the experiments: $N \in \{20, 40, 60\}$.

  8. As a measure of quality, the $L_2$ norm of the difference between the analysis state and the reference solution is computed across assimilation steps.

  9. A total of 100 runs are performed for each pair $(N, \sigma_B)$. For each run, a different initial random vector is utilized in order to build the initial perturbed reference solution $\mathbf{x}_{-3}$ (before the model is applied to obtain $\mathbf{x}_{-2}$). This yields different initial ensembles as well as different synthetic data for each run of a given configuration (pair). A sketch of this experimental protocol is given after this list.
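The sketch below reuses the rk4_step function above and accepts any of the filters in this chapter as the assimilate argument; spin-up lengths, the observation error variance, and all helper names are our own assumptions rather than the chapter's settings.

```python
import numpy as np

def forecast(x, time, dt=0.01):
    """Propagate a state over 'time' units with small RK4 sub-steps."""
    for _ in range(int(round(time / dt))):
        x = rk4_step(x, dt)
    return x

def run_experiment(assimilate, n=40, N=20, sigma_b=0.05, n_steps=15,
                   obs_freq=0.5, obs_var=0.01, seed=0):
    """Experimental protocol sketch; 'assimilate' has signature (Xb, H, R, y, rng)."""
    rng = np.random.default_rng(seed)
    x_ref = forecast(rng.standard_normal(n), 10.0)           # spin-up reference run
    x_b = forecast(x_ref + sigma_b * rng.standard_normal(n), 1.0)
    Xb = np.column_stack([forecast(x_b + sigma_b * rng.standard_normal(n), 1.0)
                          for _ in range(N)])                 # perturbed ensemble
    H = np.eye(n)[::2]                                        # observe 50% of components
    R = obs_var * np.eye(H.shape[0])
    errors = []
    for _ in range(n_steps):
        x_ref = forecast(x_ref, obs_freq)                     # propagate the truth
        Xb = np.column_stack([forecast(Xb[:, i], obs_freq) for i in range(N)])
        y = H @ x_ref + np.sqrt(obs_var) * rng.standard_normal(H.shape[0])
        Xb = assimilate(Xb, H, R, y, rng)                     # analysis -> next background
        errors.append(np.linalg.norm(Xb.mean(axis=1) - x_ref))
    return errors
```

For instance, run_experiment(enkf_stochastic_update) reproduces the general flow of these experiments, although not the exact settings of the chapter.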

The average of the error norms for each pair $(N, \sigma_B)$ for the LETKF and the PEnKF implementations is shown in Table 1. As can be seen, on average across 100 runs, the proposed EnKF implementation outperforms the LETKF in terms of the $L_2$ norm of the error. Moreover, the PEnKF-S seems to be insensitive to the initial background error $\sigma_B$, and in all cases, when the ensemble size is increased, a better estimation of the reference state $\mathbf{x}$ at the different observation times is obtained. This can also be attributed to the estimation of background error correlations via the modified Cholesky decomposition, which improves drastically whenever the ensemble size is increased, as pointed out by Bickel and Levina in [1]; in such a case, the error decreases as $\mathcal{O}\left(\log n / N\right)$. This is crucial in the PEnKF-S formulation since estimates of the precision analysis covariance matrix are obtained by rank-one updates on the inverse background error covariance matrix. On the other hand, in the LETKF context, increasing the ensemble size can improve the accuracy of the method, but it is still not better than that of the PEnKF-S.


Table 1.

Average of the $L_2$ norm of errors for 100 runs of each configuration $(\sigma_B, N)$ for the compared filter implementations.

Some plots of the $L_2$ norm of the error for the PEnKF and the LETKF across different configurations and runs are shown in Figure 1. Note that the error of the PEnKF decreases sharply from the earliest iterations. In the LETKF context, the accuracy is similar to that of the PEnKF only at the end of the assimilation window.


Figure 1.

Local ensemble transform Kalman filter (LETKF).

4.2. Deterministic filter PEnKF-D

Holding the settings from the previous section, experiments are performed by using the PEnKF-D. The results exhibit a behavior similar to that of the LETKF and the PEnKF-S implementation proposed by Niño et al. in [8]. This can be appreciated in Figures 1-3. As can be seen, the error distributions reveal a similar behavior across all compared filters.


Figure 2.

Posterior ensemble Kalman filter stochastic (PEnKF-S).


Figure 3.

Posterior ensemble Kalman filter deterministic (PEnKF-D).

4.3. MCMC filter based on modified Cholesky decomposition

We now describe experiments with our proposed filter based on a modified Cholesky decomposition and the preconditioned Crank-Nicolson Metropolis-Hastings. Again, the numerical model is the Lorenz 96 model. In this section, we are mainly interested in describing the behavior of the method, especially when the observation operator is nonlinear.

The experimental settings are as follows:

  1. The observation operator is quadratic: $\left[\mathcal{H}\left(\mathbf{x}\right)\right]_i = x_i^2$, where $x_i$ denotes the $i$th component of the vector state $\mathbf{x}$.

  2. The vector of states has n=40 components.

  3. The ensemble size is N=20 .

  4. The length of the Markov Chain is 1000.

  5. The proposal is pCN with β=0.09 .

Figure 4 shows the distribution of six of the 40 components of the state vector after 1000 iterations of the algorithm, in order to visualize whether the actual value is within or near the region of highest probability density described by the sample contained in the Markov Chain.


Figure 4.

Histograms of six components of the state vector after 1000 iterations of the algorithm; the dashed lines indicate the position of the true value for the respective component.

Note that the actual values of the model components were not in all cases located at the peaks of the histograms, but most of them lie within or near zones of high probability described by the sample. This result is important if we take into account that the posterior distribution is probably not normal, as is the case with quadratic observation operators.

5. Conclusions

In this chapter, efficient EnKF implementations were discussed. All of them are based on a modified Cholesky decomposition wherein an estimate of the background precision matrix is obtained in terms of Cholesky factors. In the first filter, the PEnKF-S, synthetic data are utilized in order to compute the posterior members, as is done in stochastic formulations of the filter. Moreover, a sequence of rank-one updates can be applied over the factors of the prior precision matrix in order to estimate those of the posterior precision matrix. In the second filter, the PEnKF-D, synthetic data are avoided by using perturbations obtained from the physics and the dynamics of the numerical model. Finally, an MCMC-based filter is obtained in order to reduce the impact of bias when Gaussian assumptions are broken during the assimilation of observations, for instance, when the observation operator is nonlinear. Numerical experiments with the Lorenz 96 model reveal that the proposed filters are comparable to filters from the specialized literature.


Acknowledgements

This work was supported by the Applied Math and Computational Science Lab (AML-CS) at Universidad del Norte in Barranquilla, Colombia.


References

1 - Bickel PJ, Levina E. Covariance regularization by thresholding. The Annals of Statistics. 2008;36(6):2577-2604. DOI: 10.1214/08-AOS600
2 - Evensen G. The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dynamics. 2003;53(4):343-367
3 - Burgers G, van Leeuwen PJ, Evensen G. Analysis scheme in the ensemble Kalman filter. Monthly Weather Review. 1998;126(6):1719-1724
4 - Sakov P, Oke PR. A deterministic formulation of the ensemble Kalman filter: An alternative to ensemble square root filters. Tellus A. 2008;60(2):361-371
5 - Coleman TF, Van Loan C. Handbook for matrix computations. Society for Industrial and Applied Mathematics, 1988
6 - Nino-Ruiz ED, Sandu A, Deng X. A parallel implementation of the ensemble Kalman filter based on modified Cholesky decomposition. Journal of Computational Science. 2017
7 - Nino-Ruiz ED, Mancilla A, Calabria JC. A posterior ensemble Kalman filter based on a modified Cholesky decomposition. Procedia Computer Science. 2017;108:2049-2058
8 - Nino-Ruiz ED. A matrix-free posterior ensemble Kalman filter implementation based on a modified Cholesky decomposition. Atmosphere. 2017;8(7):125
9 - Attia A, Sandu A. A hybrid Monte Carlo sampling filter for non-gaussian data assimilation. AIMS Geosciences. 2015;1:41-78
10 - Higdon D, Reese CS, Moulton JD, Vrugt JA, Fox C. Posterior exploration for computationally intensive forward models. In: Brooks S, Gelman A, Jones GL, Meng X-L, editors. Handbook of Markov Chain Monte Carlo. Chapman & Hall; 2011. p. 401-418
11 - Cotter SL, Roberts GO, Stuart AM, White D, et al. MCMC methods for functions: Modifying old algorithms to make them faster. Statistical Science. 2013;28(3):424-446
12 - Hu Z, Yao Z, Li J. On an adaptive preconditioned Crank-Nicolson MCMC algorithm for infinite dimensional Bayesian inference. Journal of Computational Physics. 2017;332:492-503
13 - Fatkullin I, Vanden-Eijnden E. A computational strategy for multiscale systems with applications to Lorenz 96 model. Journal of Computational Physics. 2004;200(2):605