The time-mean analysis RMSE and the time-mean objective function values in 4 EnKF schemes for Lorenz-96 model when observational errors are spatially correlated and their covariance matrix is correctly specified: (1) EnKF (non-inflation); (2) the SLS scheme in Section 2.1 (SLS); (3) the SLS scheme in Section 2.2 (SLS and new structure); (4) the SLS scheme in the discussion (SLS and true ensemble forecast error). The forcing term F = 12.
The estimation accuracy of ensemble forecast errors is crucial to the assimilation results for all ensemble-based schemes. The ensemble Kalman filter (EnKF) is a widely used scheme in land surface data assimilation, without using the adjoint of a dynamical model. In EnKF, the forecast error covariance matrix is estimated as the sampling covariance matrix of the ensemble forecast states. However, past researches on EnKF have found that it can generally lead to an underestimate of the forecast error covariance matrix, due to the limited ensemble size, as well as the poor initial perturbations and model error. This can eventually result in filter divergence. Therefore, using inflation to further adjust the forecast error covariance matrix becomes increasingly important. In this chapter, a new structure of the forecast error covariance matrix is proposed to mitigate the problems with limited ensemble size and model error. An adaptive procedure equipped with a second-order least squares method is applied to estimate the inflation factors of forecast and observational error covariance matrices. The proposed method is tested on the well-known atmosphere-like Lorenz-96 model with spatially correlated observational systems. The experiment results show that the new structure of the forecast error covariance matrix and the adaptive estimation procedure lead to improvement of the analysis states.
- data assimilation
- ensemble Kalman filter
- error covariance inflation
- observation-minus-forecast residual
- least squares
For state variables in geophysical research fields, a common assumption is that systems have “true” underlying states. Data assimilation is a powerful mechanism for estimating the true trajectory based on the effective combination of a dynamic forecast system (such as a numerical model) and observations . It can produce an optimal combination of model outputs and observations . The combined result is called analysis state, which should be closer to the true state than either the model forecast or the observation is. In fact, the analysis state can generally be treated as the weighted average of the model forecasts and observations, while the weights are approximately proportional to the inverse of the corresponding covariance matrices . Therefore, the results of any data assimilation depend crucially on the estimation accuracy of the forecast and observational error covariance matrices . If these matrices are estimated correctly, then the analysis states can be generated by minimizing an objective function which is technically straightforward and can be accomplished using existing engineering solutions , although finding the appropriate analysis state is still a quite difficult problem when the models are nonlinear [6, 7].
The ensemble Kalman filter (EnKF) is a widely used sequential data assimilation approach, which has been studied and applied since it is proposed by Evensen . It is a practical ensemble-based assimilation scheme that estimates the forecast error covariance matrix using a Monte Carlo method with the short-term ensemble forecast states . In EnKF, the forecast error covariance matrix is estimated as the sampling covariance matrix of the ensemble forecast states, which is usually underestimated due to the limited ensemble size and model error . This finding indicates that the filter is over reliant on the model forecasts and excludes the observations. It may eventually lead to the divergence of the EnKF assimilation scheme [11, 12].
Therefore, the forecast error covariance inflation technique to address this problem becomes increasingly important. One of the error covariance matrix inflation techniques is additive inflation, in which a noise is added to the ensemble forecast states that sample the probability distribution of model error [13, 14]. Another widely used error covariance matrix inflation technique is multiplicative inflation, that is, to multiply the matrix by an appropriate factor. It can be used to mitigate filter divergence by inflating the empirical covariance and increasing the robustness of the filter .
In early studies of multiplicative inflation, researchers determine the inflation factor by repeated experimentation and choose a value according to their prior knowledge . Hence, such experimental tuning is rather empirical and subjective. It is not appropriate to use the same inflation factor during all the assimilation procedure. Too small or too large an inflation factor will cause the analysis state to over rely on the model forecasts or observations and can seriously undermine the accuracy and stability of the filter. In later studies, the inflation factor is estimated online based on the observation-minus-forecast residual (innovation statistic) [16, 17] with different conditions.
Past work shows that moment estimation can facilitate the calculation by solving an equation of the observation-minus-forecast residual and its realization [18–20]. Maximum likelihood approach can obtain a better estimate of the inflation factor than moment approach, although it must calculate a high-dimensional matrix determinant [21–24]. Bayesian approach assumes a prior distribution for the inflation factor but is limited by spatially independent observational errors [25, 26]. Second-order least square estimation focus on minimizing the second-order least squares (SLS)  statistic of the squared observation-minus-forecast residual, which is not very expensive [28–30]. Generalized Cross Validation (GCV) [31, 32] can select a regularization parameter by minimizing the predictive data errors with rotation invariant in a least squares solution .
In practice, the observational error covariance matrix may also need to be adjusted, and an approach can be used to simultaneously optimize inflation factors of both forecast and observational error covariance matrices . This approach is based on the optimization of the likelihood function of observation-minus-forecast residual. However, the likelihood function of observation-minus-forecast residual is nonlinear and involves the computationally expensive determinant and inverse of the residual covariance matrix. As compensation, the second-order least squares statistic of the squared observation-minus-forecast residual can be used as the cost function instead. The main advantage of the SLS cost function is that it is a quadratic function of the inflation factors, and therefore, the analytic forms of the estimators of the inflation factors can be easily obtained. Compared with the method based on maximum likelihood estimation method, the computational cost is significantly reduced.
Furthermore, unlike the sampling covariance matrix of the ensemble forecast states used in the conventional EnKF, a new structure of the forecast error covariance matrix is proposed in this chapter. In ideal situation, an ensemble forecast state is assumed as a random vector with the true state as its ensemble mean. Hence, it is should be defined that the ensemble forecast error is the ensemble forecast states minus true state rather than minus their ensemble mean . This is because in a forecast model with large error and limited ensemble size, the ensemble mean of the forecast states can be very far from the true state. Therefore, the sampling covariance matrix of the ensemble forecast states can be very different from the true forecast error covariance matrix. As a result, the estimated analysis state can be substantially inaccurate. However, in reality, the true state is unknown, but the analysis state is a better estimate of the true state than the forecast state. Therefore, the information feedback from the analysis state can be used to revise the forecast error covariance matrix. In fact, the proposed forecast error covariance matrix is a combination of multiplicative and additive inflation. Bai and Li  also used the feedback from the analysis state to improve assimilation but in a different way.
This chapter consists of four sections. The EnKF scheme with a new structure of the forecast error covariance matrix and the adaptive estimation procedure is proposed in Section 2. The assimilation results on Lorenz model with a correlated observational system are presented in Section 3. Conclusions and discussion are given in Section 4.
2.1. EnKF with SLS inflation scheme
Using the uniform notations for consistency, a nonlinear discrete-time forecast and linear observational system is written as 
where i is the time index; is the n-dimensional true state vector at time step i; is the n-dimensional analysis state vector which is an estimate of , is a nonlinear forecast operator such as a weather forecast model; is an observational vector with dimension ; is an observational matrix of dimension that maps model states to the observational space; and are the forecast error vector and the observational error vector respectively, which are assumed to be statistically independent of each other, time-uncorrelated, and have mean zero and covariance matrices and , respectively. The goal of the EnKF assimilation is to find a series of analysis states that are sufficiently close to the corresponding true states , using the information provided by and .
It is well-known that any EnKF assimilation scheme should include a forecast error inflation scheme. Otherwise, the EnKF may diverge . A procedure for estimating multiplicative inflation factor of and adjustment factor of can be carried out based on the SLS principle. The basic filter algorithm uses perturbed observations , but without localization . The estimation steps of this algorithm equipped with SLS inflation are as follows.
Step 1. Calculate the perturbed forecast states
where is the perturbed analysis states derived from the previous time step (and m is the ensemble size).
Step 2. Estimate the improved forecast and observational error covariance matrices.
The forecast state is defined as the ensemble mean of and the initial forecast error covariance matrix is expressed as
and the initial observational error covariance matrix is . Then, the adjusted forms of forecast and observational error covariance matrices are and , respectively.
There are several approaches for estimating the inflation factors and . Wang and Bishop , Li et al. , and Miyoshi  use the first-order least square of the squared observation-minus-forecast residual to estimate ; Liang et al.  maximizes the likelihood of to estimate and . Here, the SLS approach is applied for estimating and . That is, and are estimated by minimizing the objective function
This leads to
Step 3. Compute the perturbed analysis states.
where is a normal random variable with mean zero and covariance matrix . Here can be effectively computed using the Sherman-Morrison-Woodbury formula [21, 37, 38]. Furthermore, the analysis state is estimated as the ensemble mean of . Finally, set and return to Step 1 for the assimilation at next time step.
2.2. EnKF with SLS inflation and new structure of forecast error covariance matrix
By Eqs. (1) and (3), the ensemble forecast error is defined as . On the other hand, is an estimate of without knowing observations. The ensemble forecast error is initially estimated as , which is used to construct the forecast error covariance matrix in Section 2.1. However, due to limited sample size and model error, can be far from . Therefore, can be a biased estimate of .
Here, the observations can be used for improving the estimation accuracy of ensemble forecast error. The basic sense is as follows: After the analysis state is derived, it should be a better estimate of than the forecast state . Therefore, in Eq. (4) is substituted by for generating a revised forecast error covariance matrix. This procedure can be repeated iteratively until the corresponding objective function (Eq. (5)) converges. For the computational details, Step 2 in Section 2.1 is modified to the following adaptive procedure:
Step 2a. Use Step 2 in Section 2.1 to inflate the initial forecast error covariance matrix to and adjust initial observational error covariance matrix to . Then use Step 3 in Section 2.1 to estimate the initial analysis state and set k = 1.
Step 2b. Update the forecast error covariance matrix as
Then, adjust the forecast and observational error covariance matrices to and, where
are estimated by minimizing the objective function.
If , where is a pre-determined threshold to control the convergence of Eq. (12) and then estimate the k-th updated analysis state as
set k = k + 1 and return back to Eq. (9); otherwise, take and as the estimated forecast and observational error covariance matrices at i-th time step and go to Step 3 in Section 2.1.
which is a multiplicatively inflated sampling error covariance matrix plus an additive inflation matrix (see Appendix B for the proof).
2.3.1. Correctly specified observational error covariance matrix
If the observational error covariance matrix is correctly known, then its adjustment is no longer required. In this case, the inflation factor can be estimated by minimizing the following objective function
This leads to a simpler estimate
2.3.2. Validation statistics
In any toy model, the “true” state is known by experimental design. In this case, the root-mean-square error (RMSE) of the analysis state can be used to evaluate the accuracy of the assimilation results. The RMSE at i-th step is defined as
where and are the k-th components of the analysis state and true state at the i-th time step. In principle, a smaller RMSE indicates a better performance of the assimilation scheme.
3. Experiment on Lorenz 96 model
In this section, the EnKF with SLS inflation assimilation scheme is applied to a nonlinear dynamical system, which has properties relevant to realistic forecast problems: the Lorenz-96 model  with model error and a linear observational system. The performances of the assimilation schemes in Section 2 are evaluated through the following experiments.
3.1. Description of forecast and observational systems
The Lorenz-96 model  is a strongly nonlinear dynamical system with quadratic nonlinearity, which is governed by the equation.
where (; hence, there are 40 variables). For Eq. (18) to be well-defined for all values of k, it is defined that . The dynamics of Eq. (18) are “atmosphere-like” in that the three terms on the right-hand side consist of a nonlinear advection-like term, a damping term, and an external forcing term respectively. These three terms can be thought of as some atmospheric quantity (e.g., zonal wind speed) distributed on a latitude circle. Therefore, the Lorenz-96 model has been widely used as a test bed to evaluate the performance of assimilation schemes in many studies .
The true state is derived by a fourth-order Runge–Kutta time integration scheme . The time step for generating the numerical solution was set at 0.05 nondimensional units, which is roughly equivalent to 6 hours in real time, assuming that the characteristic time-scale of the dissipation in the atmosphere is 5 days . The forcing term was set as F = 8, so that the leading Lyapunov exponent implies an error-doubling time of approximately 8 time steps, and the fractal dimension of the attractor was 27.1 . The initial value was chosen to be when and .
In this study, the synthetic observations were assumed to be generated by adding random noises that were multivariate-normally distributed with mean zero and covariance matrix to the true states. The frequency was four time steps, which can be used to mimic daily observations in practical problems, such as satellite data. The observation errors were assumed to be spatially correlated, which is common in applications involving remote sensing and radiance data. The variance of the observation at each grid point was set to , and the covariance of the observations between the j-th and k-th grid points was as follows:
Since it can deal with spatially correlated observational errors, the scheme may potentially be applied for assimilating remote sensing observations and radiances data.
The model errors by changing the forcing term are added in the forecast model because it is inevitable in real dynamic systems. Thus, different values of F are chose in the assimilation schemes while retaining F = 8 when generating the “true” state. The observations are simulated every four time steps analogizing 1 day in realistic problem for 2000 steps to ensure robust results. The ensemble size is used as 30. The pre-determined threshold to control the convergence of Eq. (12) is set to be 1, because the values of objective functions are in the order of 105. In most cases of the following experiment, the objective functions converge after 3–4 iterations, and the estimated analysis states also converge.
3.2. Comparison of assimilation schemes
In Section 2.1, the EnKF assimilation scheme with SLS error covariance matrix inflation is outlined. In Section 2.2, the improved EnKF assimilation scheme with the SLS error covariance matrix inflation and the new structure of the forecast error covariance matrix are summarized. In the following, the influences of these estimation methods on EnKF data assimilation schemes are assessed using Lorenz-96 model.
Lorenz-96 model is a forced dissipative model with a parameter F that controls the strength of the forcing (Eq. (18)). The model behaviors are quite different with different values of F, and chaotic systems are produced with integer values of F larger than 3. Therefore, several values of F are used to simulate a wide range of model errors. In all cases, the true states were generated by a model with F = 8. These observations were then assimilated into models with F = 4, 5, …, 12.
3.2.1. Correctly specified observational error covariance matrix
Suppose the observational error covariance matrix is correctly specified, the inflation adjustment on is taken in each assimilation cycle and estimate the inflation factors by the methods described in Section 2.1. Then, the adaptive assimilation schemes with the new structure of the forecast error covariance matrix proposed in Section 2.2 are conducted.
Figure 2 shows the time-mean analysis RMSE of the two assimilation schemes averaged over 2000 time steps, as a function of F. Overall, the analysis RMSE of the two assimilation schemes gradually grows as increasing model error. When F is near the true value 8, the two assimilation schemes have almost indistinguishable values of the analysis RMSE. However, when F becomes increasingly distant from 8, the analysis RMSE of the assimilation scheme with the new structure of the forecast error covariance matrix becomes progressively smaller than that of the assimilation scheme with the forecast error covariance matrix inflation only.
For the Lorenz-96 model with large error (such as, the case with F = 12), the time-mean analysis RMSE of the two assimilation schemes is listed in Table 1, as well as the time-mean values of the objective functions. The conventional EnKF assimilation scheme is also included for comparison. These results show clearly that our two schemes have significantly smaller RMSE than the EnKF assimilation scheme. Moreover, the assimilation scheme with the new structure of the forecast error covariance matrix performs much better than assimilation scheme with forecast error covariance matrix inflation only.
|EnKF schemes||Time-mean RMSE||Time-mean L|
|SLS and new structure||1.22||38,125|
|SLS and true ensemble forecast error||0.48||19,652|
3.2.2. Incorrectly specified observational error covariance matrix
In this section, the observational error covariance matrix is supposed to be correct only up to a constant factor. The factor is estimated using different estimation methods, and the corresponding assimilation results are evaluated.
The observational error covariance matrix is set as four times of the true matrix and introduces another factor to adjust . The assimilation schemes are conducted in two cases: (1) inflate the forecast and observational error covariance matrices only (Section 2.1); (2) inflate the forecast and observational error covariance matrices and use the new structure of the forecast error covariance matrix (Section 2.2). Again, the forcing term F takes values 4, 5, …, 12 when assimilating observations, but F = 8 is used when generating the true states in all cases.
Figure 3 shows the time-mean analysis RMSE of the two cases averaged over 2000 time steps, as a function of forcing term. Generally speaking, the analysis RMSE of the two cases gradually grows as the increasing the model error. However, the analysis RMSE generated by using new structure of the forecast error covariance matrix (cases 2) is smaller than those by using the error covariance matrices inflation technique only (cases 1).
For the Lorenz-96 model with forcing term F = 12, the time-mean analysis RMSE of the two cases is listed in Table 2, along with the time-mean values of the objective functions. These results clearly show that when the observational error covariance matrix is incorrectly specified, the assimilation result is much better if the new structure of the forecast error covariance matrix is used (cases 2).
|EnKF schemes||Ensemble size 30||Ensemble size 20|
|Time-mean RMSE||Time-mean L||Time-mean RMSE||Time-mean L|
|SLS and new structure||1.35||41,326||1.45||95,685|
|SLS and true ensemble forecast error||0.58||21,585||0.60||21,355|
The estimated over 2000 time steps in the two cases of using the new structure of the forecast error covariance matrix (cases 2) are plotted in Figure 4. It can be seen that the time-mean value of estimated is 0.45, which is very close to the reciprocal of the constant that is multiplied to the observational error covariance matrix (0.25).
To further investigate the effect of ensemble size on the assimilation result, Figure 3 is reproduced with the ensemble size 20. The results are shown in Figure 5, as well as in Table 2. Generally speaking, Figures 5 is quite similar to Figure 3 but with larger analysis error. This indicates that the smaller ensemble size can lead to the larger forecast error and analysis error. The analysis is also repeated with the ensemble size 10. However in this case, both the inflation and new structure are not effective. This could be due to that the ensemble size 10 is too small to generate robust covariance estimation.
4. Discussion and conclusions
It is well-known that accurately estimating the error covariance matrix is one of the most key steps in any ensemble-based data assimilation. In EnKF assimilation scheme, the forecast error covariance matrix is initially estimated as the sampling covariance matrix of the ensemble forecast states. But due to limited ensemble size and model error, the forecast error covariance matrix is usually an underestimation, which may lead to the divergence of the filter. Therefore, the initially estimated forecast error covariance matrix is multiplied by an inflation factor , and the SLS estimation is proposed to estimate this factor.
In fact, the true forecast error should be represented as the ensemble forecast states minus the true state. However, since in real problems, the true state is not available, the ensemble mean of the forecast states is used instead. Consequently, the forecast error covariance matrix is initially represented as the sampling covariance matrix of the ensemble forecast states. However, for the model with large error, the ensemble mean of the forecast states may be far from the true state. In this case, the estimated forecast error covariance matrix will also remain far from the truth, no matter which inflation technique is used.
To verify this point, a number of EnKF assimilation schemes with necessary error covariance matrix inflation are applied to the Lorenz-96 model but with the forecast state in the forecast error covariance matrix (Eq. (4)) substituted by the true state . The corresponding RMSE are shown in Figures 2–5 and Tables 1 and 2. All the figures and tables show that the analysis RMSE is significantly reduced.
However, since the true state is unknown, the analysis state is used to replace the forecast state , because is closer to than . To achieve this goal, a new structure of the forecast error covariance matrix and an adaptive procedure for estimating the new structure are proposed here to iteratively improve the estimation. As shown in this chapter, the RMSE of the corresponding analysis states are indeed smaller than those of the EnKF assimilation scheme with the error covariance matrix inflation only. For instance, in the experiment in Section 3.1, when the error covariance matrix inflation technique is applied, the RMSE is 1.89 which is much smaller than that for the original EnKF. When the new structure of the forecast error covariance matrix is used in addition to the inflation, the RMSE is reduced to 1.22 (see Table 1).
In the realistic problems, the observational error covariance matrix is not always correctly known, and hence it also needs to be adjusted too. Another factor is introduced to adjust the observational error covariance matrix in this chapter, which can be estimated simultaneously with by minimizing the second-order least squares function of the squared observation-minus-forecast residual.
The second-order least squares function of the squared observation-minus-forecast residual can be a good objective function to quantify the goodness of fit of the error covariance matrix. The SLS method proposed in this chapter can be used to estimate the factors for adjusting both the forecast and observational error covariance matrices, while the first order method can only estimate the inflation factor of the forecast error covariance matrix. The SLS can also provide a criterion for stopping the iteration in the adaptive estimation procedure when the new structure of the forecast error covariance matrix is used. This is important for preventing the proposed forecast error covariance matrix to depart from the truth in the iteration. In most cases in this study, the minimization algorithms converge after several iterations, and the objective function decreases sharply. On the other hand, the improved forecast error covariance matrix indeed leads to the improvement of analysis state. In fact, as shown in Tables 1-2, a small objective function value always corresponds to a small RMSE of the analysis state.
The difference of the EnKF assimilation scheme with SLS inflation is compared to that with maximum likelihood estimation (MLE) inflation . Generally speaking, the RMSE of the analysis state derived using the MLE inflation scheme is a little smaller than that derived using the SLS inflation scheme only but is larger than that derived using the SLS inflation with the new structure of forecast error covariance matrix. For instance, for Lorenz-96 model with forcing term F = 12, the RMSE is 1.69 for MLE inflation, 1.89 for SLS inflation only, and 1.22 for SLS inflation and new structure (Table 1). Whether this is a general rule or not is still unclear and is subject to further investigation. However, in MLE inflation scheme, the objective function is nonlinear and especially involves the determinant of the observation-minus-forecast residual’s covariance matrix, which is quite computationally expensive. The SLS objective function proposed in this chapter is quadratic, so its minimizer is analytic and can be easily calculated.
On the other hand, similar to other inflation schemes with single factor, this study also assumes the inflation factor to be constant in space. Apparently, this is not the case in many practical applications, especially for the cases that the observations are unevenly distributed. Persistently applying the same inflation values that are reasonably large to address problems in densely observed areas to all state variables can systematically overinflate the ensemble variances in sparsely observed areas [13, 26, 42]. Even when the adaptive procedure for estimating the error covariance matrix is applied, the problem may still exist in some extent. In the two case studies conducted here, the observational systems are relatively evenly distributed.
In the future study, we will investigate how to modify the adaptive procedure to suit the system with unevenly distributed observations. We also plan to apply our methodology to error covariance localization [43, 44] and to validate the proposed methodologies using more sophisticated dynamic and observational systems.
This work is supported by the National Natural Science Foundation of China (grant no. 91647202), the National Basic Research Program of China (grant no. 2015CB953703), the National Natural Science Foundation of China (grant no. 41405098) and the Fundamental Research Funds for the Central Universities. The authors gratefully acknowledge the anonymous reviewers for their constructive and relevant comments, which helped greatly in improving the quality of this manuscript. The authors are also grateful to the editors for their hard work and suggestions on this manuscript. Parts of this chapter are reproduced from the authors’ previous publications [29, 30].
The forecast error covariance matrix is inflated to . The estimation of the inflation factors is based on the observation-minus-forecast residual
The covariance matrix of the random vector can be expressed as a second-order regression equation :
where E is the expectation operator and is the error matrix. The left-hand side of (A2) can be decomposed as
Since the forecast and observational errors are statistically independent, we have
From Eq. (2), is the observational error at i-th time step, and hence
Further, since the forecast state is treated as a random vector with the true state as its population mean,
It follows that the second-order moment statistic of error can be expressed as
Therefore, can be estimated by minimizing objective function . Since is a quadratic function of with positive quadratic coefficients, the inflation factor can be easily expressed as
As a bivariate function of and , the first partial derivative with respect to the two parameters respectively are
Since is the ensemble mean forecast, we have
That is, the last two terms of Eq. (B1) vanish. Therefore, the proposed forecast error covariance matrix can be expressed as