Abstract
This chapter is devoted to different types of optimal perturbations (OPs): deterministic, stochastic, OPs in an invariant subspace, and simultaneous stochastic perturbations (SSPs). The definitions of the OPs are given. It is shown how important the OPs are for studying the predictability of system dynamics and for generating ensemble forecasts, as well as in the design of a stable filter. A variety of SSP-based algorithms for estimating and decomposing very high-dimensional (Hd) matrices are presented. Numerical experiments illustrate the efficiency and benefits of the perturbation technique.
Keywords
- predictability
- optimal perturbation
- invariant subspace
- simultaneous stochastic perturbation
- dynamical system
- filter stability
- estimation of high-dimensional matrix
1. Introduction
The study of high-dimensional systems (HdS) constitutes one of the most important research subjects today, thanks to the exponential increase in the power and speed of computers: according to Moore’s law, the number of transistors in a dense integrated circuit doubles approximately every 2 years (see Myhrvold [1]). However, this exponential increase is still far from sufficient to meet the great demand on computational and memory resources when implementing optimal data assimilation algorithms (like the Kalman filter (KF) [2], for example) in operational forecasting systems (OFS).
This chapter is devoted to the role of perturbations as an efficient tool for studying the predictability of dynamical systems, for ensemble forecasting, and for overcoming the difficulties in the design of data assimilation algorithms, in particular of optimal adaptive filtering for extremely HdS. In [3, 4], Lorenz studied the problem of predictability of the atmosphere. He found that the atmosphere is a chaotic system and that the predictability limit for a numerical forecast is about 2 weeks. This predictability barrier has to be overcome in order to further extend the forecast period. The fact that estimates of the current state are inaccurate and that numerical models have inadequacies leads to forecast errors that grow with increasing forecast lead time. Ensemble forecasting aims at quantifying this flow-dependent forecast uncertainty. Today, a medium-range forecast has become a standard product. In the 1990s, the ensemble forecasting (EnF) technique was introduced in operational centers such as the European Centre for Medium-Range Weather Forecasts (ECMWF) (see Palmer et al. [5]) and the NCEP (US National Centers for Environmental Prediction) (Toth and Kalnay [6]). A single forecast can depart rapidly from the real atmosphere. The idea of ensemble forecasting is to add perturbations around the control forecast to produce a collection of forecasts that better simulate the possible uncertainties in a numerical forecast. The ensemble mean can then act as a nonlinear filter whose skill is higher than that of the individual members in a statistical sense (Toth and Kalnay [6]).
The chapter is organized as follows. Section 2 first outlines the optimal perturbation (OP) theory and the important role the OP plays in seeking the fastest-growing direction of the prediction error (PE). The predictability theory of dynamical systems as well as the stability of filtering algorithms are all developed on the basis of OPs. The definition of the optimal deterministic perturbation (ODP) and some theoretical results on the ODP are introduced. It is found that the ODP is associated with the right singular vector (SV) of the system dynamics. In Section 3, two other classes of ODPs are presented: the leading eigenvector (EV) and the real Schur vector (SchV) of the system dynamics. Note that the first EV is the ODP in the eigen invariant subspace (EI-InS) of the system dynamics. As to the leading SchV, it is the ODP in the Schur invariant subspace (Sch-InS), which is closely related to the EI-InS in the sense that the subspace of the leading SchVs, generated by the sampling procedure (Sampling-P, Section 3), converges to the EI-InS. In Section 4, we present another type of OP, called the optimal stochastic perturbation (OSP). Note that the OSP is a natural extension of the ODP which gives insight into what represents the fastest-growing PE and how one can produce it by stochastically perturbing the initial state. One important class of perturbations (known as simultaneous stochastic perturbations, SSP) is presented in Section 5. It will be shown that the SSP is very efficient for solving optimization problems in a high-dimensional (Hd) setting. Different algorithms for estimating and decomposing Hd matrices are also presented here. Numerical examples are presented in Section 6 to illustrate the theoretical results and the efficiency of the OPs in solving data assimilation problems. The experiment on data assimilation in the Hd ocean model MICOM by the filters constructed on the basis of the Schur ODPs and SSPs is presented in Section 7.
The concluding remarks are presented in Section 8.
2. Optimal perturbations: predictability and filter stability
2.1. Stability of filter
The behavior of the atmosphere or ocean is recognized as highly sensitive to initial conditions: a small change in an initial condition can strongly alter the trajectory of the system. It is therefore important to know the directions of rapid growth of the system state. Research on OPs aims precisely at finding methods to better capture these rapidly growing directions of the system dynamics and thereby to optimize the predictability of the physical process under consideration.
To explain this phenomenon more clearly, consider a standard linear filtering problem
where
where
From Eq. (2), it can be shown that the transition matrix for the filtered estimate equation is expressed by
For HdS, the KF gain (3) is impossible to compute. In Hoang et al. [7], it is suggested to seek the gain with the structure
with
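Since the filtering equations above were given only in outline, a minimal sketch of one predict/update cycle of the standard KF may help fix ideas. The symbol names here (dynamics F, observation operator H, noise covariances Q and R) are our own notation, not necessarily that of Eqs. (1)–(3):

```python
import numpy as np

def kalman_step(x, P, F, H, Q, R, y):
    """One predict/update cycle of the standard Kalman filter.

    x, P : current filtered state and error covariance
    F, H : dynamics and observation matrices
    Q, R : model and observation noise covariances
    y    : new observation
    """
    # Prediction step
    x_f = F @ x
    P_f = F @ P @ F.T + Q
    # Gain and update (innovation form)
    S = H @ P_f @ H.T + R
    K = P_f @ H.T @ np.linalg.inv(S)
    x_a = x_f + K @ (y - H @ x_f)
    P_a = (np.eye(len(x)) - K @ H) @ P_f
    return x_a, P_a, K
```

For HdS, it is precisely the covariance products in this cycle that become intractable, which motivates the structured gains discussed above.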
2.2. Singular value decomposition and optimal perturbations
Consider the singular value decomposition (SVD) of
where
Suppose the system is
On the other hand, in practice, for extreme HdS, one cannot compute all elements of
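When all elements of the dynamics matrix cannot be formed, the leading right SV (the direction of maximal error amplification) can still be approximated using only matrix-vector products. A sketch via power iteration on the normal operator (function names are ours):

```python
import numpy as np

def leading_right_sv(apply_A, apply_At, n, iters=200, seed=0):
    """Power iteration on A^T A using only matrix-vector products.

    Returns (sigma_max, v): the largest singular value and the
    corresponding right singular vector, i.e. the direction of
    maximal amplification ||A v|| / ||v||.
    """
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(n)
    v /= np.linalg.norm(v)
    for _ in range(iters):
        w = apply_At(apply_A(v))   # one application of A^T A
        v = w / np.linalg.norm(w)
    sigma = np.linalg.norm(apply_A(v))
    return sigma, v
```

Only the routines applying the dynamics and its adjoint to a vector are needed, which is exactly the situation for extreme HdS.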
2.3. Optimal perturbation
Let
One sees the perturbation
In general, the perturbation
The ensemble-based filtering (EnBF) algorithm is simplified for the standard linear filtering problems with
2.3.1. Optimal deterministic perturbation
Introduce
where
under the constraint (7). One can prove
where
2.3.2. Subspaces of ODPs
Introduce
Consider the optimization problem
under the constraint (7). Similar to the proof of Lemma 2.1, one can prove
where
By iteration, for
applying Lemma 2.2 with slight modifications, one finds that the OPs for
The OP for
The weighted norm
As to the PE, a normalization is also applied in order to have a possibility to compare different variables like density, temperature, velocity … In this situation, the norm for
2.4. Ensemble forecasting
The idea of ensemble forecasting is that instead of performing “deterministic” forecasts, stochastic forecasts should be made: several model forecasts are performed by introducing perturbations in the filtered estimate or in the models.
Since 1994, NCEP (National Centers for Environmental Prediction, USA) has been running 17 global forecasts per day, with the perturbations obtained using the method of breeding growing perturbations. This ensures that the perturbations contain growing dynamical structures. The length of the forecasts allows the generation of an outlook for the second week. At the ECMWF, the perturbation method is based on the use of SVs, which grow even faster than the bred or Lyapunov vector perturbations. The ECMWF ensemble contains 50 members [13].
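The breeding cycle mentioned above can be illustrated with a toy sketch (our simplification, not the operational NCEP implementation): a perturbed forecast is run alongside the control, and their difference is rescaled to a fixed amplitude at the end of each cycle, so that the bred perturbation aligns with the growing directions of the flow:

```python
import numpy as np

def breed(model_step, x0, n_cycles=50, amplitude=1e-3, seed=0):
    """Toy breeding cycle: grow and rescale a perturbation.

    model_step : function advancing a state over one breeding interval
    Returns the bred perturbation, aligned with growing directions.
    """
    rng = np.random.default_rng(seed)
    dx = rng.standard_normal(x0.shape)
    dx *= amplitude / np.linalg.norm(dx)
    x = x0.copy()
    for _ in range(n_cycles):
        x_pert = model_step(x + dx)           # perturbed forecast
        x = model_step(x)                     # control forecast
        dx = x_pert - x                       # grown difference
        dx *= amplitude / np.linalg.norm(dx)  # rescale to initial size
    return dx
```

For a linear model this reduces to power iteration, so the bred vector converges to the leading growing direction; for nonlinear models it samples flow-dependent growing errors.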
3. Perturbations based on leading EVs and SchVs
3.1. Adaptive filter (AF)
The idea underlying the AF is to construct a filter which uses feedback, in the form of the PE signal (innovation), to adjust the free parameters in the gain so as to optimize the filter performance. While in the KF optimality is defined as a minimum mean squared error (MMS) of the state estimate, in the AF optimality is understood in the sense of MMS for the prediction output error (innovation). This definition formulates the optimality of the filter in the realization space rather than in the probability space, as is done in the KF.
The optimal gain can thus be determined by solving the optimization problem through adjusting all the elements of the filter gain. There are two major difficulties:
Instability: As the filter gain is estimated stochastically during the optimization process, the filter may become unstable due to the stochastic character of the filter gain.
Reduction of tuning parameters: For extreme HdS, the number of elements in the filter gain is still very high. Reducing the number of tuned gain elements is necessary.
3.2. Leading EVs and SchVs as optimal perturbations
Interest in the stability of the AF arose soon after the AF was introduced. The study of filter stability shows that it is possible to ensure filter stability when the system is
3.3. EVs as optimal perturbations in the invariant subspace
Let
The subspace of
It is seen that the optimal solution is the first EV
For a symmetric matrix, the EI-OP coincides with the SOP. The EI-OP is not unique.
By solving the optimization problem (8) s.t.
one finds the second EI-OP
for
In general, for a defective case (not diagonalizable),
To summarize, let the EV decomposition be
where
with
3.4. Dominant SchVs as OPs in the Schur invariant subspace
The study of Hoang et al. [8] shows that the subspace of projection of the stable filter can be constructed on the basis of all unstable EVs or SchVs of
Compared to the EVs, the approach based on the real Schur decomposition is preferable in practice since the SchVs are real and orthonormal. Moreover, there exists a simple, power-type iterative algorithm for approximating the set of real leading SchVs. According to Theorem 7.3.1 in Golub and Van Loan [9], the subspace
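This power-type iteration (orthogonal, or subspace, iteration in the terminology of Golub and Van Loan [9]) can be sketched as follows, again assuming only matrix-vector products with the dynamics are available:

```python
import numpy as np

def orthogonal_iteration(apply_A, n, p, iters=300, seed=0):
    """Orthogonal (subspace) iteration.

    Q converges to an orthonormal basis of the dominant p-dimensional
    invariant subspace of A, i.e. the span of the p leading Schur
    vectors, at a rate governed by |lambda_{p+1} / lambda_p|.
    """
    rng = np.random.default_rng(seed)
    Q, _ = np.linalg.qr(rng.standard_normal((n, p)))
    for _ in range(iters):
        Z = np.column_stack([apply_A(Q[:, j]) for j in range(p)])
        Q, _ = np.linalg.qr(Z)   # re-orthonormalize the propagated basis
    return Q
```

The resulting basis is real and orthonormal by construction, which is exactly the practical advantage of the SchVs over the EVs noted above.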
4. Optimal stochastic perturbation (OSP)
In Section 2, the perturbation
We will consider now
where
All elements of
Introduce the set of RVS
under the constraint (19). One can prove
where
Introduce
and consider the objective function
By iteration, for
applying Lemma 4.3 with slight modifications, one finds that the OSP for
5. Simultaneous stochastic perturbations (SSP)
In [16], Spall proposes a simultaneous perturbation stochastic approximation (SPSA) algorithm for finding optimal unknown parameters by minimizing some objective function. The main feature of the simultaneous perturbation gradient approximation (SPGA) lies in the way the gradient vector is approximated (on average): a sample gradient vector is estimated by perturbing all components of the unknown vector simultaneously in a stochastic way. This method requires only two or three measurements of the objective function, regardless of the dimension of the vector of unknown parameters. In Hoang and Baraille [17], the idea of the SPGA is described in detail, with a wide variety of applications in engineering domains. The application to estimation of the ECM in the filtering problem is given in Hoang and Baraille [18]. In Hoang and Baraille [19], a simple algorithm is proposed for estimating the elements of an unknown matrix, as well as a way to decompose the estimated matrix into a product of two matrices, under the condition that only the matrix-vector product is accessible.
5.1. Theoretical background of SPGA: gradient approximation
Component-wise perturbation is a method for numerically computing the gradient of the cost function with respect to the vector of unknown parameters. It is based on the idea of perturbing each component of the vector of parameters separately. For very HdS, this technique is impossible to implement. An alternative to component-wise perturbation is the SSP approach.
Let
denote the gradient of
Suppose
where
Dividing both sides of the last equality by
Taking the mathematical expectation for both sides of the last equation yields
One sees that from the assumptions on
The result expressed by Eq. (28) constitutes a basis for approximating the gradient vector by simultaneous perturbation. The left of Eq. (28) can be easily approximated by noticing that for an ensemble of
where
Introduce the notations
Theorem 1 (Hoang and Baraille [19]) states that the estimate
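In its simplest form, the simultaneous perturbation gradient estimate for a cost function J can be sketched as follows (a generic illustration of the two-evaluation estimate with Bernoulli ±1 perturbations; the step size and sample count are illustrative):

```python
import numpy as np

def spsa_gradient(J, x, c=1e-3, n_samples=100, seed=0):
    """Simultaneous perturbation gradient approximation.

    Each sample perturbs ALL components of x at once with a random
    +/-1 vector Delta and uses only two evaluations of J:
        g_i ~ [J(x + c*Delta) - J(x - c*Delta)] / (2*c*Delta_i).
    Averaging over samples approximates the true gradient.
    """
    rng = np.random.default_rng(seed)
    g = np.zeros_like(x)
    for _ in range(n_samples):
        delta = rng.choice([-1.0, 1.0], size=x.shape)
        diff = J(x + c * delta) - J(x - c * delta)
        g += diff / (2.0 * c * delta)   # elementwise: Delta_i = +/-1
    return g / n_samples
```

The cost of one sample is independent of the dimension of x, which is the key property exploited for Hd optimization.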
5.2. Algorithm for estimation of an unknown matrix
Let
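The principle behind such matrix estimation can be illustrated as follows (a sketch of the underlying identity, not the exact Algorithm 5.1 of [19]): for a perturbation vector u with i.i.d. ±1 entries, E[u uᵀ] = I, so the sample average of the outer products (Φu) uᵀ over many perturbations converges to Φ, even though only the products Φu are accessible:

```python
import numpy as np

def estimate_matrix(apply_Phi, n, n_samples=5000, seed=0):
    """Estimate an unknown n x n matrix from matrix-vector products only.

    Since E[u u^T] = I for u with i.i.d. +/-1 entries,
    E[(Phi u) u^T] = Phi, so averaging the outer products
    (Phi u) u^T over many random perturbations estimates Phi.
    """
    rng = np.random.default_rng(seed)
    est = np.zeros((n, n))
    for _ in range(n_samples):
        u = rng.choice([-1.0, 1.0], size=n)
        est += np.outer(apply_Phi(u), u)
    return est / n_samples
```

The estimation error decreases like the inverse square root of the number of perturbation samples.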
5.3. Operations with Φ and its transpose
Algorithm 5.1 allows one to store
The product
Eq. (33) allows one to perform
Similarly, computation of
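A sketch of such matrix-free operations (our function names): by storing only the sample pairs (u_k, v_k = Φu_k) produced by the perturbations, one can apply the estimate of Φ and of its transpose to any vector without ever forming the matrix:

```python
import numpy as np

def make_samples(apply_Phi, n, n_samples, seed=0):
    """Store Phi implicitly as sample pairs (u_k, v_k = Phi u_k)."""
    rng = np.random.default_rng(seed)
    U = rng.choice([-1.0, 1.0], size=(n, n_samples))
    V = np.column_stack([apply_Phi(U[:, k]) for k in range(n_samples)])
    return U, V

def apply_est(U, V, x):
    """Phi x ~ (1/N) sum_k v_k (u_k^T x): the action of the
    sample-average estimate (1/N) V U^T, without forming it."""
    return V @ (U.T @ x) / U.shape[1]

def apply_est_T(U, V, x):
    """Phi^T x ~ (1/N) sum_k u_k (v_k^T x)."""
    return U @ (V.T @ x) / U.shape[1]
```

Each application then costs two tall-skinny matrix-vector products instead of a full dense one.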
5.4. Estimation of decomposition of Φ
Let
under the constraint
Under the condition (C), the optimization problem is formulated as
where
where
Theorem 3.1 of Hoang and Baraille [19] implies that
5.4.1. Decomposition algorithms
Let the elements of
5.4.2. Iterative decomposition algorithm
Another way to decompose the matrix
Algorithm 5.2
At the beginning let
Its solution is denoted as
and stop. Otherwise, go to
From Theorem 3.2 of Hoang and Baraille [19], the couple
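As a generic illustration of an iterative two-factor decomposition (alternating least squares, shown here on an explicit matrix estimate for clarity; this is not the specific Algorithm 5.2 of [19]), one can alternate between solving for the two factors in turn:

```python
import numpy as np

def als_factorize(Phi_est, r, iters=100, seed=0):
    """Alternating least squares: find P (n x r) and Q (m x r)
    minimizing ||Phi_est - P Q^T||_F, by fixing one factor and
    solving a linear least-squares problem for the other."""
    rng = np.random.default_rng(seed)
    n, m = Phi_est.shape
    Q = rng.standard_normal((m, r))
    for _ in range(iters):
        P = Phi_est @ np.linalg.pinv(Q.T)    # fix Q, solve for P
        Q = Phi_est.T @ np.linalg.pinv(P.T)  # fix P, solve for Q
    return P, Q
```

Each half-step can only decrease the residual, so the iteration can be stopped once the fit no longer improves, mirroring the stopping test in Algorithm 5.2.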
6. Numerical example
Consider the matrix
The singular values and the right SVs for
Element | | | | |
---|---|---|---|---|
(1,1) | 5.00 | 4.677 | 4.914 | −0.237 |
(1,2) | 7.00 | 7.07 | 7.662 | −0.592 |
(2,1) | −2.00 | −2.041 | −1.08 | −0.962 |
(2,2) | −4.00 | −4.081 | −1.683 | −2.398 |
First, we apply Algorithm 5.1 to estimate the matrix
Next Algorithm 5.2 (Iterative Decomposition Algorithm) has been applied to estimate the decomposition of the matrix
The different OPs are shown in Table 2 where
Perturbations | Vector | Predictor | Amplification |
---|---|---|---|
 | | | 9.676 |
 | | | 0.62 |
 | | | 3 |
 | | | 2 |
 | | | 9.487 |
 | | | 2 |
 | | | 9.22 |
 | | | 3 |
 | | | 9.674 |
 | | | 9.472 |
Looking at the first OPs, one sees that
In Table 1, we show the results obtained by Algorithm 5.2 after two consecutive iterations (matrix estimation in
The estimates, resulting from the first iteration, are the elements of
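The "true value" column of Table 1 suggests that the test matrix has rows (5, 7) and (−2, −4); this is our inference, since the matrix display itself was lost. A quick check indeed reproduces the leading amplification factors of Table 2:

```python
import numpy as np

# Test matrix inferred from the true-value column of Table 1
Phi = np.array([[5.0, 7.0],
                [-2.0, -4.0]])

# Singular values: amplification factors of the right SVs
sigmas = np.linalg.svd(Phi, compute_uv=False)
print(np.round(sigmas, 3))        # -> [9.676 0.62]

# Eigenvalue moduli: amplification factors of the EVs
moduli = np.sort(np.abs(np.linalg.eigvals(Phi)))[::-1]
print(np.round(moduli, 3))        # -> [3. 2.]
```

These match the values 9.676, 0.62 and 3, 2 listed in Table 2 for the SV- and EV-based perturbations.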
7. Assimilation in high-dimensional ocean model MICOM
7.1. Ocean model MICOM
To see the impact of the optimal SchVs in the design of filtering algorithms for HdS, in this section we present the results of an experiment with the Hd ocean model MICOM (Miami Isopycnic Coordinate Ocean Model). This numerical experiment is identical to that described in Hoang and Baraille [15]. The model configuration is a domain situated in the North Atlantic from 30°N to 60°N and 80°W to 44°W; for the exact model domain and some main features of the ocean current produced by the model, see Hoang and Baraille [15]. The system state
7.1.1. Data matrix based on dominant Sch-Ops
The filter is a reduced-order filter (ROF) with the variable
7.1.2. Data matrix based on SSP approach
The second data matrix
Figure 2 shows the evolution of estimates for the gain coefficients
The same pictures are obtained for the estimates
7.2. Performance of different filters
In Table 3, the performances of the three filters are displayed. The errors are the averaged (spatially and temporally) rms of PE for the SSH and for the two velocity components
rms | CHF | PEF (SCH) | PEF (RAN) |
---|---|---|---|
ssh(fcst) | 7.39 | 5.09 | 4.95 |
u(fcst) | 7.59 | 5.36 | 5.29 |
v(fcst) | 7.72 | 5.73 | 5.64 |
The results in Table 3 show that the two filters PEF (SCH) and PEF (SSP) are practically of the same performance, and their estimates are much better than those of the CHF, with a slightly better performance for the PEF (SSP). We note that as the PEF (SCH) is constructed on the basis of an ensemble of samples tending to the first dominant SchV, its performance should theoretically be better than that of the PEF (SSP). The slightly better performance of the PEF (SSP) (compared to that of the PEF (SCH)) may be explained by the fact that the best theoretical performance of the PEF (SCH) can be obtained only if the model is linear and stationary and the number of PE samples in
To illustrate the efficiency of adaptation, in Figure 3 we show the cost functions (variances of innovation) resulting from the three filters PEF (SCH), PEF (SSP), and AF (i.e., the APEF based on the PEF (SCH); the same performance is observed for the AF based on the PEF (SSP)). Undoubtedly, the adaptation considerably improves the performance of the nonadaptive filters.
8. Concluding remarks
We have presented in this chapter different types of OPs: deterministic, stochastic, and OPs in invariant subspaces of the system dynamics. The ODPs and OSPs play an important role in the study of the predictability of the system dynamics as well as in the construction of optimal OFS for environmental geophysical systems.
Another class of perturbations, known as SSP, is found to be a very efficient tool for solving optimization and estimation problems, especially those involving Hd matrices, and for computing the optimal perturbations.
The numerical experiments presented in this chapter confirm the important role of the different types of OPs in the numerical study of Hd assimilation systems.
References
1. Myhrvold N. Moore’s Law Corollary: Pixel Power. New York Times. June 7, 2006 [Retrieved: 2011-11-27]
2. Kalman RE. A new approach to linear filtering and prediction problems. Transactions of the ASME-Journal of Basic Engineering. 1960;82:35-45
3. Lorenz EN. Deterministic nonperiodic flow. Journal of the Atmospheric Sciences. 1963;20:130-141
4. Lorenz EN. Atmospheric predictability experiments with a large numerical model. Tellus. 1982;34:505-513
5. Palmer TN, Barkmeijer J, Buizza R, Petroliagis T. The ECMWF ensemble prediction system. Quarterly Journal of the Royal Meteorological Society. 1997;4(4):301-304
6. Toth Z, Kalnay E. Ensemble forecasting at NMC: The generation of perturbations. Bulletin of the American Meteorological Society. 1993;74:2317-2330
7. Hoang HS, De Mey P, Talagrand O, Baraille R. A new reduced-order adaptive filter for state estimation in high dimensional systems. Automatica. 1997;33:1475-1498
8. Hoang HS, Talagrand O, Baraille R. On the design of a stable adaptive filter for high dimensional systems. Automatica. 2001;37:341-359
9. Golub GH, Van Loan CF. Matrix Computations. 2nd ed. Baltimore-London: Johns Hopkins University Press; 1993
10. Hoang HS, Baraille R, Talagrand O. On the stability of a reduced-order filter based on dominant singular value decomposition of the system’s dynamics. Automatica. 2009;45(10):2400-2405. DOI: 10.1016/j.automatica.2009.06.032
11. Doucet A, Johansen AM. A tutorial on particle filtering and smoothing: Fifteen years later. Technical report. Department of Statistics, University of British Columbia; 2008
12. Evensen G. The ensemble Kalman filter: Theoretical formulation and practical implementation. Ocean Dynamics. 2003;53:343-367
13. Kalnay E. Atmospheric Modeling, Data Assimilation and Predictability. Cambridge: Cambridge University Press; 2002
14. Lanczos C. An iteration method for the solution of the eigenvalue problem of linear differential and integral operators. Journal of Research of the National Bureau of Standards. 1950;45:255-282
15. Hoang HS, Baraille R. Prediction error sampling procedure based on dominant Schur decomposition. Application to state estimation in high dimensional oceanic model. Applied Mathematics and Computation. 2011;218(7):3689-3709. DOI: 10.1016/j.amc.2011.09.012
16. Spall JC. Accelerated second-order stochastic optimization using only function measurements. In: Proceedings of the 36th IEEE Conference on Decision and Control; 1997. pp. 1417-1424
17. Hoang HS, Baraille R. Stochastic simultaneous perturbation as powerful method for state and parameter estimation in high dimensional systems. In: Baswell AR, editor. Advances in Mathematics Research. Vol. 20. New York: Nova Science Publishers; 2015. pp. 117-148
18. Hoang HS, Baraille R. On the efficient low cost procedure for estimation of high-dimensional prediction error covariance matrices. Automatica. 2017;83:317-330
19. Hoang HS, Baraille R. A simple numerical method based simultaneous stochastic perturbation for estimation of high dimensional matrices. Multidimensional Systems and Signal Processing. 2018. DOI: 10.1007/s11045-018-0551-y
20. Daley R. The effect of serially correlated observation and model error on atmospheric data assimilation. Monthly Weather Review. 1992;120:165-177
21. Cooper M, Haines K. Altimetric assimilation with water property conservation. Journal of Geophysical Research. 1996;101:1059-1077