Open access peer-reviewed chapter - ONLINE FIRST

# Adaptive Filter as Efficient Tool for Data Assimilation under Uncertainties

By Hong Son Hoang and Remy Baraille

Submitted: November 27th 2019Reviewed: March 19th 2020Published: July 10th 2020

DOI: 10.5772/intechopen.92194

## Abstract

In this contribution, the problem of data assimilation as state estimation for dynamical systems under uncertainties is addressed. This emphasize is put on high-dimensional systems context. Major difficulties in the design of data assimilation algorithms is a concern for computational resources (computational power and memory) and uncertainties (system parameters, statistics of model, and observational errors). The idea of the adaptive filter will be given in detail to see how it is possible to overcome uncertainties as well as to explain the main principle and tools for implementation of the adaptive filter for complex dynamical systems. Simple numerical examples are given to illustrate the principal differences of the AF with the Kalman filter and other methods. The simulation results are presented to compare the performance of the adaptive filter with the Kalman filter.

### Keywords

• innovation process
• minimum prediction error
• simultaneous perturbation stochastic approximation
• filter stability

## 1. Introduction

In this chapter, the adaptive filter (AF) is considered as a computational device that yields estimates of the system state by minimizing recursively (in time) the error between the predicted output of the device and its observed signal in real time. As the main objective of the AF is to produce estimates of the state in high-dimensional systems (HdSs), we shall focus the attention on the mathematical form of the AF in a state-space form as that used in the Kalman filter (KF) [1]. In this chapter, the HdS is referred to as a system whose state dimension is of order O107O108.

The assimilation problem in this chapter is formulated as a standard filtering problem. For simplicity, let the dynamical system be described by the equation

xt+1=Φxt+wt,x0=x0,t=0,1,E1

where xtis the system state at the ttime instant. At each time instant t, we are given the observation for the system output

zt+1=Hxt+1+vt+1,t=0,1,E2

In (1) and (2), wtis the model error (ME), vtis the observation error (ObE), and Φrepresents the system dynamics. In general, the system (1) and (2) may be nonlinear with Φx=fx, Hx=hx. The filtering problem for a partial observed dynamical system (1) and (2) is to obtain the best possible estimate for the state xtat each instant t, given the set of observations Z1:t=z1zt.

There exist different techniques to solve estimation problems. The simplest approach is related to linear estimator [2], since it requires only first two moments. Linear estimation is frequently used in practice when there is a limitation in computational complexity. Among others, the widely used methods are maximum likelihood, least squares, method of moments, the Bayesian estimation, minimum mean square error (MMSE), etc. For more details, see [3].

There are limitations of optimal filters. In practice, the difficulties are numerous: the statistics of signals which may not be available or cannot be accurately estimated; there may not be available time for statistical estimation (real-time); the signals and systems may be non-stationary; memory required and computational load may be prohibitive. All these difficulties become insurmountable, especially for HdSs.

In order to deal with real-time applications, the AFs appear to be a valuable tool in solving estimation problems when there is no time for statistical estimation and when we are dealing with non-stationary signals and/or systems environment. They can operate satisfactorily in unknown and possibly time-varying environments without user intervention. They improve their performance during operation by learning statistics from current signal observations. Finally, they can track variations in the signal operating environment [4].

It is well-known that the MMSE estimator in the class of Borel measurable (with respect to (wrt) Z1:t) functions is given by the conditional mean

x̂t=Ext/Z1:tE3

Under standard conditions, related to the noise sequences wt,vt(Gaussian i.i.d.—identically independent (temporal) distributed), the estimate (3)x̂tfor xtcan be obtained from the KF in the recursive form

x̂t+1=x̂t+1/t+t+1,E4
x̂t+1/t=Φx̂t,ζt+1=zt+1x̂t+1/tE5
Kt=MtHTHMtHT+R1E6
Mt+1=ΦPtΦT+QE7
Pt=IKtHMtIKtHT+KtRKTtE8

In (4)(8), Q,Rare the covariance matrix for wtand vt, respectively. One sees that K=KM—the gain matrix, is a function of MMt+1—the error covariance matrix (ECM) for the state prediction error (PE) et+1/twhich is defined as et+1/txt+1x̂t+1, and ζt+1is known as innovation vector. Note from (7) and (8) that the ECM Mt+1can be found by solving the matrix nonlinear Algebraic Riccati equation (ARE). Generally speaking, a solution of the ARE is not unique. Conditions must be introduced for ensuring an existence of a unique non-negative definite solution [5]. It is remarkable that the ECMs P,Min (7) and (8) do not depend on observations; therefore, they can be computed in advance, offline, given the system matrices and the noise covariances. The same remark is valid for the gain matrix Kin (6). In contrast, the gain in the AF is observation-dependent [6] (see Section 2).

Under the most favorable conditions (perfect knowledge of all system parameters and noise statistics), for a dynamical system with dimension of order 107108, it is impossible to solve the ARE (due to computational burden), not to say about storing M,P. To overcome these difficulties, the AF is proposed. Mention that the KF is also an MMSE filter in the complete Hilbert space of random variables, Eξ2<, with scalar product ξ1ξ2H=E<ξ1ξ2>and the norm ξ=Eξ21/2.

For the nonlinear models, there are KF variants, among those are the extended KF (EKF) [7], the unscented KF (UKF, [8]), and the Ensemble Kalman filter (EnKF, [9]). In the EnKF, the ECM is a sampled ECM whose samples are generated using samples of the state variable, and consequently the ECM in the KF becomes a sampled ECM. For an example of application of the EnKF for data assimilation in geophysical data assimilation with high dimensional model, see [10]. Another class of ensemble filtering technique is a class of particle filters (PF, [11]). The basic idea of the PF (also the EnKF) is to use a discrete set of weighted nparticles to represent the distribution of xt, where the distribution is updated at each time by changing the particle weights according to their likelihoods.

Despite a possible implementation of the KF variants, they might still be seriously biased because the accuracy of the KF update requires linearity of the observation function and Gaussianity of the distribution of system state xt. In reality, the KF (4)(8) may be biased and unstable, even divergent [12]. Today, the PF algorithms are ineffective for HdS data assimilation.

In this chapter, we shall show how the AF can be efficient in dealing with uncertainties existing in the filtering problem (1) and (2). In Section 2, a brief outline of the AF is given. The main features of the AF, which are different to those of the KF, are presented. This concerns the optimal criteria, stabilizing gain structure, optimization algorithms. Section 3 shows in detail how the AF is capable of dealing with uncertainties in the specification of ME covariance. The hypothesis on a subspace of ME is presented in Section 4 from which one sees how one can make order reduction for representing the bias and ME covariance. Simple numerical examples on one- and two-dimensional systems are given in Section 5 to illustrate in details the differences between the AF and the traditional KF. Numerical experiments on low and high dimensional systems are given in Section 6 to demonstrate how the AF algorithm works. The performance comparison between the AF and KF, for both situations of perfect knowledge of ME statistics and that with ME uncertainties, is also presented. Conclusions and perspectives of the AF are summarized in Section 7.

The AF is originated from [13]. It is constructed for estimating the state of a dynamical system based on partially observed quantities related in some way to the system state. As reported before, for linear systems contaminated by Gaussian noise, the MMSE estimate can be obtained by the KF. Since publication of [1] in 1960, an uncountable number of works are done for solving engineering problems by KF, in all engineering fields, as well as many modifications have been proposed. The reasons for the need in modification of the KF are numerous, but mostly related to nonlinear dynamics, parameter uncertainties in specification of system parameters, bias of ME, unknown statistics of ME, model reduction. With the rapid progress of computer technology (computer power, memory, …), various simplified versions of KF are suggested for solving filtering (or data assimilation) for HdSs, in particular, in meteorology and oceanography.

Direct application of the KF to HdSs is impossible due to the limit in computer power, memory, and computational time. In particular, the KF requires to solve the matrix AREs (7) and (8) for computing ECMs Mt,Pt. Storing such matrices is impossible, not to say on computational time.

Different simplified approaches are proposed for overcoming difficulties in the application of the KF. The example of successful tool for solving data assimilation problems in HdSs is the EnKF [9]. In the EnKF, an ensemble of error samples, of small size, is generated on the basis of model states to approximate the ECMs. In practice of data assimilation for HdSs, it is possible to generate only ensembles of moderate sizes (of order O100) by model integrations over the assimilation window (time interval between two successive arrivals of observations) since one such integration takes several hours! The other approach like PF is based on sampling from conditional distributions. Theoretically, this approach is more appropriate for nonlinear problems because no linearization is required as in the EKF (Extended KF based on linearization technique). However, even for filtering problems with state dimensions of order O10, relatively large ensembles of size O10000would be required in order to produce reasonably good performance.

The AF in [13] is based on the different idea. Here, no linearization is required for nonlinear filtering problems. For the problem (1) and (2), the filter is of the form (4) and (5) but the gain K=Kθis assumed to be of a given stabilizing structure [6]. It means that Kis parametrized by some vector of unknown parameters θΘso that the filter (4) and (5) with the gain Kθ,θΘis stable. It is well-known that under mild conditions, the solution of the ARE will tend (quickly) to stationary solution Mand so the gain (6), to the stationary gain K. Moreover, the innovation ζt+1=zt+1ẑt+1/t, representing the error for the output predictionẑt+1/tHx̂t+1/t=HΦx̂t, is unbiased and of minimum variance. This fact leads to the idea to seek the optimal vector θby minimizing

JθEΨζargminθE9

here E.denotes the average in a probabilistic sense. For stationary systems (1) and (2), if we assume the validity of the ergodic hypothesis, the average value in a probabilistic sense, expressed in (7), is almost everywhere equivalent to the time average (for large time of running the dynamical system). The optimal θcan be found by solving the equation

θJθ=θEΨζ=EθΨζ=0E10

A stochastic approximation (SA) algorithm for solving (10) can be written out

θt+1=θtγtθtΨζt+1E11

Conditions related to the sequence of positive scalar γtfor ensuring a convergence θtin the procedure (11) are

γt>0,t=0γt=,t=0γ2t<E12

One of the most advantages of the SA algorithm (11) is that, instead of computing the gradient of the cost function (9) (which requires knowledge of probability distribution), the algorithm (11) is based on the knowledge of only the gradient of sample cost function Ψ(wrt to θ) which can be easily evaluated numerically.

Comment 2.1. Generally speaking, the convergence rate of the algorithm (11) and (12) is O1/t. It is possible to improve the convergence rate of the SA by averaging of the iterates,

θmt=1tt=1tθtE13

For more details, see [14].

Comment 2.2. For high HdSs, even with θbeing of moderate dimension, instead of the algorithm (12) or (13), the SPSA (Simultaneous Perturbation Stochastic Perturbation) algorithms in [15, 16] are of preference. That is due to the fact that integration of HdS over the assimilation window is very expensive. These algorithms generate stochastic perturbation δθ=δθ1δθnTwith components as Bernoulli i.i.d. realizations. Each ithcomponent of the gradient-like (pseudo-gradient) vector is computed as the divided difference δΨ/δθiwhere ΨΨθ+δθΨθ. This allows to evaluate the gradient-like vector by only two or three time integration of the numerical model.

For details on the SPSA algorithm and its convergence rate, see [15, 16].

## 3. Covariance uncertainties and AF

### 3.1 Covariance uncertainties

As seen from (11) and (12), implementation of SA algorithms is much simpler for searching optimal gain parameters compared to the other optimization methods. The SA algorithms require only numerical computing derivatives θΨζt+1of the sample cost function Ψζt+1wrt θevaluated at θtand γtis a scalar which can be chosen a priori, for example, as γt=1t. That is possible due to introducing the ergodic hypothesis on of the system (1) and (2) from which there exists an asymptotic optimal gain

First, consider the situation when the vector of parameters consists of are all elements of K,θ=K.Compute the innovation vector,

ζt+1=zt+1ẑt+1/t=zt+1HΦx̂t
x̂t=x̂t1/t+Kθζt
δθζt+1=δθx̂t=δθKθζt=δKζt.
δKΨζt+1=δK<ζt+1,ζt+1>=2<ζt+1,δKζt+1>
=2<ζt+1,δKζt>=2<ΦTHTζt+1,δKζt>.

Let us compute derivatives of the sample cost function Ψwrt the elements Kijof the gain K. To do so, one needs to integrate the adjoint operator ΦTs.t. the forcing fHTζt+1which yields ψΦTHTζt+1and hence

δΨ/δKij=ψiζtj,E14

here ψiis theithcomponent of ψ,ζtjthe jthcomponent of ζt. The AF now takes the form

x̂t+1=x̂t+1/t+t+1,E15
x̂t+1/t=Φx̂t,ζt+1=zt+1Hx̂t+1/t,E16
Kt+1=KtγtKΨζt+1,E17

where KΨζt+1is the gradient vector whose components are computed by (14). In the AF (15)(17), no matrix ARE (see (7) and (8) in the KF) is involved. The AF (15)(17) is quite realizable for HdSs, since at each assimilation instant we need to integrate only the direct model to produce the forecast (16) and (eventually) an adjoint model over the assimilation window for computing KΨζt+1[13].

#### 3.1.2 Simultaneous perturbation stochastic approximation (SPSA) approach

Remark that in the form (14) the adjoint operator ΦTwould be available to implement the AF. It is well-known that construction of numerical code for ΦTis a very difficult and heavy task, especially for meteorological and oceanic numerical models which are HdSs and nonlinear (linearization is required).

A comparison study of the AF with other assimilation methods is done in [17]. Compared to the AF, the widely used variational method (VM) minimizes the distance between the observations available (for example, the observations of the whole set Z1:T) and the outputs of the dynamical system. This optimization problem is carried out in the phase space, hence is very difficult and expensive. Theoretically, a simplification is possible subject to (s.t.) the condition of linearity of the dynamical system: in this case, one can reformulate the VM minimization problem as searching the best estimate for the initial system state x0. For HdSs, to ensure a merely high quality estimate for x0, it is necessary: (i) to take the observation window as large as possible; (ii) to parameterize the initial state by some parameters (using a slow manifold, for example). Iterative minimization procedures require usually O10iterates involving integrating the direct and adjoint models over the window 1T. For an unstable dynamics, integration of direct and adjoint equations over a long period naturally amplify the initial errors during assimilation process. For a more detailed comparison between the AF and VM, see [17].

Thus, if the ergodic conditions hold, there exists an optimal stationary gain and the AF in limit will approach to the optimal one in the given class of stable filters. It is important to emphasize that up to this point, no covariance matrices Q,Rare specified. It means that the AF in the form (12) is robust to uncertainties in the specification of the covariances of the ME and ObE.

### 3.2 Stability of the AF

One of the main features of the AF is related to its stability.

For simplicity of presentation, in the previous section, the AF algorithm is written out under the assumption (13). In practice, application of the AF in the form (13) is not recommended since instability may occur. It is easy to see that the transition matrix of the filter is given by

L=IKHΦE18

It is evident that if we do not take care on the structure of K, varying stochastically all elements of Kcan lead to instability of Land the filter will be exploded. Moreover, for HdSs, the number of elements of Kis very large. It is therefore primordial to choose a parametrized stabilizing structure for K(depending on θ) to ensure a stability of Land reducing a number of tuning parameters. This question is addressed in [6]. One of possible structures for Kis of the form

Kt=PrΘKe,KeMeHeTHeMeHeT+αI1,HeHPrE19

where PrRnxris a matrix with dimensions n×r, ris the dimension of the reduced space (equal to the number of unstable eigenvectors (EiVecs)of Φ), the matrix Meis a strictly positive symmetric definitive playing the role of the ECM in the reduced space Re, Θis a diagonal matrix with diagonal elements θiwhose values belong to ϵ2ϵ, i.e. θiϵ2ϵ, with ϵ01whose value depends on the modulus of the first stable eigenvalue (EiV) [6]. We will refer to the filter s.t. (19) with Θ=Id(Idis the identity matrix of appropriate dimension) as a nonadaptive filter (NAF). In the AF, the parameters θiare adjusted each time when a new observation arrives, to minimize the cost function (9). Thus θiis a time-varying function. As to the matrix Pr, its choice is important to ensure a filter stability. One simple and efficient procedure (called Prediction Error Sampling Procedure—PeSP) to generate Pris to use the power orthogonal iteration method [18] which allows to compute real leading Schur vectors (SchVecs) of Φi. The advantage of using the SchVecs compared to the EiVecs, is that they are real and their computation is stable. It is seen that the optimal AF is found in a class of stable filters which is stable even for an unstable numerical model. As to the VM, the optimal trajectory is found on the basis of only the numerical model with the initial state as a control vector. It means that for unstable dynamics, the errors in the forcing or numerical errors arising during computations will be amplified and lead to large estimation error growth. More seriously, the VM requires a large set of observations and large number of iterations (i.e., many forward and backward integrations of the direct and adjoint models) which naturally leads to increase of estimation error too.

### 3.3 On improving the initial gain

Consider the gain structure (19). Suppose that Mehas been chosen in agreement with the required stability conditions. Before tuning the parameters θito minimize the cost function (9), remark that stability of the filter is still ensured for the following gain:

Kt=PrΛΘKe,E20

where Λ=diagλ1λr,λk01since then for Γ=ΛΘ=diagγ1γrit implies γt=λtθtϵ2ϵ, where ϵ01. Writing the equation for the filtered error (FE) eftx̂txtone sees that the matrix Lin (18) also represents the transition of the FE eft. It means that it is possible to choose a more optimal initial gain by solving, for example, the minimization problem

JoΛ=LΛ22minΛE21

The problem (21) is solved without using the observations, hence it is offline. Once the optimal Λhas been found, the standard AF is implemented s.t. the filter gain

Kt=PrΘKe,KeΛKeE22

It is seen that using the structure (22) this optimization procedure does not require the information on the ME statistics.

## 4. Joint estimation of state and model error in AF

The previous section shows how the AF is designed to deal with the difficulty in specification of covariances of the ME and ObE. This is done without exploiting a possibility to determine, more or less correctly, a subspace for the ME. If such a subspace can be determined without major difficulties, it would be beneficial for better estimating the AF gain and improving the filter performance. In [19], the hypothesis of the structure of the ME has been introduced and a number of experiments have been successfully conducted.

There is a long history of joint estimation of state and ME for filtering algorithms, in particular, with the bias and covariance estimation. One of the most original approaches, dealing with the treatment of bias in recursive filtering (known as bias-separated estimation—BSE), is carried out by Friedland in [20]. He has shown that the MMSE state estimator for a linear dynamical system augmented with bias states can be decomposed into three parts: (1) bias-free state estimator; (2) bias estimator; and (3) blender. This BSE approach has the advantage that it requires fewer numerical operations than the traditional augmented-state implementation and avoids numerical ill-conditioning compared to the case of bias-separated estimation by filtering technique.

It is common to treat the bias as part of the system state and then estimate the bias as well as the system state. There are two types of ME—deterministic (DME) and stochastic (SME). Generally speaking, a suitable equation can be introduced for the ME. In the presence of bias, under the assumption on constant b, instead of (1) one has

xt+1=Φxt+bt+wt,bt+1=bt,t=0,1,2E23

To introduce a subspace for the variables wt,btthe SME and DME in (23), let

w=Gwρ,b=Gbd
GwRn×nw,GbRn×nd,nnw,nnbE24

Generally speaking, Gw,Gbare unknown, and finding reasonable hypothesizes for them is desirable but not self-evident. In [19], one hypothesis for Gw,Gbhas been introduced (it will be referred to as Hypothesis on model error—HME).

The information on Gw,Gb, given in (25), allows to better estimate the DME band SME wfor improving the filter performance, especially for nb< n, nw< n in a HdS setting. The difficulty, encountered in practice of operational forecasting systems, is that (practically) nothing is given a priori on the space of the ME values. To overcome this difficulty, one simple hypothesis has been introduced in [19]. This hypothesis is postulated by taking into consideration the fact that for a large number of data assimilation problems in HdSs, the model time step δt(chosen for ensuring a stability of numerical scheme and for guaranteeing a high precision of the discrete solution) is much smaller than Δt—the assimilation window (time interval between two successive observation arrivals).

Suppose that Δt=naδtwhere nais a positive integer number.

Hypothesis (on the subspace of ME—HME) [19]. Under the condition that nais relatively large, the ME belongs to the subspace spanned by all unstable and neutral EiVecs (or SchVecs) of the system dynamics Φ.

## 5. Simple numerical examples

### 5.1 One-dimensional system

To see the difference between the AF and the KF in doing with ME uncertainties, introduce the one-dimensional system

xt+1=Φxt+wt,zt+1=hxt+1+vt+1,t=0,1,E25

In (25), Φis the unique eigenvalue (also the singular value) of the system dynamics.

1. For simplicity, let Φ=1,h=1. This corresponds to the situation when the system is neutrally stable. The filter fundamental matrix (18) now is LK=1Kwhich is stable if K02. For the KF gain (4)(8), as Kkft=MkftMkft+Rwe have Kkft01, Mkftis the solution of (7). That is true for any Mkft0, R>0. It means then the KF is stable. Mention that if Q>0always Mkft>0. In general, Kkft=MkftMkft+R+where A+is the pseudo-inverse of A [21].

For the AF, we have in this case Pr=1.Consider the gain KafθPrθKe, where Keis the gain of the form Ke=MeMe+R, Me>0, R>0, Meis constant. We have then for the NAF (θ=1)0<Ke< 1 and Knaf=Ke.

For the AF, the transition matrix (18) reads Lafθ=1θKe. For θ02, Lkfθ01, Kafθ02and the AF is stable. It is evident that there is a larger margin for varying the gain in the AF than that in the KF since Kkft01. One sees that the stationary KF is a member of the class of stable AFs (19). The performance of AF is optimized by solving the problem (9) using the procedure (11) and (12) or SPSA algorithms (Comment 2.2).

2. Let Φ<1, i.e., the system (1) is stable. The results in (i) are valid for the AF structure. In this situation, the filter is stable even for Kaf=0.

3. Let Φ>1—the system (1) is unstable. Consider two situations (a) Φ>1; (b) Φ<1. As before Pr=1.

For Φ>1we have

Φ1KeΦ<θ<Φ+1KeΦE26

In particular, when Φ1, approximately θ02Ke. When QR(that is usually in practice), approximately θ02as in the situation (i). For large Φ1, Φ1KeΦ1Ke(left-hand limit), Φ+1KeΦ1Ke(right-hand limit) and there remains no margin for varying θ(or QR) and Kaf1. It is important to emphasize that as Keis chosen by designer, we can define the interval for varying θif the amplitude of Φis more or less known. In practice, one can vary θϵ2+ϵwith small ϵ>0for Φclose to 1, and with ϵclose to 1 for large Φ.

For Φ<1we have

Φ+1KoΦ<θ<Φ1KoΦE27

It is seen from (27) that when Φ1, approximately θ02Ke. As for the situation Φ1, Φ+1KoΦ1Ko(left-hand limit), Φ1KoΦ1Ko(right-hand limit) when QRapproximately θ1Kohence Kaf1.

It is important to stress that the KF gain is computed on the basis of Qand R(under the condition that the statistics of the initial state will be forgotten as tbecomes large); whereas, the gain of the AF is updated on the basis of samples of the innovation vector. It means that the KF is optimal in the MMSE sense (under the condition of exact knowledge of the required statistics) whereas the AF is optimized during the assimilation process using PE realizations of the system output (innovation vector). The KF gain can be computed in an offline fashion, whereas the AF gain is a function of observation and computed in online.

### 5.2 Two-dimensional system: specification of covariances

#### 5.2.1 Stable filter

To see the role of the correction subspace RPrin ensuring a stability of the AF, let us consider the system (1) and (2) s.t.

Φ=diagΦ11Φ22,H=diag11E28

Consider the AF with the gain (19) s.t. Pr=Id, i.e., two columns of Prare in fact the EiVecs associated with two EiVs Φ11and Φ22. Let us denote the AF gain Kaf=PrΘMHTHMHT+R1=ΘKewith KeMeHTHMeHT+R1which is structurally identical to that of the KF. For the nonadaptive filter Knaf=Keand with the choice Me=diagM11M22, taking into account (28) one gets

Knaf=diagK11K22,Kii=MiiMii+Ri,i=1,2E29
Lnaf=diagl11l22,lii=1MiiMii+RiΦ/ii,i=1,2E30

The filter transition matrix (30) is obtained on the basis of Lnaf=IKnafHΦand the assumption (29). It is easily to see that Lnafhas two EiVs, λi=lii,i=1,2where lijest. the ijelement of Lnaf.

Stability of the filter depends on the condition lii<1, i=1,2. For Mii>0,Ri>0, if Φiiis stable or neutrally stable, i.e. Φii1,i=1,2, we have lii<1. For unstableΦii(Φii>1,i=1,2), the filter is unstable if Φii>Mii+RiRi(situation Φii>1) or Φii<Mii+RiRi(situation Φii<1). These conditions should be taken into account when the EiVs of Φare large.

For the AF gain (19) (Pr=I),

lii=1θiMiiMii+RiΦii,E31

From (31) conditions for lii<1can be obtained as done in Section 5.1 with the one-dimensional system since lii,i=1,2are independent one from another. The length of the interval Iifor varying θidepends on the value of Φii(see (26)).

This example shows that for Pr=Id, it is always possible to construct a stable AF whatever are the EiVs of Φ(stable or unstable). There are some constraints for Mii(they are positive) and for Ri(small positive). Optimality of the AF is obtained by searching recursively (in time) the optimal θiduring assimilation process. Thus, in the AF, a correct specification of ME and ObE statistics (second order) is not important as happens in the KF.

#### 5.2.2 Unstable filter

Consider the situation when Pris constructed from only one vector. Let Pr=10T—the EiVec associated with Φ11(the results remain the same if we choose Pr=01T—the EiVec associated with Φ22).

Φ=diagΦ11Φ22,Φ11>1,Φ22>1,H=diag11E32

We show now that the filter with the gain (19) is unstable. We have (for Θ=Id),

He=HPr=Pr,K=PrKe,Ke=MeHeTHeMeHeT+R1=MePrPrMePrT+αI1E33

if we put R=αI. For Lnaf=IKHΦ, taking into account (32), (33) it implies

Lnaf=αΦ11me+α00Φ22E34

As αme+αcan be made as small as desired by choosing small α>0, the first EiV l11=αΦ11me+αcan be made stable. However, the second EiV in (34)l22=Φ22>1is unstable. It implies that the filter with the gain (19) s.t. Pr=10Tis unstable. This happens even for ΘId. It means that when the projection subspace RPrdoes not contain all unstable and neutral EiVecs of the system dynamics, it is impossible to guarantee a stability of the filter.

### 5.3 Two-dimensional system: estimation of ME

Consider the filtering problem (1) and (2), the dynamical system (1) describes a sequence of system states at time instants t=0,1,when the observations are available. It means that Φrepresents the transition of system state over the (observation) time window Δtseparating arrivals of two successive observations. In practice, the interval Δtis much larger than the model time stepδtwhich is the step size in approximating the temporal derivative. The choice of δtis important for guaranteeing a stability of discretized scheme and having high is important for guaranteeing a stability of discretized scheme and having high precision of the discretized solution (wrt the continuous solution). We have then ΔT=naδt, where nais a relatively large positive integer. For example, in the HYCOM model at SHOM (French marine) for the Bay of Biscay configuration, the interval Δtbetween two observation arrivals is 7 days which is equivalent to integrating 1200 model time steps δt. It means na=1200. Symbolically we have then the equations for model time step integration

xτ+1=Φxτ+ψτ,ψτbτ+wτE35

In (35), Φrepresents the integration of numerical model over one model time step δt. Hence

Φ=ΦnaE36

The contribution of ψτ,over the assimilation window t1t(for simplicity and without loss of generality, one supposes t10,tn_a) is

ψt=bt+wt,btτ=0naν1τ1,ν1τΦna1τbτ,
wtτ=0naν2τ,ν2τΦna1τwτ,Φ10E37

The HME in Section 4 says that the SME wtand DME bt, as functions of na, belong to the subspace spanned by leading EiVs (or SchVecs) of Φfor a relatively large na. The initial filtering problem now has the form (1) and (2) s.t. (36) and (37) where t=τ/na.

To illustrate this HME, continue the two-dimensional system in Section 5.2.2 and suppose that Φ11>1Φ22<1. Applying HME in this case is equivalent to saying that the values of MEs bt,wt, approximately, belong to the subspace Ru1spanned by the first EiVec u1=col10, associated with the EiV Φ11. Here y=coly1yndenotes the vector-column with components y1,,yn. It follows that the covariance matrix of wtis assumed to be of the form Q=σw2u1u1Tand bt—of the structure bt=cu1, cis a scalar to be estimated. For the algorithm of joint estimation of state and bias (in term of c), see [19].

## 6. Simulation results

### 6.1 One-dimensional system

In this section, the filtering problem (25) in Section 5.1 is considered s.t.

Φ=0.99,H=1.02,Q=0.09,R=0.01.

The true system states and observations are simulated using the initial state x0=1and wt,vtare zero mean Gaussian mutually uncorrelated and temporal uncorrelated sequences.

To see the performance of the AF, unknown system states are estimated on the basis of the AF algorithm. To obtain a reference, the standard KF is also implemented for solving this filtering problem. In the filtering algorithms, the estimate of the initial state is x̂0=2.The gain Knafin the NAF is taken as that of the KF at t=0, i.e., Knaf=Kkf0.

Figure 1 shows the temporal evolution of the parameters θmtduring assimilation process.

The gains in the KF and AF during the assimilation process are displayed in Figure 2. Mention that the KF gain is computed s.t. true statistics Q,R. In the AF, θmthas been used for computation of the AF gain, i.e., Kaf=θmtK. From Figure 2, one sees that initialized by the same value, the two gains become different during assimilation process. The KF gain has reached a stationary regime very quickly.

The mean temporal RMS (root mean square) of the innovation is shown in Figure 3. It is interesting to remark that no significant difference is observed between two curves and a slightly better performance is produced by the KF.

In Figure 4, we show RMS of the state FE produced by the KF and AF under the condition that the variance Qis known exactly. One sees that the KF, as expected, produces the best results.

Figure 5 shows the RMS of FE as a function of the variance Q. Here, the value of Qvaries from 0.1 to 1.9. Note that the true value of Qis 0.1. The red curve represents the RMS of FE produced by the KF at the end of the assimilation period (as a function of Q). The green curve has the same meaning, but for the FE produced by the AF. It is interesting to note that when Qis correctly specified, the KF behaves better than the AF, but misspecification of Qleads to growing of the error in the KF. The AF is robust wrt the error in the specification of Q. This fact says in favor of the AF as an efficient tool for overcoming uncertainties in the ME.

### 6.2 Two dimensional system

#### 6.2.1 Illustration of hypothesis HME

According to the notations in Section 5.3, consider the two-dimensional system (1) with Φ=Φij,Φ11=1.02,Φ12=0.1,Φ21=0,Φ22=0.9,H=11with the true DME bτ=col0.1,0.1.Thus the first EiV is unstable, the second—stable [19].

Numerically one finds that the first SchVec is equal to u1=17.0E7T.

Figure 6 [19] shows the simulation results obtained on the basis of (37). One sees that, for na>10, the second component of wtis close to 0whereas the first component becomes bigger and bigger (in absolute value) as naincreases. Here, wτis a sequence of independent two-dimensional Gaussian random vectors of zero mean and variance 1. This means that the values of wtbecome more and more close to the subspace Ru1spanned by u1, hence the HME is practically valid for na>10in this example. Mention that, as a rule, in ocean numerical models, nais of order o(100) (na=800or the MICOM model in the experiment in Section 6.3). See also [22].

#### 6.2.2 Assimilation results

First simulation of the sequence of true system states xτ,τ=1,,390has been carried out (see (35)). The observations are picked at τ=15,30,,390. In terms of xt, the filtering problem then is of the form (1) and (2) s.t. t=τ15,Φ=Φ15(if no bias exists). When there is a bias, instead of wtstands ψtbt+wt.

In the experiment, the true system states xtare generated by (35) s.t. bτ=col0.1,0.1,wtis zero mean with the covariance Q=Id. The observation error is of zero mean and covariance R=0.16. For the state xtin the KF and NAF, the forecast is obtained at each assimilation instant tas x̂t+1/t=Φx̂t+b̂t. The simulation yields bt=0.22962.0589E02which results from applying (37) s.t. bτ=col0.1,0.1.

Figure 7 depicts the time evolution of the KF and AF gains. One sees here as in the experiment with 1D system (Figure 2) that the KF gain is stabilized very quickly compared to that of the AF gain.

Figure 8 (from [19]) shows the sample time average RMS of the state FE produced by the three filters NAF, KF, and AF. One sees that the AF outperforms the NAF and KF.

### 6.3 Data assimilation in the high-dimensional ocean model

To illustrate the effectiveness of the AF in dealing with uncertainties in HdSs, this section presents the results on data assimilation in the oceanic numerical model MICOM (Miami Isopycnic Coordinate Ocean Model) [19]. This MICOM describes the oceanic circulation in the North Atlantics. The model has four vertical layers with the state consisting of three variables x=huvwhere his layer thickness, uvare two velocity components. The horizontal grid is 140×180. Totally at each time instant twe have the state xtof dimension 302400140×180×4layers×3variables. For more details on the configuration of this experiment, see [22].

The experiment is carried out on estimating the oceanic circulation using sea surface height (SSH) measurements. The SSH observation is available each 7 days (ds) (hence the observation window ΔT=7ds). Mention that simulating the circulation over 7 ds requires 800model time steps δtintegration.

#### 6.3.1 AF with optimal initial gain

First, in order to examine whether the method of optimal gain initialization, described in Section 3.2, is really useful for improving the filter performance, the optimization problem (21) has been solved. Symbolically, in the gain (20), Pr=IdQGTTwhere Idis the identity operator on the space of layer thickness h, QGis the quasi-geostrophy operator computing the correction for velocity using the SSH innovation ζ. The gain Kecomputes the correction for husing ζ. The ECM M=MvMh—Kronecker product of Mh—ECM of horizontal variable, Mv—ECM with vertical variable (see below for details). The two parameter matrices Θand Λare related to parameterization of Mv. The problem (21) is solved s.t. Θ=Id—identity operator.

The optimal parameters λi,i=1,,4are found by solving the minimization problem (21) using SPSA algorithm. Figure 9 shows the averaged values (see Comment 2.1) of λi,i=1,,4resulting during the optimization process. All λi,i=1,,4are initialized as λi=1,i=1,,4.

The two NAFs are performed, one (denoted as NAFI) is with the gain (20) s.t. Θ=Id, Λ=Id, and the other (denoted by NAFOI) s.t. Θ=Idand λi,i=1,,4obtained by solving (21) (their values are those displayed at the end of the optimization process in Figure 9). The performances of these two NAFs are shown in Figure 10. One sees here that the NAFOI has improved considerably the quality of estimates of the velocity u-component compared with the NAFI. This result justifies that offline optimization (21) is an interesting strategy for finding the optimal initial gain in the NAF.

#### 6.3.2 Estimating the ECM of ME

In practice, for real operational systems, information on the space of ME is not available or very poorly known. Usually, there is a big difference between the model and the real physical process and if the ME statistics are taken more or less properly, in some way, in the filtering algorithm, one can improve the filter performance and reduce the estimation error.

This idea is tested here by applying the HME in Section 4. We carry out the procedure for estimating the ECM of the ME by first constructing the subspace for the ME. For more details on the structure of the ECM Min the AF, see [23]. According to [23], the ECM Mis assumed to be of the structure M=MvMh—the Kronecker product of Mhwith Mvwhere Mhis the ECM of the horizontal variable, Mv—ECM with vertical variable. Figure 11 displays RMS of FE for the uvelocity component at the surface resulting from two AFs. The curve AF0Ucorresponds to the AF whose nonadaptive version has the gain computed on the basis of the ECM Musing an ensemble of PE samples (generated by the PeSP in [18]). The curve AF3Ushows the performance of the AF with the modified ECM (by adding the vertical ME covariance Qvto the vertical ECM Mv). More precisely, Qvis assumed to belong to the subspace spanned by three leading EiVs of Mv. This choice is justified by the fact: the eigenvalue decomposition of Mvhas the first three EiVs with the explained variances 67, 17, 15%, respectively. As the fourth EiVec has only the explained variance 0.7E-07%, it is dropped from the subspace constructed for the vertical ME. The better performance of the AF3U, in comparison with that of the AF0U, is apparently seen in Figure 11.

The above experiment shows in details how, on the basis of HME, the subspace for the ME can be constructed, and how one estimates the ECM for the model error. The superior performance of the AF3Uover that of the AF0Uvalidates the usefulness of the HME which can serve as an important tool for estimating the ME and improving the performance of the AF for solving the data assimilation problems with HdSs.

## 7. Conclusions

One of the key assumptions to ensure the optimal performance of the KF is that a priori knowledge of the system model is given without any uncertainty. This assumption, however, is never valid in practice for dynamical systems under consideration. The uncertainties exist everywhere in modeling a real process like structural uncertainty, model parameterization, model resolution, model bias or ME statistics. For HdSs, order reduction introduced either in the original numerical model or in the filtering algorithms, inevitably leads to uncertainty in the ME, especially in geophysical numerical models.

Our focus in this chapter is to show how the AF solves efficiently filtering problems for systems operating in an uncertain environment.

As seen from this chapter, the AF has proven to be efficient to deal with uncertainties in the specification of the ME statistics, system bias or model reduction. The reasons of the success of the AF are that (i) it belongs to the class of parametrized stable filters; (ii) it is defined as the best member minimizing mean PE for the system outputs; (iii) The tuning parameters are chosen as elements of stabilizing gain and they are of no physical sense.

It is obvious from this chapter that the performance of the AF is comparable with that of the KF when perfect knowledge of all ME statistics is given, and it outperforms the KF in presence of uncertainties. This happens since the AF acquires knowledge during assimilation process, regardless of uncertainties existing in the filtering problems. From the computational point of view, implementation of the AF consumes much less memory and computational time than the KF or other assimilation methods.

Simple numerical examples and simulation results, presented in Sections 5 and 6, clearly demonstrate the advantages gained through application of the AF in dealing with uncertainties. These positive results encourage a wide application of the AF in different fields of technology and applied sciences like automatic control, finance, aerospace, space exploration, meteorology, and oceanography. A more in-depth and significant research on the capacity of the AF to deal with uncertainties is surely a challenge for the near future.

## How to cite and reference

### Cite this chapter Copy to clipboard

Hong Son Hoang and Remy Baraille (July 10th 2020). Adaptive Filter as Efficient Tool for Data Assimilation under Uncertainties [Online First], IntechOpen, DOI: 10.5772/intechopen.92194. Available from: