Open access peer-reviewed chapter

Nonlinear Filtering of Weak Chaotic Signals

By Valeri Kontorovich, Zinaida Lovtchikova and Fernando Ramos-Alarcon

Submitted: April 20th 2017. Reviewed: August 28th 2017. Published: December 20th 2017.

DOI: 10.5772/intechopen.70717

Abstract

In recent years, the application of nonlinear filtering to the processing of chaotic signals has become relevant. A common feature of all nonlinear filtering algorithms is that they operate in an instantaneous fashion, that is, at each cycle, the magnitude of the signal of interest at a single moment of time is processed. This operating regime yields good performance in terms of mean squared error (MSE) when the signal-to-noise ratio (SNR) is greater than one and shows moderate degradation for SNR values no smaller than −3 dB. Many practical applications require detection at smaller SNR values (weak signals). This chapter presents the theoretical tools and developments that allow nonlinear filtering of weak chaotic signals while avoiding the degradation of the MSE when the SNR is rather small. The innovation introduced by this approach is that the nonlinear filtering becomes multimoment, that is, magnitudes at more than one moment of time are involved in the processing. Some other approaches are also presented.

Keywords

• nonlinear filtering
• chaotic systems
• Rossler attractor
• Lorenz attractor
• Chua attractor
• Kalman filter
• weak signals
• mean squared error

1. Introduction

The detection of weak chaotic (stochastic) signals is relevant for applications such as biomedical telemetry [1, 2], seismological signal processing [3], underwater signal processing [4], and interference modeling [5], among others. Effective detection of weak and rather weak chaotic signals (−3 dB or less) is a challenge whose solution can improve, for example, the link budget (communication distance). Among the different approaches to this problem, one can mention techniques such as stochastic resonance [4] and instantaneous spectral cloning [6]. In this chapter, the problem is addressed from the standpoint of nonlinear filtering techniques, which were originally designed to operate with signal-to-noise ratio (SNR) values greater than one or at least rather close to one (with an acceptable slight degradation as the SNR approaches −3 dB) [7]. Far below −3 dB, the performance of the available filtering methods drops sharply and they become ineffective. One possible explanation is that current nonlinear filtering algorithms are “one-moment” in the sense that they operate in an instantaneous fashion: during each operation cycle, they process the magnitude of the received aggregate signal at a single moment of time; in the next cycle, a new instantaneous magnitude is processed, and so on. This is precisely the operation rule for all known optimum algorithms and their quasi-optimum versions, for instance, the extended Kalman filter (EKF) [7], and it can also be found in strategies such as the unscented Kalman filter (UKF), Gauss-Hermite filter (GHF), and quadrature Kalman filter (QKF), among others. One of the goals of this chapter is to describe the detection of weak chaotic signals by applying the principles of noninstantaneous, block-wise filtering, that is, multimoment filtering theory [8], through a real-time implementation in a digital signal processing (DSP) block. Moreover, part of this chapter is dedicated to the conditionally optimum approach to nonlinear filtering, together with some asymptotic methods.

Theoretically, for many cases, the chaos might be represented as an output signal of dissipative continuous dynamic systems (strange attractors) [9]:

$$\dot{x} = f(x,t), \quad x \in R^n, \quad x(t_0) = x_0, \tag{1}$$

where $f(\cdot) = [f_1(x), \dots, f_n(x)]^T$ is a differentiable vector function.

According to the idea of Kolmogorov, the equations for strange attractors (1) can be successfully transformed in the equivalent stochastic form as a stochastic differential equation (SDE) [9, 10]:

$$\dot{x} = f(x,t) + \varepsilon\,\xi(t). \tag{2}$$

The influence of a weak external source of white noise is denoted by $\xi(t)$, and the noise intensities are given in matrix form, $\varepsilon = [\varepsilon_{ij}]_{n \times n}$.

Note that a stationary distribution Wst(x) exists even when the weak white noise component is tending to zero [11, 12, 13].
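As an illustration of model (2), the following minimal sketch integrates a strange attractor perturbed by weak white noise with the Euler-Maruyama scheme. The Lorenz system with its classical parameter values, the step size, the noise level, and the function names `lorenz_drift` and `simulate_sde` are assumptions chosen for the example only; they are not taken from the chapter.

```python
import numpy as np

def lorenz_drift(x, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Drift f(x) of the Lorenz system (classical parameter values, assumed)."""
    return np.array([
        sigma * (x[1] - x[0]),
        x[0] * (rho - x[2]) - x[1],
        x[0] * x[1] - beta * x[2],
    ])

def simulate_sde(drift, x0, eps, dt, n_steps, seed=0):
    """Euler-Maruyama integration of dx = f(x) dt + eps dW, cf. Eq. (2)."""
    rng = np.random.default_rng(seed)
    x = np.empty((n_steps + 1, len(x0)))
    x[0] = x0
    for k in range(n_steps):
        # Increment of the Wiener process over one step of length dt.
        dw = rng.normal(0.0, np.sqrt(dt), size=len(x0))
        x[k + 1] = x[k] + drift(x[k]) * dt + eps @ dw
    return x

# Weak noise acting on the first component only (illustrative choice).
eps = np.diag([0.05, 0.0, 0.0])
path = simulate_sde(lorenz_drift, np.array([1.0, 1.0, 1.0]), eps, 1e-3, 5000)
```

Setting `eps` to the zero matrix recovers a plain Euler integration of the deterministic system (1).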

Nonlinear filtering of chaotic desired signals comes up naturally when SDE (2) is used as model of chaos. This follows straight from the classical theory of nonlinear filtering for Markov processes, proposed more than 50 years ago [14, 15] and extensively developed in subsequent studies [16, 17, 18, 19, 20, 21], although those methods are still under development.

From the practical implementation point of view, the nonlinear filtering strategies are approximate (see the references above). This follows from the fact that, in general, there is no analytical solution for the a posteriori probability density functions when one attempts solving the Stratonovich-Kushner equations (SKE).

In the following, some of the numerous nonlinear filtering approximate approaches that have been developed will be presented.

2. Nonlinear filtering for Markovian processes

Let us assume that filtering of the following received signal is required:

$$y(t) = s(t, x(t)) + n_0(t), \tag{3}$$

where $s(\cdot)$ is a vector function of the “message dependent” desired signal (which is the subject of filtering) of dimension $m$, the received signal is denoted by the vector $y(t)$ (also of dimension $m$), and $n_0$ is a vector of additive white noises characterized by the intensity matrix $N_0$ ($m \times m$). The following SDE is used to model the signal $s(\cdot)$ as an n-dimensional Markov diffusion process [22]:

$$\dot{x} = g(t, x) + \xi(t). \tag{4}$$

Strictly speaking, Eqs. (4) and (2) are the same SDE, with the vector function $g(\cdot)$ substituting $f(\cdot)$ in (2); for (4), $D$ denotes the corresponding matrix of intensities for $\xi(\cdot)$.
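To make the SNR figures used throughout the chapter concrete, here is a small sketch of the observation model (3) for the scalar case $s = x$. It assumes that the SNR is defined as the ratio of the mean signal power to the variance of the noise samples; the helper names `snr_db_to_linear` and `observe` are hypothetical.

```python
import numpy as np

def snr_db_to_linear(snr_db):
    """Convert an SNR given in dB to a linear power ratio (-3 dB is about 0.5)."""
    return 10.0 ** (snr_db / 10.0)

def observe(signal, snr_db, seed=0):
    """Return y = s + n, with the noise variance set by the target SNR.

    Assumption: SNR = mean(signal**2) / noise_variance for sampled data.
    """
    rng = np.random.default_rng(seed)
    noise_var = np.mean(signal ** 2) / snr_db_to_linear(snr_db)
    return signal + rng.normal(0.0, np.sqrt(noise_var), size=signal.shape)

t = np.linspace(0.0, 100.0, 20000)
s = np.sin(t)                 # stand-in for the desired signal
y = observe(s, snr_db=-3.0)   # weak-signal scenario discussed in the text
```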

Under this assumption ([14, 22] and so on), one can use the so-called Fokker-Planck-Kolmogorov (FPK) equation in order to solve the a priori probability density function (a priori PDF), for x(t):

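$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\sum_{i=1}^{n}\frac{\partial}{\partial x_i}\left[g_i(t,x)\,W_{PR}(x,t)\right] + \frac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n}\frac{\partial^2}{\partial x_i\,\partial x_j}\left[D_{ij}\,W_{PR}(x,t)\right], \tag{5}$$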

where WPR(x,t0) = W0(x).

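Eq. (5) can also be rewritten in another form [21, 23]:

$$\frac{\partial W_{PR}(x,t)}{\partial t} = -\operatorname{div}\,\pi(x,t), \tag{6}$$

or

$$\frac{\partial W_{PR}(x,t)}{\partial t} = L_{PR}\{W_{PR}(x,t)\}, \tag{7}$$

where $\pi(x,t)$ is a probabilistic “flow” with components:

$$\pi_i(x,t) = g_i(x,t)\,W_{PR}(x,t) - \frac{1}{2}\sum_{j=1}^{n}\frac{\partial}{\partial x_j}\left[D_{ij}\,W_{PR}(x,t)\right]. \tag{8}$$

In Eqs. (5)–(8), $\{D_{ij}\}$ denote the diffusion coefficients of the Markov process and $\{g_i(x,t)\}_1^n$ are the corresponding drift coefficients; both are understood in the Stratonovich sense [14, 22]. $L_{PR}\{\cdot\}$ denotes the FPK linear operator.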
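As a short worked example of (6) and (8), consider the one-dimensional case with a time-independent drift $g(x)$ and a constant diffusion coefficient $D$ (both simplifications are assumptions made here for illustration). In the stationary regime with zero flow, $\pi(x) = 0$, one gets

$$g(x)\,W_{st}(x) - \frac{D}{2}\frac{dW_{st}(x)}{dx} = 0 \quad\Longrightarrow\quad W_{st}(x) = C\exp\left\{\frac{2}{D}\int g(x)\,dx\right\},$$

where $C$ is a normalization constant. For instance, a linear drift $g(x) = -a x$ with $a > 0$ gives a Gaussian stationary PDF with zero mean and variance $D/(2a)$.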

The integrodifferential equation for the a posteriori probability density function $W_{PS}(x,t)$ is given by either of two equivalent expressions (see [14]):

$$\frac{\partial W_{PS}(x,t)}{\partial t} = L_{PR}\{W_{PS}(x,t)\} + \frac{1}{2}\left[F(x,t) - \int F(x,t)\,W_{PS}(x,t)\,dx\right]W_{PS}(x,t) \tag{9}$$

or

$$\frac{\partial W_{PS}(x,t)}{\partial t} = -\operatorname{div}\,\hat{\pi}(x,t) + \frac{1}{2}\left[F(x,t) - \langle F(x,t)\rangle\right]W_{PS}(x,t), \tag{10}$$

where $\langle F(x,t)\rangle$ denotes the averaging of $F(x,t)$ given by $\langle F(x,t)\rangle = \int F(x,t)\,W_{PS}(x,t)\,dx$, $\hat{\pi}(x,t)$ is the flow (8) with $W_{PR}(x,t)$ substituted by $W_{PS}(x,t)$, and

$$F(x,t) = \left[y(t) - \frac{1}{2}s(x,t)\right]^T N_0^{-1}\left[y(t) - \frac{1}{2}s(x,t)\right]. \tag{11}$$

The combination of Eqs. (9)–(11) is known as the Stratonovich-Kushner nonlinear equations (SKE), and they have an appealing physical sense: the first term in (9) represents the dynamics of the a priori data of $x(t)$, while in the second term, the analysis of the observations drives the innovation of the a priori data.

Using any optimization criterion, one can get $\hat{x}(t)$ (the optimum estimate of $x(t)$), which comes as a solution of (9) when $y(t)$ is the input signal, that is, the filtering of $x(t)$.

Here, one has to note that Eq. (9) turns into the FPK equation (6) if the intensity of the additive noise $N_0$ grows (the first term in (9) becomes dominant), and as a consequence, the filtering accuracy diminishes drastically. In the opposite scenario (large signal-to-noise ratio), $W_{PS}(x,t)$ tends to a unimodal Gaussian PDF [14, 20].

Note that the time evolution of $W_{PS}(x,t)$ is completely described by the SKE but, as mentioned earlier, the SKE does not admit exact analytical solutions in general. There are very few exceptions: the linear SDE (4), which yields the well-known Kalman filtering algorithm [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24], the Zakai approach [25], and so on. For this reason, nonlinear filtering algorithms are practically always approximate. As mentioned before, during almost 50 years of intensive research, the bibliography on nonlinear filtering algorithms has become enormous; in the next section, we will consider only a few of those works, taking into account the following considerations:

• the models applied for filtering of chaos correspond to the equations for Rössler, Chua, and Lorenz strange attractors with n = 3, that is, low dimensional;

• the algorithms for nonlinear filtering have to be of reduced computational complexity in order to satisfy real-time application requirements;

• the algorithms for nonlinear filtering, according to the aim of the material of the chapter, have to be able to perform satisfactorily in scenarios with low or very low signal-to-noise ratios (SNR), although the Gaussian assumption for WPS(x) is not always valid;

• $s(x(t), t) \equiv x(t)$; (12)

• All Dij are equal to zero, except D11 ≅ D1 [11].

2.1. Approximate approaches for nonlinear filtering

For the sake of simplicity, it is “easier” to approximate the a posteriori PDF WPS(x, t) than the nonlinearity at (4) and (9) [16, 17, 19]. In this sense, let us just list some of the approximate approaches for WPS(x, t):

• Integral or global approximations for WPS(x, t) [20];

• Functional approximations for WPS(x, t) [16, 21];

• Higher Order Statistics (HOS) approximations for WPS(x, t), and so on;

• Gaussian approximations: extended Kalman filter (EKF) [14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24]; unscented Kalman filter (UKF) [19]; quadrature Kalman filter (QKF) [17]; iterated Kalman filter (IKF), etc.

It is hardly feasible to give a complete overview of all those methods; moreover, not all of them are adequate, taking into account the observations introduced at the end of the previous section.

Let us start with the extended Kalman filter (EKF): considering $W_{PS}(x,t)$ as a three-dimensional Gaussian PDF $\hat{W}_G(x,t)$, from (9), it is possible to obtain the following equations for the per-component mean estimates $\{\hat{x}_i\}_1^3$ and for the estimates of the elements of the a posteriori covariance matrix $\hat{R}_{ij}$, $i,j = \overline{1,3}$:

where $\tilde{x}_i = x_i - \hat{x}_i$ and $\tilde{x}_j = x_j - \hat{x}_j$.

The matrix form [14, 15, 16, 20] can be used to represent Eq. (13); however, for some specific applications, per-component representation (13) could be more adequate (see the following).

It is reasonable to assume convergence of $\hat{R}_{ij}(t)$ to the stationary values $\bar{R}_{ij}$ when $t \to \infty$, and as a result, the second equation in (13) can be expressed as a system of nonlinear algebraic equations, with standard numerical solutions. This consideration is relevant for real-time scenarios, as it significantly simplifies the implementation of the related EKF algorithms.
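The convergence of the a posteriori variance to a stationary value can be illustrated with a minimal scalar sketch. It uses the textbook continuous-time quasi-linear (EKF) equations for a scalar model with observation $y = x + n$, which is an assumption made for this example and not a transcription of Eq. (13); the names `ekf_step` and `stationary_variance_linear` are hypothetical.

```python
import numpy as np

def ekf_step(k1, k2, y, drift, drift_prime, eps, n0, dt):
    """One Euler step of the scalar quasi-linear (EKF) equations.

    k1: a posteriori mean, k2: a posteriori variance,
    eps: intensity of the process noise, n0: intensity of the observation noise.
    """
    gain = k2 / n0
    k1_new = k1 + dt * (drift(k1) + gain * (y - k1))
    k2_new = k2 + dt * (2.0 * drift_prime(k1) * k2 + eps - k2 ** 2 / n0)
    return k1_new, k2_new

def stationary_variance_linear(a, eps, n0):
    """Positive root of -2*a*k2 + eps - k2**2/n0 = 0 (linear drift -a*x)."""
    return n0 * (-a + np.sqrt(a ** 2 + eps / n0))

a, eps, n0, dt = 1.0, 0.5, 2.0, 1e-3
k1, k2 = 0.0, 1.0
for _ in range(20000):
    # Observation fixed at zero: only the variance dynamics matters here.
    k1, k2 = ekf_step(k1, k2, 0.0, lambda x: -a * x, lambda x: -a, eps, n0, dt)
```

For a linear drift, the variance equation does not depend on the observations, so the stationary value can be precomputed offline from the algebraic equation, which is the simplification referred to above.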

The functional approximation for $W_{PS}(x,t)$ is, as described in [16, 21],

$$W_{PS}(x,t) = \prod_{i=1}^{3}W_{PS}(x_i)\left[1 + \sum_{q=2}^{3}\sum_{j=1}^{q-1}\frac{R_{qj}}{R_{qq}R_{jj}}\,(x_q - \hat{x}_q)(x_j - \hat{x}_j)\right]. \tag{14}$$

From (14), we see that the functional approximation for the PDF is sufficiently non-Gaussian (the marginal $W_{PS}(x_i)$ is arbitrary), but for the “joint” characterization of the vector $\hat{x}$, only the elements of the a posteriori covariance matrix $\hat{R}_{ij}$ are considered.

It can be shown that the equations for $\{\hat{x}_i\}_1^n$ and $\hat{R}_{ij}$ coincide with those in (13), the only difference being that one has to apply in (13) the approximation (14) for $W_{PS}(x,t)$ instead of $\hat{W}_G(x,t)$. The resulting integrals can be solved either through the Gauss-Hermite quadrature formula [17, 18] or analytically.
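The Gauss-Hermite quadrature mentioned above can be sketched in a few lines for the scalar Gaussian case. The function name `gaussian_average` and the quadrature order are illustrative assumptions; `numpy.polynomial.hermite.hermgauss` provides the nodes and weights for the weight function $e^{-t^2}$.

```python
import numpy as np

def gaussian_average(g, mean, var, order=10):
    """Approximate <g(x)> for x ~ N(mean, var) by Gauss-Hermite quadrature."""
    nodes, weights = np.polynomial.hermite.hermgauss(order)
    # Change of variables x = mean + sqrt(2 var) t maps exp(-t^2) to the Gaussian PDF.
    x = mean + np.sqrt(2.0 * var) * nodes
    return np.sum(weights * g(x)) / np.sqrt(np.pi)

second_moment = gaussian_average(lambda x: x ** 2, mean=1.0, var=0.5)
fourth_moment = gaussian_average(lambda x: x ** 4, mean=0.0, var=2.0)
```

With `order` nodes, the rule is exact for polynomial integrands up to degree `2 * order - 1`, so the averages of low-order polynomial drifts are obtained without approximation error.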

The integral or global approximation for $W_{PS}(x,t)$ is another approach to an approximate solution. The experienced reader may have already noticed that the last two approximations for $W_{PS}(x,t)$ can be considered “local,” as they describe the maximum of $W_{PS}(x,t)$ and provide estimates of $\hat{x}_i$ and $\hat{R}_{ij}$.

For conditions of significantly large SNR, this is sufficient, but for low SNR, one has to find a different approach, known as integral approximation. This strategy was suggested as an adequate approximation of WPS(x, t) together with the PDF’s “tails,” that is, for the whole span of x.

Let us suppose that WPS(x, t) can be characterized as:

$$W_{PS}(x,t) = W_{PS}(x, \alpha(t)). \tag{15}$$

Here α is an unknown vector of approximation parameters. As an approximation criterion for PDF, it is possible to use the Kullback measure; thus, one might obtain the following equation for the unknown vector α:

$$\dot{\alpha} = L_{PR}^{+}\{h(x,t)\} + V^{-1}(t)\,h(x,t)\,F(x,t), \tag{16}$$

where $h(x,t) = \frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial \alpha}$, $V(t) = \int \frac{\partial \ln W_{PS}(x,\alpha(t))}{\partial \alpha^{T}}\,W_{PS}(x,\alpha(t))\,dx = \frac{\partial^2 W_{PS}(x,\alpha(t))}{\partial \alpha\,\partial \alpha^{T}}$, and $L_{PR}^{+}$ is the operator adjoint to the FPK operator [22].

Now, as an integral approximation of $W_{PS}(x, \alpha(t))$, let us choose the so-called “Dynkin PDF,” where $\alpha(t)$ is the vector of sufficient statistics for $W_{PS}(\cdot)$:

$$W_{PS}(x,\alpha(t)) = C\exp\left\{\sum_{p=1}^{K}\alpha_p(t)\,\phi_p(x) + \phi_0(x)\right\}, \tag{17}$$

where {φp(x)} are orthogonal multidimensional operators: Laguerre, Hermite, and so on.

One can notice that there is a significant similarity between (17) and the orthogonal series characterization of $W_{PS}(x, \alpha(t))$ [22]: both apply series of orthogonal functions, although in (17), the series is used not for $W_{PS}(x, \alpha(t))$ itself but for its monotonic transform $\ln\{W_{PS}(x, \alpha(t))\}$. So, the coefficients $\{\alpha_p(t)\}$ can be expressed by means of the cumulants of $W_{PS}(x)$ [22]. Thanks to this, instead of searching for a solution of (17), which is hardly possible analytically, one can directly search for equations for the cumulants (HOS) of $W_{PS}(x,t)$ [16, 26].

Here, the HOS approach will be presented because the last problem was addressed in the cited references. It is worth noticing the following: for real-time scenarios when n > 1, equations for HOS and Eq. (16) are significantly complex; for n = 1, both strategies are equivalent [26].
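For readers less familiar with higher order statistics, the sketch below estimates the first four cumulants of a scalar data set from its central moments. It only illustrates what the cumulants are; it is not the filtering algorithm of [16, 26], and the function name `first_four_cumulants` is hypothetical.

```python
import numpy as np

def first_four_cumulants(samples):
    """Sample estimates of the first four cumulants of a scalar data set.

    k1 = mean, k2 = variance, k3 = third central moment,
    k4 = fourth central moment minus 3 * variance**2.
    """
    x = np.asarray(samples, dtype=float)
    k1 = x.mean()
    c = x - k1
    m2, m3, m4 = (np.mean(c ** p) for p in (2, 3, 4))
    return k1, m2, m3, m4 - 3.0 * m2 ** 2

k1, k2, k3, k4 = first_four_cumulants([-1.0, 1.0, -1.0, 1.0])
```

For Gaussian data, the third and fourth cumulants vanish, so nonzero values of `k3` and `k4` indicate how far a PDF is from the Gaussian approximation.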

3. Multimoment filtering of chaos

As follows from the material of Section 2, all the algorithms are “one-moment” in the sense that they operate only with the data at each time instant, that is, they instantaneously track the magnitude of the received aggregate signal at a single moment. As was shown in [27], an adequate filtering algorithm (for the one-moment case) is the extended Kalman filter (EKF).

This choice is more or less expected, due to the experience which is already known from the available references (see above). EKF shows rather good performance for the filtering of chaotic signals: the mean squared error (MSE) is less than 1% when SNR is about −3 dB, and for SNR bigger than −3 dB, the results are much better.

In this regard, a question arises: is it possible to improve this approach in the sense of getting still rather good MSE’s for successively lower thresholds of the SNR with an algorithm of reasonable complexity? The following material attempts to prove that the answer is “yes,” if one can apply some additional information from the received aggregate signal taken on several sequential time instants.

This means that the information has to be processed in a block manner by aggregating data, in our case, over several time instants ([8, 16, 27], and so on). The difference between the following approach and those in the cited references is precisely the way the data aggregated over many time instants are handled: the multimoment algorithms are obtained through the generalization of the Stratonovich-Kushner equations (SKE) to the corresponding multimoment data. Therefore, all the simplifying heuristics applied below to the generalized SKE (GSKE) are not arbitrary but can be taken as generalizations of the heuristics for the “standard” one-moment SKE (see below). This gives hope of achieving the abovementioned improvement of the SNR threshold with less complex tools.

It follows from the fact that, as it was shown in [8] (see also the references therein), the GSKE comes from the same structure as its one-moment prototype. So the way of its simplification (except for the limiting of the number of time instants) in order to get a quasi-optimum algorithm, could be done in a similar way as for the one-moment case: approximation of the a posteriori PDF (characteristic function) in SKE with a minimum set of significant parameters. Moreover, there is an additional way to improve the accuracy of the quasi-optimum solution for the GSKE: assume this quasi-optimum algorithm as a “given structure,” as it was proposed in [16] and also considered in the following.

3.1. Generalization of SKE for the multimoment case

In the same way, as it was underlined earlier, the chaos is “generated” by the equation:

$$\dot{x} = f(x), \quad x \in R^n, \quad x(t_0) = x_0, \tag{18}$$

where $f(\cdot) = [f_1(x), \dots, f_n(x)]^T$ is a differentiable vector function, and it can be considered as a degenerate Markov process obtained from the following stochastic equation:

$$\dot{x} = f(x) + \varepsilon\,\xi(t), \tag{19}$$

where $\xi(t)$ is a vector of “weak” external white noise with the related positive definite matrix of “intensities” $\varepsilon = [\varepsilon_{ij}]_{n \times n}$.

In the following, one can consider both the ordinary differential equation (ODE) (18) and the stochastic differential equation (SDE) (19) when the noise intensities tend equally to zero. Adding the ε term in (19) guarantees the existence of a stationary PDF for x(t) as well, no matter how small the elements of ε might be [28]. So, one can suppose that this stationary PDF, WST(x), is known a priori. For our case in practical sense, one can deal actually only with the stationary PDF, which we assumed is modeled by means of a chaotic process (concretely let us say the first component, x1(t), of certain attractor model). Certainly WST(x1) can easily be obtained from WST(x). If the two PDFs coincide in terms of certain fitness criteria, then only for simplicity in the subsequent developments, the SDE (19) can be substituted by its statistically equivalent one-dimensional SDE with the same WST(x1):

$$\dot{x}_1 = f(x_1) + \varepsilon\,\xi(t), \tag{20}$$

where $f(x_1) = \frac{\varepsilon}{2}\frac{d}{dx_1}\ln W_{ST}(x_1)$, and $\varepsilon$ in (20) can be considered as a “scale factor” that can be chosen by equalizing the average powers of the real $x_1(t)$ and of the solution of (20). Formally, there is no need for all those operations, but without them, the reader would have to follow “multiindex” definitions very carefully: one index for the number of the applied component of the attractor and another index for the time instant, that is, $x_m(t_i)$. This might cause confusion in further developments, as $x_1(t)$ is an observable component whose dynamics depends on other “nonobservable” components. For those reasons, in the following, (20) will be considered as the model of the desired signal for filtering.
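The drift of the equivalent one-dimensional model (20) can be computed numerically once the stationary PDF is available on a grid. The sketch below assumes a uniform grid and uses a Gaussian PDF as a stand-in for $W_{ST}(x_1)$, which in practice would be estimated from the attractor data; the function name `drift_from_stationary_pdf` is hypothetical.

```python
import numpy as np

def drift_from_stationary_pdf(grid, pdf, eps):
    """f(x1) = (eps / 2) * d/dx1 ln W_ST(x1), evaluated on a uniform grid."""
    return 0.5 * eps * np.gradient(np.log(pdf), grid)

# Stand-in stationary PDF: zero-mean Gaussian with variance `var` (assumption).
grid = np.linspace(-3.0, 3.0, 601)
var = 1.5
pdf = np.exp(-grid ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)
f = drift_from_stationary_pdf(grid, pdf, eps=0.2)
```

For the Gaussian stand-in, the result is the linear drift $f(x_1) = -\varepsilon x_1 / (2\sigma^2)$, which gives a quick sanity check of the numerical derivative.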

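Let us introduce the following notation for the time instants (time moments): $t_1 < t_2 < t_3 < \dots < t_n$ and $x_i = x(t_i)$, $i = \overline{1,n}$. Then, $\{x(t_i)\}_1^n$ forms a vector $x(t) = [x(t_1), \dots, x(t_n)]^T$ and $W_n(x,t) \cong W_n(x_1, \dots, x_n; t_1, \dots, t_n)$; $W_n(x,t)$ is the a priori PDF for $x(t)$. As follows from ([16], ch. 5):

$$\frac{\partial W_n(x,t)}{\partial t_i} = L_i W_n(x,t), \tag{21}$$

where $L_i = -\frac{\partial}{\partial x_i}K_1(x_i) + \frac{1}{2}\frac{\partial^2}{\partial x_i^2}K_2(x_i)$ is the FPK operator [16] with $K_1(x_i) = f_1(x_i)$, $K_2(x_i) = \varepsilon^2$. It is easy to show that by consecutive differentiation one can obtain:

$$\frac{\partial^n W_n(x,t)}{\partial t_1 \cdots \partial t_n} = \prod_{i=1}^{n}L_i\,W_n(x,t), \tag{22}$$

$$L_{PR} = \prod_{i=1}^{n}L_i. \tag{23}$$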

Certainly, the adjoint operator [16, 22] for the multimoment case is:

$$\hat{L}_{PR} = \prod_{i=1}^{n}\hat{L}_i, \tag{24}$$

where $\hat{L}_i = K_1(x_i)\frac{\partial}{\partial x_i} + \frac{K_2(x_i)}{2}\frac{\partial^2}{\partial x_i^2}$ is a Kolmogorov operator [22].

Let us then introduce the a posteriori n-dimensional PDF $W_{ps}(y|x,t) \cong W_{ps}(x,t)$ for the multimoment case. Then, formally repeating the development of the SKE, but in this case generalized to the case of $n$ time instants (in the same way as was done in [27]), one can get:

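$$\frac{\partial^n W_{ps}(x,t)}{\partial t_1 \cdots \partial t_n} = L_{PR}\{W_{ps}(x,t)\} + \frac{1}{2}\left[F(x,t) - \int_{R^n}F(x,t)\,W_{ps}(x,t)\,dx\right]W_{ps}(x,t) \tag{25}$$

with $t = [t_1, \dots, t_n]^T$,

$$F(x,t) = \left[y(t) - \frac{1}{2}x(t)\right]^T N_0^{-1}\left[y(t) - \frac{1}{2}x(t)\right], \tag{26}$$

where $y(t) = [y(t_1), \dots, y(t_n)]^T$ is the vector of observations of $\{x(t_i)\}_1^n$ taken from $y(t) = x(t) + n(t)$ and $n(t)$ is the AWGN with intensity $N_0$.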

Analyzing (25) by comparing it with the standard form of the SKE (see Eqs. (9) and (10) in Section 2), one can see that there is a complete “structural” identity. The same holds for the a posteriori cumulants [16, 27], that is:

$$\frac{\partial^n \kappa^{ps}_{r_1,\dots,r_n}(t)}{\partial t_1 \cdots \partial t_n} = (-j)^k\frac{\partial^k}{\partial\lambda_1^{r_1}\cdots\partial\lambda_n^{r_n}}\left[\frac{M\{\hat{L}\exp(j\lambda^T x)\}}{M\{\exp(j\lambda^T x)\}} + \frac{M\{\exp(j\lambda^T x)\,F(x,t)\}}{M\{\exp(j\lambda^T x)\}}\right]_{\lambda=0} \tag{27}$$

where $r_1 + r_2 + \dots + r_n = k$, $k = 1, 2, \dots$.

One can see from (25) and (26) that those algorithms are rather complex for real-time implementation. So, just as for the one-moment SKE, they have to be modified in order to get a quasi-optimum solution.

3.2. Quasi-optimum solutions. Generalized EKF

One has to know that “quasi-optimum” solutions (for any problem) are based on some heuristics and those heuristics have to be reasonable and based on previous experience in solving similar problems. In the case of multimoment filtering, the analogies can be the following (of course implicit considerations for complexity have always to be taken into account):

• The priority will be given to the quasi-linear approximation for nonlinear functions in the same way as it was assumed for the “standard” one-moment filtering.

• All algorithms for block processing show that there is in some sense a reasonable block length for the processed data. Taking into account the complexity limits and that the covariance function of the chaos initially drops rather fast [29], let us take first n = 2.

• The approximation of the a posteriori PDF (characteristic function) has to apply the minimum set of first cumulants; one has to recall that, as the order of the cumulants grows, their significance for the PDF approximation vanishes [22].
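The statement in the list above that the covariance function of chaos drops rather fast can be checked on data with a simple estimator of the normalized autocovariance. The sketch below is generic (any sampled component $x_1(t)$ can be passed in); the function name `normalized_autocovariance` and the alternating test sequence are assumptions made for illustration.

```python
import numpy as np

def normalized_autocovariance(x, max_lag):
    """Biased-free per-lag estimate of the autocovariance, normalized by the variance."""
    x = np.asarray(x, dtype=float)
    c = x - x.mean()
    var = np.mean(c ** 2)
    # For each lag k, average the products of samples separated by k steps.
    return np.array([np.mean(c[: len(c) - k] * c[k:]) / var
                     for k in range(max_lag + 1)])

# Toy input: a strictly alternating sequence, fully anticorrelated at lag 1.
rho = normalized_autocovariance(np.tile([1.0, -1.0], 50), max_lag=2)
```

Applied to a simulated attractor component, the lag at which the returned values become small gives a rough indication of a reasonable spacing between the time instants $t_1$ and $t_2$.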

Taking these observations into account, let us take n = 2, that is, two-moment filtering case, then [16, 22]:

$$\theta_{ps}(\lambda) \approx \exp\left\{\sum_{s=1}^{2}\frac{j^s}{s!}\sum_{r_1,r_2}^{S}\kappa_s(t_1,t_2)\,\lambda^{r_1}\lambda^{r_2}\right\}, \tag{28}$$

and the cumulants are:

$$\kappa^{ps}_{r_1,\dots,r_n}(t) = (-j)^k\left.\frac{\partial^k}{\partial\lambda_1^{r_1}\cdots\partial\lambda_n^{r_n}}\ln\theta_{ps}(\lambda)\right|_{\lambda=0}.$$

Another assumption is that the a posteriori process is supposed to be stationary; then, the one-moment cumulants for $t_1$ and $t_2$ have to be the same, and the only mutual cumulant taken into account might be $\kappa_{11}(t_1, t_2)$. Next, for each of the moments $t_1$ and $t_2$, the one-moment cumulants can be calculated by applying the Gaussian approximation for the a posteriori PDF, and for the two-moment case, the “functional approximation” could be applied. In a rigorous sense, the a posteriori variance $\kappa_2^{ps}$ has to be evaluated as $\kappa_2^{ps}(t_1, t_2)$, considering the covariance between the time instants $t_1$ and $t_2$; in the following, a heuristic strategy that avoids these cumbersome calculations will be introduced.

One can obtain the first two-moment cumulants:

$$\begin{aligned}\dot{\kappa}_1 &= \langle K_1(x)\rangle + \frac{1}{2}\langle xF(x,t)\rangle - \frac{\kappa_1}{2}\langle xF(x,t)\rangle,\\ \dot{\kappa}_2 &= \langle 2xK_1(x)\rangle - \kappa_1\langle K_1(x)\rangle + \varepsilon + \frac{1}{2}\langle (x-\kappa_1)^2F(x,t)\rangle - \frac{\kappa_2}{2}\langle xF(x,t)\rangle,\end{aligned} \tag{29}$$

where $\langle\,\cdot\,\rangle$ is the symbol for the averaging procedure, $F(x,t) = \frac{1}{N_0}\left[y(t) - \frac{1}{2}x(t)\right]$, and $K_1(x)$ is the drift coefficient for (19).

Note that in (29), $\kappa_1(t)$ is an estimate of the filtered signal (in our case, a chaotic signal) and $\kappa_2(t)$ is a measure of the filtering accuracy. As can be seen from (29), those equations were written without any attempt at linearization, that is, they are presented in a generalized form. For the quasi-linear algorithms, it is well known [27] that $\kappa_2(t)/N_0$ is the main part of the “averaging coefficient” of the second term in the first quasi-linear equation of (29), that is, it weights the instantaneous information update coming from the incoming desired signal plus noise. Thus, if one can reduce $\kappa_2$ through two-moment processing, the accuracy of the quasi-linear method will grow and the challenge stated before will be almost solved. To achieve this, one can take into account that $\kappa_2(t)$ in the stationary regime oscillates around its stationary value $\bar{\kappa}_2 = \lim_{t\to\infty}\kappa_2(t)$, which is commonly taken as an accuracy measure in the one-moment case.

The value of $\bar{\kappa}_2$ can be diminished by applying the information from $\kappa_{11}(t)$, also in the stationary case, that is, $\bar{\kappa}_{11} = \lim_{t_1,t_2\to\infty}\kappa_{11}(t_1,t_2)$; then, it is known that $\hat{\bar{\kappa}}_2 = \bar{\kappa}_2\left(1 - \bar{\kappa}_{11}^2\right)$, which is always less than $\bar{\kappa}_2$ if and only if $\bar{\kappa}_{11} \neq 0$. In this way, $\hat{\bar{\kappa}}_2$ can be used as a new weighting coefficient in (29). To find $\kappa_{11}(t_1,t_2)$ from (27), some cumbersome developments are required, which finally yield:

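$$\frac{\partial^2\kappa_{11}(t_1,t_2)}{\partial t_1\,\partial t_2} = \langle K_1(x_1)K_1(x_2)\rangle + \left\langle\frac{x_1x_2}{2}F(x,t)\right\rangle - \frac{\kappa_{11}(t_1,t_2)}{2}\langle F(x,t)\rangle \tag{30}$$

and

$$\bar{\kappa}_{11} = \frac{\langle x_1x_2F(x,t)\rangle + 2\langle K_1(x_1)K_1(x_2)\rangle}{\langle F(x,t)\rangle}. \tag{31}$$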

First, we would like to stress that, as we are interested in the covariance calculation, it is necessary to preserve the notations $x(t_1) = x_1$ and $x(t_2) = x_2$. Second, we want to “improve” the stationary value $\bar{\kappa}_2$ evaluated for the one-moment case through its indirect dependence on $\bar{\kappa}_{11}$, as if it were “evaluated” for the two-moment case.

In doing so, the direct calculation of the quasi-linear algorithm for the two-moment case is bypassed (see (29) and (30)). For real-time applications, the formal calculation is almost impossible; instead, we simplified it by formally “ignoring” the two-moment features. There is certainly a compromise between the complexity and the attempted improvement over the “classic” EKF.
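The heuristic correction of the stationary variance described above, $\hat{\bar{\kappa}}_2 = \bar{\kappa}_2(1 - \bar{\kappa}_{11}^2)$, is simple enough to state as code. The sketch below assumes that $\bar{\kappa}_{11}$ is the normalized (dimensionless) mutual cumulant with magnitude not exceeding one; the function name `corrected_variance` and the numerical values are illustrative only.

```python
def corrected_variance(k2_bar, k11_bar):
    """Two-moment correction of the stationary a posteriori variance.

    k2_bar: stationary one-moment variance, k11_bar: normalized mutual
    cumulant between the two time instants (assumed to lie in [-1, 1]).
    """
    if not -1.0 <= k11_bar <= 1.0:
        raise ValueError("normalized covariance must lie in [-1, 1]")
    return k2_bar * (1.0 - k11_bar ** 2)

# The corrected value replaces k2_bar in the filter gain k2 / N0.
k2_hat = corrected_variance(k2_bar=0.1, k11_bar=0.5)
gain = k2_hat / 2.0   # N0 = 2.0 is an arbitrary illustrative intensity
```

Any nonzero `k11_bar` reduces the weighting coefficient, and the reduction grows quadratically with the correlation between the two time instants.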

In order to avoid some additional complexities in the calculation of (31), let us make the following assumption: introduce the SNR of the filtering as $h^2 = \bar{\kappa}_1^2/N_0 \ll 1$, that is, the weak signal case. In this regime [16, 27], the a priori data have the main influence, that is, approximately only $\langle K_1(x_1)K_1(x_2)\rangle$ needs to be taken into account. Alternatively, one can simply apply a Gaussian approximation to the second equation in (29) for the stationary regime ($\dot{\kappa}_2 \approx 0$):

$$\left[2K_1'(\bar{\kappa}_1) + \bar{\kappa}_{11}\right]\bar{\kappa}_2^2 + \varepsilon + \frac{1}{4}F''(\bar{\kappa}_1)\,\bar{\kappa}_2^2 = 0. \tag{32}$$

In the case $h^2 < 1$, it is possible to obtain:

$$\bar{\kappa}_2 \approx \frac{1}{2K'(\bar{\kappa}_1) + \bar{\kappa}_{11}}, \tag{33}$$

and if $\bar{\kappa}_{11} > 0$ and $K'(\bar{\kappa}_1) \geq 0$, $\kappa_2$ is always reduced compared with the one-moment approach. Formula (33) can be seen as another illustration of the usefulness of the heuristic approximation proposed above. Then, to evaluate the order of $\bar{\kappa}_{11}$, let us apply, for the averaging of $\langle K_1(x_1)K_1(x_2)\rangle$, the functional approximation of $W_{ps}(x_1, x_2)$ in the form:

$$W_{ps}(x_1,x_2) = \frac{1}{2\pi\bar{\kappa}_2}\exp\left\{-\frac{(x_1-\kappa_1)^2}{2\bar{\kappa}_2}\right\}\exp\left\{-\frac{(x_2-\kappa_1)^2}{2\bar{\kappa}_2}\right\}\left[1 + \bar{\kappa}_{11}(x_1-\kappa_1)(x_2-\kappa_1)\right]. \tag{34}$$

As an approximate result, one can substitute (34) in (33), assume $h^2 < 1$, and see that the normalized value $\bar{\kappa}_{11}$ has the same order as $h^2$, that is, $\bar{\kappa}_{11} \sim O(h^2)$. This is an important consideration because pure chaos usually has a short covariance interval [29], and one can obtain a very small MSE for two time instants $t_1$ and $t_2$ that are arbitrarily close. In this sense, fixing SNR ∼ 0.5 and MSE ∼ 0.1%, an equivalent MSE can be reached using the two-moment approach but with an SNR threshold 30% lower than for the one-moment case. Let us emphasize that the approximation $\bar{\kappa}_{11} \sim O(h^2)$ is valid only for $h^2 < 1$, and the calculation of $\hat{\bar{\kappa}}_2 \approx \bar{\kappa}_2\left(1 - \bar{\kappa}_{11}^2\right)$ has to be updated instantaneously because $h^2$ varies in the interval $0 \leq h^2 < 1$.
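The meaning of the coefficient $\bar{\kappa}_{11}$ in the functional approximation can be made explicit with a short calculation, given here as a sketch for a PDF of the form of a product of two Gaussian marginals with common mean $\kappa_1$ and variance $\bar{\kappa}_2$, multiplied by the bracket $1 + c\,u_1u_2$, where $u_i = x_i - \kappa_1$ and $c$ is a generic coefficient (the symbols $u_i$ and $c$ are introduced only for this example). Since $\langle u_i\rangle = 0$ and $\langle u_i^2\rangle = \bar{\kappa}_2$ under each marginal,

$$\iint W_{ps}(x_1,x_2)\,dx_1\,dx_2 = 1 + c\,\langle u_1\rangle\langle u_2\rangle = 1,$$

$$\langle u_1u_2\rangle = \langle u_1\rangle\langle u_2\rangle + c\,\langle u_1^2\rangle\langle u_2^2\rangle = c\,\bar{\kappa}_2^2.$$

Hence the coefficient in the bracket is the covariance between the two time instants divided by $\bar{\kappa}_2^2$, which is consistent with the normalization $R_{qj}/(R_{qq}R_{jj})$ used in the one-moment functional approximation of Section 2.1.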

Of course, this calculation is quite approximate, and the true superiority of the modified quasi-linear strategy for the two-moment case has to be verified by computer experiments. Nevertheless, it is a strong indication that the use of the two-moment strategy can be very promising if and only if one can find strategies to reduce the computational complexity, for example, the generalized extended Kalman filter (GEKF) algorithm.

Finally, let us reiterate that the GEKF is still a one-moment strategy for quasi-optimum filtering, but internally it processes the statistical features of the chaotic data (input) through the multimoment (two-moment) apparatus. That is why this modified GEKF improves the accuracy in comparison with the standard EKF. In the following, in order to additionally improve the accuracy of this one-moment modified EKF, it is convenient to apply the principles of the theory of so-called “conditionally optimum filtering” proposed in ([16], ch. 9), taking this generalized EKF as the “tolerance” or “admitted” filter.

4. Conditionally optimum filtering approach

The ideas and methods of conditionally optimum filtering are rather simple and are thoroughly described in ([16], ch. 9). So, let us first present the basic idea of this method. In the general case, the conditionally optimum filter for the optimum estimation of the desired signal $x(t)$ in the presence of AWGN $n(t)$ can be presented in the form [16]:

$$\dot{\kappa}_1 = \alpha\,\xi(y,\kappa_1,t) + \beta\,\eta(y,\kappa_1,t)\,y(t), \tag{35}$$

where $\kappa_1(t)$ is the filtered signal; $y(t) = x(t) + n(t)$; $n(t)$ is the AWGN with intensity $N_0$; and $\alpha$, $\beta$ are time-dependent coefficients which have to be found.

The representation (35) is a generalized representation of the filtering algorithms, where $\dot{\kappa}_1$ is the expectation of the filtered signal. It is clear as well [16] that this form is also valid for the quasi-optimum nonlinear filtering algorithms. In the previous section, a modified EKF algorithm was proposed for the two-time-moment case, which shows a rather promising improvement of the filtering accuracy by applying some heuristics related to the simplified implementation of the two-moment principle of filtering. Certainly, those simplifications do not allow taking full advantage of the two-moment principle. Once again, this simplification is reasonable for reducing the dimension of the filtering algorithm in order to make it practical for real-time applications. Therefore, the hope for further improvement of the characteristics of this modified EKF might be based on further optimization in the framework of conditional optimality [16].

In the theory of conditional optimality, the structure of the filter is already chosen (in our case, it is the GEKF) and the only chance for further accuracy improvement is to optimize the coefficients α(t) and β(t) in order to minimize the MSE. The structure which was chosen initially is a so-called admitted structure which actually belongs to a class of the admitted filters. The next step is to minimize the MSE. The minimization of the MSE is a strategy in which the admitted filter makes an optimal transition at the moment “s” (s > t, s → t) from an initial stage, at moment “t,” to a new stage at the moment “s” with the minimum MSE. The algorithm of such kind of filter is “conditionally optimum” according to Ref. [16].

Hereafter, we are not going to present all the material related to this approach, as it was comprehensively described in ([16], ch. 9); we will only apply the necessary final formulas from there. Unfortunately, full use of the abovementioned approach is not possible (as we will see in the following), and so we will present some developments that allow the coefficients α(t) and β(t) to be obtained successfully.

4.1. Approach to find unknown coefficients α(t) and β(t)

It is possible to present an admitted structure of the conditionally optimum filter from (29) in two equivalent forms:

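$$\dot{\kappa}_1 = \alpha\left[K_1(\kappa_1) + \frac{\hat{\bar{\kappa}}_2}{2}K_1''(\kappa_1)\right] + \beta\,\frac{\hat{\bar{\kappa}}_2}{N_0}\left[y(t) - \kappa_1(t)\right] \tag{36}$$

$$\dot{\kappa}_1 = \alpha\left[K_1(\kappa_1) + \frac{\hat{\bar{\kappa}}_2}{2}K_1''(\kappa_1) - \frac{\hat{\bar{\kappa}}_2\,\kappa_1}{N_0}\right] + \beta\,\frac{\hat{\bar{\kappa}}_2}{N_0}\,y(t), \tag{37}$$

where, as was proposed earlier,

$$\hat{\bar{\kappa}}_2 = \bar{\kappa}_2\left(1 - \bar{\kappa}_{11}^2\right). \tag{38}$$

Then, from (36) and (37), one has

$$\xi_t = K_1(\kappa_1) + \frac{\hat{\bar{\kappa}}_2}{2}K_1''(\kappa_1), \quad \eta_t = \frac{\hat{\bar{\kappa}}_2}{N_0}\left[y(t) - \kappa_1(t)\right] \tag{39}$$

$$\xi_t = K_1(\kappa_1) + \frac{\hat{\bar{\kappa}}_2}{2}K_1''(\kappa_1) - \frac{\hat{\bar{\kappa}}_2\,\kappa_1}{N_0}, \quad \eta_t = \frac{\hat{\bar{\kappa}}_2}{N_0}\,y(t). \tag{40}$$

One can see that, in this regard, $\alpha$ and $\beta$ are weighting coefficients of the a priori information related to the desired chaotic signal and of the a posteriori data, respectively. This issue was thoroughly discussed in [27]. For SNR < 1, the weight of $\xi(t)$ obviously prevails, because the a posteriori data are strongly corrupted by the additive noise. Nevertheless, taking into account that $\hat{\bar{\kappa}}_2$ is rather small for the modified EKF, in the following, $\hat{\bar{\kappa}}_2$ (which is actually the MSE) will be considered as a “small parameter” in all the approximations.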
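One possible discrete-time reading of the admitted structure, in the form where $\xi$ collects the a priori (drift) terms and $\eta$ is the innovation weighted by $\hat{\bar{\kappa}}_2/N_0$, is sketched below with a simple Euler step. The function name `admitted_filter_step`, the Euler discretization, and the numerical values are assumptions for illustration; the drift and its second derivative are passed in as callables.

```python
def admitted_filter_step(k1, y, alpha, beta, k2_hat, n0, dt, drift, drift_dd):
    """One Euler step of the admitted filter structure.

    xi  : a priori part, K1(k1) + (k2_hat / 2) * K1''(k1)
    eta : a posteriori part, (k2_hat / n0) * (y - k1)
    alpha, beta weight the a priori and a posteriori parts, respectively.
    """
    xi = drift(k1) + 0.5 * k2_hat * drift_dd(k1)
    eta = (k2_hat / n0) * (y - k1)
    return k1 + dt * (alpha * xi + beta * eta)

# Toy call with a vanishing drift, so that only the innovation term acts.
k1_next = admitted_filter_step(
    k1=1.0, y=3.0, alpha=1.0, beta=2.0, k2_hat=0.5, n0=1.0, dt=0.1,
    drift=lambda x: 0.0, drift_dd=lambda x: 0.0,
)
```

With `alpha = beta = 1`, the step reduces to a quasi-linear filter with the fixed gain `k2_hat / n0`; other values of the two coefficients change the balance between the model and the observations.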

In order to follow all definitions and notations from ([16], ch. 9), one has to use the Ito form in all the equations:

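$$\begin{aligned}dy &= x\,dt + dW_1 = \varphi_1(y,x,t)\,dt + \psi_1(y,x,t)\,dW_1,\\ dx &= f(x)\,dt + dW_2 = \varphi(x,t)\,dt + \psi(x,t)\,dW_2,\end{aligned} \tag{41}$$

where $\{W_i(t)\}$ are independent Wiener processes, $i = 1, 2$. It is obvious that:

$$\varphi(x,t) = f(x) = K_1(x), \quad \varphi_1(y,x,t) = x, \quad \psi_1(y,x) = 1, \quad \psi(x,t) = 1. \tag{42}$$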

Then, from ([16], ch. 9)

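$$\hat{x}_s - \hat{x}_t \equiv \kappa_s - \kappa_t = \alpha\,\xi_t\,\Delta t + \beta\,\eta_t\left(\varphi_{1t}\,\Delta t + \psi_{1t}\,\Delta W\right). \tag{43}$$

The unbiasedness conditions for the optimum estimation from (43) are [16]:

$$\alpha\langle\xi_t\rangle + \beta\langle\eta_t\varphi_{1t}\rangle - \langle\varphi_t\rangle = 0. \tag{44}$$

Taking $\xi_t$ and $\eta_t$ according to their definitions from (40), it is easy to get from (44):

$$\alpha m_1 + \beta m_2 = m_0, \tag{45}$$

where $m_0 = \langle\varphi_t\rangle$, $m_1 = \langle\xi_t\rangle$, $m_2 = \langle\eta_t\varphi_{1t}\rangle$.

Taking into account (42) with the condition $\hat{\bar{\kappa}}_2 < 1$ and assuming that $K_1(\kappa_1) \approx K_1''(\kappa_1) \approx 0$ (see Notes), one finally gets:

$$\frac{\beta}{\alpha} = \frac{\kappa_1}{\langle x^2\rangle}. \tag{46}$$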

The next step, as was proposed in ([16], ch. 9), is focused on checking the correlation conditions for the error $(\kappa_s - x_s)$ with the vector $[\xi\,\Delta t, \eta\,\Delta y]$, which yields [16]:

$$\beta = \kappa_{02}\,\kappa_{22}^{-1}, \tag{47}$$

where

$$\kappa_{02} = \left\langle\left[x(t) - \kappa_1(t)\right]x(t)\,\frac{\hat{\bar{\kappa}}_2}{N_0}\,y(t)\right\rangle + \langle\eta_t\,y(t)\rangle\frac{\hat{\bar{\kappa}}_2}{N_0}, \quad \kappa_{22} = \left(\frac{\hat{\bar{\kappa}}_2}{N_0}\right)^2\langle y^2(t)\,\eta_t\rangle. \tag{48}$$

From the second equation in (48), it follows that $\beta \to \infty$, which is clearly absurd. So why did this happen, and what is wrong? Is the approach in ([16], ch. 9) wrong? Definitely not. It is possible to show that the estimate $\kappa_1$ is unbiased and decorrelated with both components $\xi(t)$ and $\eta(t)$, but for our special case, the condition that $\kappa_{22}$ (a matrix in the general case) has to be invertible is violated. Thus, contrary to what might be expected from ([16], ch. 9), the approach does not work here.

The solution might be found from a direct calculation of $(x - \kappa_1)$ from the SDE of chaos and (29), followed by minimization of $\langle(x - \kappa_1)^2\rangle$ with respect to $\alpha$ or $\beta$.

4.2. Direct evaluation of the MSE and its minimization

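As a first step, let us calculate the difference between the solutions of (20) and (39) by applying (46):

$$x - \kappa_1 = \int_0^T\left[K_1(x) - \alpha K_1(\kappa_1) - \frac{\alpha\,\kappa_1\,\hat{\bar{\kappa}}_2\,n(t)}{\langle x^2\rangle N_0}\right]dt. \tag{49}$$

Let us take the second power of (49) and perform a statistical average. Note that the second power of (49) is a double integral and $\langle n(t_1)\,n(t_2)\rangle = N_0\,\delta(t_2 - t_1)$. Then, finally applying the assumption $\hat{\bar{\kappa}}_2 < 1$, one can get for the MSE:

$$MSE \approx \langle K_1^2(x)\rangle + \alpha^2\langle K_1^2(\kappa_1)\rangle - 2\alpha\langle K_1(x)K_1(\kappa_1)\rangle + \alpha^2\frac{\hat{\bar{\kappa}}_2}{\langle x^2\rangle N_0}. \tag{50}$$

Looking for the minimum of (50) in terms of $\alpha$, one easily finds:

$$\alpha = \frac{\langle K_1(\kappa_1)K_1(x)\rangle}{\langle K_1^2(\kappa_1)\rangle + \dfrac{\hat{\bar{\kappa}}_2}{\langle x^2\rangle N_0}}. \tag{51}$$

Assuming that $\hat{\bar{\kappa}}_2$ is still a “small parameter,” it follows that $\alpha \approx 1$ and $\beta \approx \frac{\kappa_1}{\langle x^2\rangle} \sim O\left(\frac{1}{\kappa_1}\right)$. In this regard,

$$MSE \approx \frac{\hat{\bar{\kappa}}_2^2}{\langle x^2\rangle}. \tag{52}$$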

Comparing Eq. (52) with the MSE of the one-moment filtering which is κ2, one can see that the conditional optimum filtering might significantly improve the MSE with the same SNR or significantly diminish the SNR threshold for a fixed MSE.
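The minimization over $\alpha$ used above is that of a quadratic of the form $a + \alpha^2 b - 2\alpha c + \alpha^2 d$, whose minimizer is $c/(b + d)$. The sketch below checks this numerically; the generic coefficients `a`, `b`, `c`, `d` stand for the averaged drift terms and the noise term, and their numerical values, as well as the function names, are arbitrary assumptions for illustration.

```python
def quadratic_mse(alpha, a, b, c, d):
    """MSE as a quadratic in alpha: a + alpha^2 * b - 2 * alpha * c + alpha^2 * d."""
    return a + alpha ** 2 * b - 2.0 * alpha * c + alpha ** 2 * d

def optimal_alpha(b, c, d):
    """Minimizer of quadratic_mse with respect to alpha (requires b + d > 0)."""
    return c / (b + d)

# Illustrative values: the noise-related term d is small compared with b.
a, b, c, d = 1.0, 1.2, 1.0, 0.05
alpha_opt = optimal_alpha(b, c, d)
mse_opt = quadratic_mse(alpha_opt, a, b, c, d)
```

When the two drift averages `b` and `c` are close to each other and `d` is small, `alpha_opt` is close to one, in line with the approximation $\alpha \approx 1$ used in the text.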

The authors consider that the two-moment filtering of chaos, together with the conditionally optimum principle, is a very promising approach to significantly improve the MSE for chaos filtering.

Notes

• This assumption follows from symmetry conditions for f (x).


© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference


Valeri Kontorovich, Zinaida Lovtchikova and Fernando Ramos-Alarcon (December 20th 2017). Nonlinear Filtering of Weak Chaotic Signals, Chaos Theory, Kais A. Mohamedamen Al Naimee, IntechOpen, DOI: 10.5772/intechopen.70717.
