Open access peer-reviewed chapter

Distributed Kalman Filter

By Felix Govaers

Submitted: April 20th 2017Reviewed: October 25th 2017Published: December 20th 2017

DOI: 10.5772/intechopen.71941

Downloaded: 1121


The continuing trend toward connected sensors (“internet of things” and” ubiquitous computing”) drives a demand for powerful distributed estimation methodologies. In tracking applications, the distributed Kalman filter (DKF) provides an optimal solution under Kalman filter conditions. The optimal solution in terms of the estimation accuracy is also achieved by a centralized fusion algorithm, which receives all associated measurements. However, the centralized approach requires full communication of all measurements at each time step, whereas the DKF works at arbitrary communication rates since the calculation is fully distributed. A more recent methodology is based on ”accumulated state density” (ASD), which augments the states from multiple time instants to overcome spatial cross-correlations. This chapter explains the challenges in distributed tracking. Then, possible solutions are derived, which include the DKF and ASD approach.


  • distributed Kalman filter
  • target tracking
  • multisensor fusion
  • information fusion
  • accumulated state density

1. Introduction

Modern tracking and surveillance system development is increasingly driving technological trends and algorithm developments toward networks of dislocated sensors. Besides classical target tracking, many other applications can be found, for instance, in robotics, manufacturing, health care, and financial economics. A multisensor network can exploit spatial diversity to compensate for the weak attributes of a single sensor such as limited field of view or high measurement noise. Also, heterogeneous sensors can reveal a more complete situational awareness and a more precise estimate of the underlying phenomena. Additionally, a sensor network is more robust against a single point of failure in comparison to a standalone system, if its architecture is chosen carefully.

Despite its unquestioned advantages, a multisensor network must cope with design-specific challenges, for instance, a full transmission of all the measurements to a fusion center can be hindered, when the communication links are unreliable or constrained in bandwidth or costs. A well-known approach to cope with the limited bandwidth data links is to apply data processing on the sensor sites to generate local track parameters that are transmitted to the fusion center. The latter then reconstructs the global track parameters by an application of a Track-to-Track Fusion(T2TF) scheme. Depending on the scenario constraints, this is a nontrivial task, since the local tracks are mutually correlated due to the common process noise. The first known solution in the literature to the T2TF problem proposed to apply an information filter-based multisensor fusion algorithm in [1], which later became famous as the “tracklet fusion.” However, the tracklet fusion also requires a transmission from each sensor after each time step in order to reconstruct the optimal solution.

This chapter presents the theory and the derivation of the distributed Kalman filter(DKF), which is an optimal solution of the T2TF problem under Kalman filter assumptions with respect to the mean squared error(MSE). Assuming Kalman conditions means that linear Gaussian models are provided for the motion model and all measurement models of the sensors. Moreover, it is assumed that measurement-to-track (at the sensors) and track-to-track (at the fusion center) association has been solved. The DKF approach requires, however, all remote sensor models to be known at each local sensor site, which is infeasible in most practical scenarios. Therefore, this chapter also presents a solution based on the accumulated state density(ASD), which is closely related to the DKF but does not require the measurement models to be known. Surveys that reflect the history of research in distributed estimation can be found, for instance, in [2, 3].

This chapter is structured as follows: Section 2 summarizes the problem formulation. A basic approach to T2TF is given in Section 3, where we present the least squares solution. Section 4 presents a simple fusion methodology, which is easy to compute and provides practical results for various applications. The reason why this approach is suboptimal is investigated in Section 5 by means of a recursive computation of the cross-covariances of the local tracks. In Section 6, the product representation of Gaussian probability densities is introduced, which is the basis for the derivation of the distributed Kalman filter in Section 7. An alternative derivation in terms of information parameters is provided in Section 8. Since the local measurement models are often unknown in practical applications, the distributed accumulated state density filter is introduced in Section 9. The chapter concludes with Section 10.


2. Problem formulation

Throughout this chapter, it is assumed that all Ssensors produce their measurements zksRmat each time step tkof the same target with its true state xkRnin a synchronized way. The extension to the unsynchronized case is straightforward by means of standard Kalman filter predictions, and is therefore omitted for the sake of notational simplicity. The measurement equation for sensor s1Sis given by


where vksΝvks0Rksis the Gaussian distributed, zero-mean random variable, which models the noise of the sensing process. Therefore, the likelihood for a single measurement is fully described by the Gaussian


Since the measurement processes across all sensors Zk=zk1zkSare mutually independent, the joint likelihood of all sensor data produced at time tkfactorizes:


The true state of the target itself is modeled as a time-variant stochastic process, where the transition from time tk1to time tkis given by the following motion equation:


where wkΝwk0Qkis the Gaussian distributed, zero-mean random variable to model the process noise of the system. Analogously to the likelihood, this provides the probability density function for the transition model:


Based on the local processors, each sensor node produces a trackat time tkin terms of an estimate xkksand a corresponding estimation error covariance Pkks. In a T2TF scheme, these parameters are the only information, which is transmitted to a fusion center(FC). It should be noted that the FC may also not be centralized, distinguished instance in the architecture, but each and every processing node can act as a FC. The introduction of a distinguished FC is only for clarification of different computation layers. An excellent overview of pros and cons of various data fusion layers can be found in [4].

The T2TF problem can now be stated as follows: compute a fused estimate xkkof the state xkand a consistent error covariance Pkkby means of the local tracks and knowledge on their models:


3. Least squares estimate

In order to compute an estimate as a well-suited combination of the local tracks, it is useful to consider the joint likelihood function given by the following Gaussian:


where Pkks,s=Pkksare the track covariances on the block-diagonal entries and Pkki,jcovxkki,xkkjxk=Pkkj,iTare the cross-covariances on the off-diagonal entries of the joint error covariance.

Since the joint likelihood from above is proportional to an exponential function:


the maximum likelihood(ML) estimate can be computed in terms of the least squares:


where xkk1:S=xkk1TxkkSTT, I1:S=IIT,and Pkk1:Sfor the joint error covariance have been introduced for notational simplicity. A closed form solution of the ML estimates can be obtained by setting the gradient with respect to the state to zero:


Therefore, the ML estimate is given by:


For information fusion applications, it is also important to have a consistent estimate of the squared error, in other words, we need to compute the corresponding error covariance:


The last equation holds due to the fact that EI1:Sxkxkk1:S2=Pkk1:Sis the joint covariance. Concluding the derivations from above equation, one can obtain:


4. Naïve fusion

It is obvious that for the ML estimate, it is assumed that the cross-covariances Pkki,j,i,j1Sare known. Since this might not be given in practical scenarios, a simple approximation is to assume them to be zero. This approach is called Naïve fusion. It implies that the joint error covariance is given in block-diagonal form:


As a direct consequence of the matrix inversion lemma(see Appendix 12.1), the inverse can be obtained in closed form solution:


Filling into the maximum likelihood formulas directly yields.


Thus, by means of a simple approximation of the ML estimate, we have obtained a first practical fusion rule for the FC, which we call convex combinationdue to its structure for further usage. The fusion scheme as a whole can be outlined schematically as in the flowing Figure 1. Each sensor node sprocesses its produced sensor data by means of a local filter, which results in a track in terms of an estimate together with an error covariance. These parameters are transmitted to the fusion center, which applies the convex combination to compute the fused result.

Figure 1.

Fusion scheme of the Naive Fusion approach.


5. What makes the Naïve fusion naïve?

For the Naïve fusion, we have assumed that the cross-covariances vanish. It is worth to be aware of the structure of the cross-covariances to see the conditions whether this holds or does not hold. This can be achieved by a recursive computation of the posterior cross-covariance Pkki,jof two sensors with indices iand j, which process their data by means of local Kalman filters. At the beginning of the estimation process at t0tracks are not yet correlated, that is, P00i,j=0, due to the fact that initial measurements are mutually uncorrelated. A recursive computation can be achieved by a prediction-filteringcycle.

5.1. Cross-covariance prediction

For the prediction step, it is assumed that a previous posterior cross-covariance Pkk1i,jhas been computed. The prior parameters are obtained by means of the motion model:


where the last equality holds due to the fact that the estimation errors at time tk1of both sensor processors are uncorrelated to the process noise wk.

5.2. Cross-covariance filtering

In the filtering step, both sensors compute their posterior parameters based on the produced measurements zkiand zkj,respectively. It is assumed that the local processors have applied local Kalman filter update steps with Kalman gains Wkk1iand Wkk1j. The cross-covariance posterior matrix can be obtained by a straightforward computation:


For these equations, we have used the fact that the prior estimate error xkxkk1iis independent of the measurement noise vki, and that vkiand vkjare mutually independent.

Concluding the calculations from this section, we have obtained a recursive solution for the cross-covariances:


One can see that the cross-covariances are zero, if and only if the process noise covariance Qkvanishes. In other words, if the tracks refer to a deterministically moving target and all sensors do local Kalman filtering, then the convex combination equations yield the optimal fusion result. If this is not the case, as maybe in most practical applications, then the Naïve fusion method is an approximation and its degree of approximation depends primarily on the level of the process noise.


6. Gaussian product representation

The basic concept of the distributed Kalman filter is to make the local parameters stochastically independent, even if process noise is present. This is achieved by a product representation, which directly follows from the fact that.


Thus, the Gaussian product representation is equivalent to uncorrelated track parameters for each processing node. It should be noted that the product representation is not normalized, that is, the integral for S>1is not unity. This, however, is not of practical relevance since the fused estimate density is a Gaussian and the parameters of which are provided by the convex combination.


7. Derivation of the distributed Kalman filter

For the DKF, we are going to modify the local processing scheme for each sensor in order to have the product representation hold at each instant of time. Then, when the fusion center receives the parameters from all sensors, the convex combination can be applied to compute the optimal global estimate. Note that the convex combination does not consider a local prior of the fusion center; therefore, the result will be independent from previous transmissions. This can be of great benefit, if communication channels with unreliable links have to be considered, since the full information on the target state is distributed in the sensor network. However, for completeness, it should also be noted that the modified local parameters are not optimal anymore in a local sense. One could say that local optimality is given up for the sake of global optimality [5].

In the following sections, the derivation of a prediction-filtering recursion of the DKF is discussed.

7.1. DKF prediction

For the prediction, it is assumed that the previous posterior is given in product representation:


For notational simplicity, we have conditioned the posterior on the full data set Zk1=Z1k1,,Zsk1, where Zsk1Z1sZk1sare from all sensors and all time steps up to time tk1, of which the local tracks are sufficient statistics. At the initialization phase, that is, k1=0, the product representation is trivial, since the initial estimates can be based on first measurements, which are mutually independent.

To derive a closed form solution for the prediction of product representation, it is required to globalizethe covariances Pk1k1sof the local processing nodes at first. This process changes the local parameters such that the same previous posterior density is factorized; however, the local covariances will be unified to a single Pk1k1. Rigorously speaking, this matrix does not represent a meaningful covariance in the sense of an expected estimation error squared anymore. Still, the fused result will be optimal since the global density is not changed during this process. Thus, if we set




then the same fused density will be obtained, which is easily verified by means of the convex combination. It should be noted that the remote error covariances Pk1k1sare required to compute the globalized covariance matrix Pk1k1. Since Kalman filter conditions are assumed, they can be computed without data transmission, as they do not depend on the local sensor measurements. Therefore, it is sufficient to be aware of the remote sensor models.

The prediction formulas can now be obtained by a marginalization of the joint density of the current and the last time step:


The last equality holds due to the Markov property of the system. Filling in our linear Gaussian transition model and the previous posterior yields


By means of a simple algebraic manipulation, it is possible to factorize the transition kernel Gaussian up to proportionality:


Thus, we can factorize the integration term of the global prior completely:


An application of the product formula (Section 12.2 in the appendix) yields:




At this point, we have derived factorized prediction formulas for the DKF prediction, and to our knowledge, the remaining integral is part of the normalization constant. This, however, is not trivial, since the parameter ysdepends on xk. A successive application of the product formula yields the desired result:


for some auxiliary variables y¯¯and Y¯¯. All factors, which are independent of xkhave been omitted. This proves that the integral is independent of the state variable xk. Concluding the derivations from above, we have derived the prediction formulas of the local estimation parameters as:


7.2. DKF filtering

Let Zkdenotes the set of measurements produced by all sensors at time tk. The posterior density can be inferred by using the Bayes theorem:


Due to the mutual independence of the measurement noises, the joint likelihood function is given by:


This is particularly useful for the structure of the product representation used for the DKF. Filling in the linear Gaussian models and neglecting the normalization constant in the denominator directly yields:


Thus, the product formula again can be applied to compute the posterior parameters:




Again, we have omitted the factors, which are independent of xk.


8. Information filter formulation of the DKF

In [6], an elegant derivation of the DKF formulas was published based on the information filter(IF). The IF uses information matrices, which are inverted covariances, and information states, which are information matrices multiplied with states. The optimal, centralized update formulas for Ssensors based on the predicted information parameters Pkk11and Pkk11xkk1are given by:


where iks=HksTRks1zksand Iks=HksTRks1Hksare the local information contribution from the current measurements at time tk, which were received by the FC. If we want to distribute the computation to Snodes, we will have them uncorrelated as in the DKF previously. Since the fused estimate is obtained via the convex combination, the local information parameters are summed up:


This summation structure can be used to provide a closed prediction-filtering cycle.

8.1. Information DKF prediction

The prediction of the state is easier than a direct transition of the information parameters. Based on the fused estimate, one can obtain.


Thus, we have given the local predicted state parameters as:


Analogously, one obtains for the prior covariance:


Thus, if we set Pkk1s=SPkk1=Fkk1SPk1k1Fkk1T+SQk, then the convex combination yields the exact global fused covariance.

8.2. Information DKF filtering

For the filtering, it is assumed that each sensor has computed its local information contribution parameter iksand Iksfrom its own sensor model and, in addition, the information matrix contributions Iklfrom all remote sensors lby using the individual sensor models which are again assumed to be known. Then, the information state is updated via.


As a direct consequence, the updated parameters of the local processors follow the standard IF filtering equations:


For the globalized information matrix, the remote information parameters from the sensor models are used:


It is important to note that the local processing nodes compute both the local pseudo information matrix Pkks1and the global fused error covariance Pkk, which is required for the prediction to the next time step.


9. Distributed accumulated state density filter

The DKF from Sections 7 and 8 can be considered a big step toward distributed state estimation, tracking, and information inference. However, in practical applications, the exact solution is often hindered by the fact that the exact remote sensor model parameters are unknown and can only be approximated based on local state estimates. The good news is that there is another exact solution based on the accumulated state density(ASD). The distributed ASD equations turn the spatial correlations into temporal correlations of successive states. Originally, the ASD equations were introduced to solve the out-of-sequence problem, which handles delayed transmissions of measurements into an ongoing fusion process in an optimal manner. Therefore, the temporal correlations can well be coped with the ASD approach.

At first, let us introduce the ASD state xk:nas


where tkrefers to the current time of the filtering process and tnrefers to the initialization time. The ASD approach now considers the conditional joint density pxk:nzk, that is, the posterior of the full trajectory of the target between tnand tk. In particular, the individual state densities of a single instant of time can be obtained via marginalization. Also, it should be noted that the Rauch-Tung-Striebel(RTS) smoothing equations are inherently integrated in the ASD posterior, since all states are conditioned on the full set of measurements up to time tk[7].

A recursive computation of the ASD posterior can be achieved by using the Bayes theorem:


Since the measurements conditioned on the whole trajectory only depend on the state at time tk, the joint likelihood function is given by:


The second factor can be reformulated as follows:


where we have used the Markov property of the system in the last equation. This recursive representation can now be repeated on the term pxk1:nZk1. A successive application of this procedure yields


where we have neglected the normalization constant in the denominator. Filling in our Gaussian models and using the factorization of the transition model from above equation yields


Since the initial density usually is based on a first measurement, we can assume that it factorizes into independent local track starts:


When the posterior is fully factorized in the number of sensors and in the time steps, each processing node can compute the resulting ASD Gaussian with mean xk:nksand covariance matrix Pk:nks[7]:


where the parameters are given by:


We have used a short notation such that xlk=Exlzkare the smoothed state estimates and Plk=covxlzkare the covariances, respectively, which result from the Rauch-Tung-Striebel equations. Also, the combined retrodiction gain matrices are known from the RTS smoother:


Thus, when the FC receives the local ASD parameters, the optimal fused estimate can be obtained via the convex combination:


For a continuous state estimation process, it is convenient to formulate the distributed ASD solution in terms of a prediction-filtering cycle.

9.1. Distributed ASD prediction

For the prediction step, it is assumed that the local processing node shas computed the previous filtering parameters xk1:nk1sand Pk1:nk1s, which refer to time tk1. Then, the prior parameters are given by:


9.2. Distributed ASD filtering

Since the prior is factorized in form of a product representation and the current measurements from time tkare mutually uncorrelated, local Kalman filters can be applied to obtain the posterior parameters:


10. Conclusion

In this chapter, we have introduced the least squares solution to the track-to-track fusion problem, where cross-covariances of the track estimation errors are required. Neglecting the cross-covariances has led us to the Naïve fusion, a simple but powerful fusion algorithm for practical applications. By recursive computation of the cross-covariances, we have seen that they primarily depend on the process noise of the state transition kernel. Since a centralized computation of the cross-covariances is infeasible in practical applications, more sophisticated solutions are required for optimal fusion results. The distributed Kalman filter, which uses the product representation to keep the local parameters decorrelated achieved this. However, this approach only works, if the local processors know all measurement models at each time step. Then, the distributed accumulated state density filter uses the temporal correlations to factorize the global posterior density. This approach does not require remote sensor models and is therefore, well suited for extensions with measurement ambiguity or nonlinear measurement functions.

In the study, one can find more extensions based on the distributed Kalman filter to overcome the lack of knowledge on the remote sensor models. In [8] and the references therein, a debiasing matrix is introduced to compensate for globally biased gain matrices of the local filters. An application of the tracklet fusion based on the distributed accumulated state density filter can be found in [9]. Then, in [10], the information filter formulation of the distributed Kalman filter also was extended to scenarios with input information on the transition process.

A. Appendix

A.1. Matrix inversion lemma

Let A,B,C, and Dbe matrices of a block matrix such that Aand Dare invertable, and also such that the Schur-complements DCA1Band ABD1Chave full rank. Then, the inversion of the block matrix is given by:


In particular, it holds that


Proof.Let the inverted block matrix be given by submatrices E,F,G,and H. By definition, it holds that


where Iand Oare the identity and zero matrix, respectively. A matrix multiplication of the first and second equality yields two blocks of equations:


which we call block Aand


which we call block B.Resolving block A for E,F,G,and Hyields the first version of the inverted block matrix, whereas resolving block B yields the second version.

A.2. Product formula for Gaussian densities

For two Gaussian distributed random variables xand z, it holds that




A proof can be found, for instance, in [11].

© 2017 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Felix Govaers (December 20th 2017). Distributed Kalman Filter, Kalman Filters - Theory for Advanced Applications, Ginalber Luiz de Oliveira Serra, IntechOpen, DOI: 10.5772/intechopen.71941. Available from:

chapter statistics

1121total chapter downloads

2Crossref citations

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Consensus-Based Distributed Filtering for GNSS

By Amir Khodabandeh, Peter J.G. Teunissen and Safoora Zaminpardaz

Related Book

First chapter

Highlighted Aspects From Black Box Fuzzy Modeling For Advanced Control Systems Design

By Ginalber Luiz de Oliveira Serra

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us