Open access peer-reviewed chapter

Decentralised Scalable Search for a Hazardous Source in Turbulent Conditions

Written By

Branko Ristic and Christopher Gilliam

Reviewed: 26 April 2019 Published: 04 June 2019

DOI: 10.5772/intechopen.86540

From the Edited Volume

Unmanned Robotic Systems and Applications

Edited by Mahmut Reyhanoglu and Geert De Cubber

Chapter metrics overview

1,042 Chapter Downloads

View Full Metrics


The problem is autonomous coordinated search by an interconnected group of moving robots for the purpose of finding and localising a source of hazardous emissions (e.g., gas and particles). Dispersion of the emitted substance is assumed to be affected by turbulence, resulting in the absence of concentration gradients. The chapter proposes a search strategy that operates in a completely decentralised manner, as long as the communication network of the moving robots forms a connected graph. By decentralised operation, we mean that each moving robot is reasoning (i.e., estimating the source location and making decisions on robot motion) locally. Coordination of the group is achieved by consensus via communication with the neighbours only, in a manner which does not require global knowledge of the communication network topology.


  • autonomous search
  • machine intelligence
  • sequential Monte Carlo estimation
  • infotaxis

1. Introduction

Searching strategies for finding targets using appropriate sensing modalities are of great importance in many aspects of life. In the context of national security, there could be a need to find a source of hazardous emissions [1, 2, 3]. Similarly, rescue and recovery missions may be tasked with localising a lost piece of equipment that is emitting weak signals [4]. Biological applications include, for example, protein searching for its specific target site on DNA [5], or foraging behaviour of animals in their search for food or a mate [6, 7]. The objective of search research [8] is to develop optimal strategies for localising a target in the shortest time (on average), for a given search volume and sensing characteristics.

The use of autonomous vehicles in dangerous missions, such as finding a source of hazardous emissions, has become widespread [9, 10, 11]. Existing approaches to the search and localisation in the context of atmospheric releases can be loosely divided into three categories: up-flow motion methods, concentration gradient-based methods and information gain-based methods, also known as infotaxis. Both the up-flow motion methods and the concentration gradient methods are simple, in the sense that they require only a limited level of spatial perception [12]. Their limitations manifest in the presence of turbulent flows, due to the absence of concentration gradients, when the plume typically consists of time-varying disconnected patches. The information gain-based methods [13] have been developed specifically for searching in turbulent flows. In the absence of a smooth distribution of concentration (e.g., due to turbulence), this strategy directs the searching robot(s) towards the highest information gain. As a theoretically principled approach, where the source-parameter estimation is carried out in the Bayesian framework and the searching platform motion control is based on the information-theoretic principles, the infotaxic (or cognitive) search strategies have attracted a great deal of interest [3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23].

This chapter summarizes our recent results in development of an autonomous infotaxic coordinated search strategy for a group of robots, searching for an emitting hazardous source in open terrain under turbulent conditions. The assumption is that the search platforms can move and sense. Two types of sensor measurements are collected sequentially: (a) the concentration of the hazardous substance; (b) the platform location within the search domain. Due to the turbulent transport of the emitted substance, the concentration measurements are typically sporadic and fluctuating. The searching platforms form a moving sensor network, thus enabling the exchange of data and a cooperative behaviour. The multi-robot infotaxis have already been studied in [16, 17, 20, 24]. However, all mentioned references assumed all-to-all (i.e., fully connected) communication network with centralised fusion and control of the searching group.

We develop an approach where the group of searching robots operate in a fully decentralised coordinated manner. Decentralised operation means that each searching robot performs the computations (i.e., source estimation and path planning) locally and independently of other platforms. Having a common task, however the robotic platforms must perform in a coordinated manner. This coordination is achieved by exchanging the data with immediate neighbours only, in a manner which does not require the global knowledge of the communication network topology. For this reason, the proposed approach is scalable in the sense that the complexities for sensing, communication, and computing per sensor platform are independent of the sensor network size. In addition, because all sensor platforms are treated equally (no leader-follower hierarchy), this approach is robust to the failure of any of the searching agents. The only requirement for avoiding the break-up of the searching formation is that the communication graph of the sensor network remains connected at all times. Source-parameter estimation is carried out sequentially, and on each platform independently, using a Rao-Blackwellised particle filter. Platform path planning, in the spirit of infotaxis, is based on entropy-reduction and is also carried out independently on every platform.


2. Mathematical models

First, we describe the measurement model. The concentration measurements are modelled using a Lagrange encounters model developed in [13], based on an open field assumption and a two-dimensional geometry. Let ith robotic vehicle position (i=1,2,,N) at time tk be denoted by rkiR2. Suppose that the emitting source is located at coordinates specified by the vector r0=X0Y0 and its release rate, or strength, is Q0. The goal of the search is to detect and estimate the source-parameter vector η0=r0 Q0 in the shortest possible time. The particles released from the source propagate with combined molecular and turbulent isotropic diffusivity D, but can also be advected by wind. The released particles have an average lifetime τ before being absorbed. Let the average wind characteristics be the speed U and direction, which by convention, coincides with the direction of the x axis. Suppose a spherical concentration measuring sensor of small radius a is mounted on the ith robot, whose position at time k is1rki=xki yki. This sensor will experience a series of encounters with the particles released from the emitting source. The average rate of encounters can be modelled as follows [13]:


where D, τ and U are known environmental parameters, dkir0rki=xkiX02+ykiY02 is the distance between the source and the ith sensor platform, K0 is the modified Bessel function of the second kind of order zero, and λ=1+U2τ4D depends on environmental parameters only.

The probability that a sensor at location rki is hit by zZ+0 dispersed particles (where z is a non-negative integer) during a time interval t0 is Poisson distributed, i.e.,


Parameter μki=t0Rη0rki in (2) is the mean number of particles expected to reach the sensor at location rki during interval t0. Eq. (2) expressed the likelihood function of a concentration measurement zki collected by ith sensor, i.e., zkiη0=Pzkiμki.

The motion model of a coordinated group of robots is described next. Let the pose vector of the ith robot platform at time tk be denoted θki=rkiϕki, where rki=xkiyki has already been introduced and ϕki is the vehicle heading. The group of searching vehicles moves in a formation. The centroid of the formation at time tk is specified by coordinates:


For each platform i=1,,N, the offset ΔxiΔyi from the centroid xkcykc is predefined and known to it (i.e., xki=xkc+Δxi, yki=ykc+Δyi).

The measurements of concentration are taken at time instants tk, k=1,2,. Between two consecutive sensing instants, each platform is moving. Let the duration of this interval (referred to as the travel time) for the ith platform be Tki0. The assumption is that sensing is suppressed during the travel time.

Motion of the ith platform during interval Tki is controlled by linear velocity Vki and angular velocity Ωki. Given that the motion control vector uki=VkiΩkiTki is applied to the ith platform, its dynamics during a short integration time interval δTki can be modelled by a Markov process whose transitional density is πθtiθtδiuki=NθtiβθtδiukiQ. The process noise covariance matrix Q captures the uncertainty in motion due to the unforeseen disturbances. The vehicle motion function βθtδiuki is:


where vector Bk1i=εxiδTkiεyiδTki0 is introduced to compensate for a distortion of the formation due to process noise with parameters:


Here, x¯k1i and y¯k1i are the estimates of the coordinates of the formation centroid at k1 (that is of xk1c and yk1c, respectively) available to the ith platform. Coordinates xk1i and yk1i refer to the knownith vehicle position at k1. Figure 1 illustrates the trajectories of N=7 autonomous vehicles in a formation using the described transitional density πθtiθtδiuki. In the absence of process noise (i.e., Q=0), the vehicles would move in a perfect formation if (a) all control vectors are identical (i.e., uk1=uk2==ukN), and (b) all headings are identical (i.e., ϕk11=ϕk12==ϕk1i). In this case, each platform would know the true coordinates of the formation centroid (i.e., x¯ki=xkc, y¯ki=ykc, for i=1,,N), and hence the correction vectors Bk1i would be zero.

Figure 1.

An example of a formation of N= 7 searching platforms at k= 1, 2. The communication graphs (based on established links between the platforms) are indicated with green lines. Note that communication network topology is time-varying. The red line, starting from the centroid of the formation, indicates the instantaneous velocity vector.

A robotic platform can communicate with another platform of the formation, if their mutual distance is smaller than a certain range Rmax. Because of process noise in motion, the distance between the vehicles in the formation will vary and consequently the topology of the communication network graph may also vary. For simplicity, we will assume that communication links (when established) are error free. Figure 1 illustrates the communication graphs of a formation consisting of N=7 searching platforms at two consecutive time instants.


3. Decentralised sequential estimation

Estimation and robot motion control are carried out using the measurement dissemination-based decentralised fusion architecture [25]. Measurement locations2 and the corresponding measured concentration values, i.e., the triple xkiykizki, are exchanged via the communication network. The protocol is iterative. In the first iteration, platform i broadcasts its triple to its neighbours and receives from them their measurement triples. In the second, third and all subsequent iterations, platform i broadcasts its newly acquired triples to the neighbours, and accepts from them only the triples that this platform has not seen before (newly acquired). Providing that the communication graph is connected, after a sufficient number of iterations (which depends on the topology of the graph), a complete list of measurement triples from all platforms in the formation, denoted dk=xkiykizki1iN, will be available at each platform.

Suppose the posterior density function of the source at discrete-time k1 and platform i be denoted piη0d1:k1, where d1:k1d1,d2,,dk1. Given piη0d1:k1 and dk, the problem of sequential estimation is to compute the posterior at time k, i.e., piη0d1:k. Using the Bayes rule, the posterior is


where gdkη0 is the likelihood function. Assuming that individual platform measurements are conditionally independent, gdkη0 can be expressed as




is independent of Q0. The posterior density piη0d1:k is computed using the Rao-Blackwell dimension reduction scheme [26]. Using the chain rule, the posterior can be expressed as:


where the posterior of source strength piQ0r0d1:k will be worked out analytically, while the posterior of source position pir0d1:k will be computed using a particle filter. Following [27], we express the posterior piQ0r0d1:k1 with the Gamma distribution whose shape and scale parameters are κk1 and ϑk1, respectively. That is


Since the conjugate prior of the Poisson distribution is the Gamma distribution [28], the posterior pQ0r0d1:k is also a Gamma distribution with updated parameters κk and ϑk, i.e., pQ0r0d1:k=GQ0κkϑk. The computation of κk and ϑk can be carried out analytically as a function of r0 and the measurement set dk=rkizki1iN [27]:


The parameters of the prior for source strength, pQ0=Gκ0ϑ0 are chosen so that this density covers a large span of possible values of Q0.

Next, we turn our attention to the posterior of source position pir0d1:k in the factorised form (8). Given pr0d1:k1, the update step of the particle filter using dk applies the Bayes rule:


where fdkd1:k1=gdkr0d1:k1pr0d1:k1dr0 is a normalisation constant. The problem in using (11) is that the likelihood function gdkr0d1:k1 is unknown; only gdkη0 of (6) is known. Fortunately, it is possible to derive an analytic expression for gdkr0d1:k1:


The Rao-Blackwellised particle filter (RBPF) fully describes the posterior piη0d1:k by a particle system


Here, M is the number of particles, wkm,i is a (normalised) weight associated with the source position sample r0,km,i, while κki and ϑkm,i are the parameters of the corresponding Gamma distribution for the source strength. Initially, at time k=0, the weights are uniform (and equal to 1/M), rk,0m,i are the points on a regular grid covering a specified search area, while κ0i=κ0 and ϑ0m,i=ϑ0. The sequential computation of the posterior piη0d1:k using the RBPF is carried out by a recursive update of the particle system Ski over time.


4. Decentralised formation control

In decentralised multi-robot search, each platform autonomously makes a decision at time tk1 about its next control vector uki, as described in Section 4.1. However, in order to maintain the geometric shape of the formation and thus avoid its break-up, there is a need to impose a form of coordination between the platforms. This will be explained in Section 4.2.

4.1 Selection of individual control vectors

A robot platform i autonomously decides on the control vector uki using the infotaxis strategy [13], which can be formulated as a partially observed Markov decision process (POMDP) [29]. The elements of POMDP are (i) the information state, (ii) the set of admissible actions and (iii) the reward function. The information state at time tk1 is the posterior density piη0d1:k1; it accurately specifies the ith platform current knowledge about the source position and its release rate. Admissible actions can be formed with one or multiple steps ahead. A decision in the context of search is the selection of a motion control vector ukiU which will maximise the reward function. According to Section 2, the space of admissible actions U is continuous with dimensions: linear velocity V, angular velocity Ω and duration of motion T. In order to reduce the computational complexity of numerical optimisation, U is adopted as a discrete set with only myopic (one step ahead) controls. In addition, U is time-invariant and identical for all platforms. If V, O and T denote the sets of possible discrete-values of V, Ω and T, respectively, then U is the Cartesian product V×O×T . The myopic selection of the control vector at time tk on platform i is expressed as:


where D is the reward function and zki is the future concentration measurement collected by the ith platform if the platform moved under the control vU to position xkiyki. In reality, this future measurement is not available (the decision has to be made at time tk1), and therefore the expectation operator E with respect to the prior measurement PDF features in (13).

Previous studies of search strategies [3, 20] found that the reward function defined as the reduction of entropy, results in the most efficient search. Hence, we adopt the expected reward defined as


where Hk1 is the current differential entropy, defined as


while HkizkivHk1 is the future differential entropy (after a hypothetical control vector v has been applied to collect zki):


where dki=xkiykizki. The expectation operator E in (14) is with respect to the probability mass function Pzkid1:k1=zkiη0piη0d1:k1dη0, that is:


Given that piη0d1:k1 is approximated by a particle system Sk1i, one can approximately compute Hk1i, which features in (14), as


In order to compute EHkizkiv of (17), first note that Pzkid1:k1=Pzkiμ̂k1i, where μ̂k1i is the predicted mean rate of chemical particle encounters at location rki (where the platform i would move after applying a hypothetical control v), computed based on d1:k1. According to Section 2,


where the product κk1iϑk1m,i approximates the source release rate as the mean of the Gamma distribution with parameters κk1iϑk1m,i. Next, we find the value of zmax as the minimum value of z such that the cumulative distribution function z=0zPzμ̂k1i is greater than a certain threshold 1ε, where ε1. The summation (17) is then computed only for zki=0,1,,zmax. Computation of Hkzkiv is carried out according to (18), except that wk1m,i is replaced with wkm,i=wk1m,iPzkiμk1m,i, where


Thus, (17) is approximated with


Pseudo-code of the routine for the computation of control vector on platform i is given by Algorithm 1.

Algorithm 1 Computation of uki

1: Input:Sk1iwk1m,ir0,k1m,iκk1iϑk1m,i1mM,

2: Compute Hk1 using (18)

3: Create admissible set U=V×O×T

4: for every vUdo

5: Compute the future platform location rkiv

6: Compute μ̂k1i using (19)

7: Determine zmax s.t. z=0zmaxPzμ̂k1i>1ε

8: Compute EHkizkiv using (20)

9: Calculate the expected reward i using (14)

10: end for

11: Find uki using (13)

12: Output:uki

4.2 Cooperative control through consensus

So far, we have explained how platform i would independent of the other platforms in the formation determine the best action for itself, i.e., uki. In general, individual platforms will disagree on the best action, and in the extreme u1u2,uN. In order to maintain the shape of the formation during the motion period (from time tk1 to tk), the platforms need to reach an agreement on the common action uk, to be applied to all platforms at the same time. But this is not sufficient; according to the motion model in Section 2, the platforms also need to agree on the formation centroid coordinates and the common heading angle ϕk1 to be applied in (4).

We apply decentralised cooperative control based on the average consensus [30, 31]. In a network of collaborating agents, consensus is an iterative protocol designed to reach an agreement regarding a certain quantity of interest. Suppose that every platform, as a node in the communication network, initially has an individual scalar value. The goal of average consensus is for every node in the network to compute the average of initial scalar values, in a completely decentralised manner: by communicating only with the neighbours in the communication graph (without knowing the topology of the communication graph).

In the problem we consider, there is not only a single individual scalar value, but six of them. They include three motion control parameters, i.e., for platform i, Vki, Ωki and Tki, two formation centroid coordinates, i.e., x¯k1i, y¯k1i and the heading angle of each platform ϕk1i.

Let us denote the scalar value of interest by bi, that is


Ideally, we want every platform in the formation to compute the mean value b¯=1Ni=1Nbi. If all platforms in the formation were to use identical average values for motion control, centroid coordinates and heading, then their motion would be coordinated (except for process noise, which will be taken care of through vector Bk1i in (4)) and the shape of the formation would be maintained (provided Rmax is adequate).

Average consensus is an iterative algorithm. At iteration s=0, the node in the communication graph (the robot platform) will initialise its state bi0 using either a component of vector uki (if bi is a motion control parameter) or the platform pose θk1i (if bi is a formation centroid coordinate or heading angle). This value is locally available. The initial values of centroid coordinates are the actual ith platform coordinates, i.e., x¯k1i0=xk1i and y¯k1i0=yk1i. At each following iteration s=1,2,, each platform updates its state with a linear combination of its own state and the states of its current neighbours. Let us denote the set of current neighbours of platform i by Ji. Then [30]:


where Ji is the number of neighbours of platform i. This particular linear combination is based on the so-called maximum degree weights [32]. Other weights can be also used. It can be shown that if the communication graph is connected, the values bis after many iterations converge to the mean b¯ [32].

The search continues until the global stopping criterion is satisfied. The local stopping criterion is calculated on each platform independently based on the spread of the local positional particles r0,km,i, measured by the square-root of the trace of its sample covariance matrix Ck. For example, if the spread of particles on platform i is smaller than a certain threshold ϖ, then the local stopping criterion is satisfied and is given a value of one, otherwise it is zero. This local stopping criterion value (zero or one) becomes the initial state of the global stopping criterion on platform i, denoted σi0:


The global stopping criterion is computed on each platform using the average consensus algorithm, using (21), but with bi replaced by σi. After a sufficient number of iterations, S, platform i decides to stop the search if at least one of the platforms in the formation has reached the local stopping criterion, that is, if σiS>0.

We point out that both estimation and control are based on the consensus algorithm. While the cooperative control is using the average consensus (21), the decentralised measurement dissemination of Section 3 achieves the consensus on the set of measurements at time k. The consensus algorithm is iterative, and hence its convergence properties are very important. First note that, although the network topology changes with time (as the robots move while searching for the source), during the short interval of time when the exchange of information takes place, the topology can be considered as time-invariant. Furthermore, assuming bidirectional communication between the robots in formation, the network topology can be represented by an undirected graph. The convergence of the consensus algorithm for a time-invariant undirected communication topology is guaranteed if the graph is connected [31, 32, 33]. Note that this theoretical result is valid for an infinite number of iterations. In practice, if the communication graph at some point of time is not connected, or if an insufficient number of consensus iterations are performed, it may happen that one or more robots are lost (they could re-join the formation only by coincidence). This event, however, does not mean that the search mission has failed: the emitting source will be found eventually, albeit by a smaller formation in possibly longer interval of time.


5. Numerical results

The proposed search algorithm has been applied to an experimental dataset, collected by COANDA Research & Development Corporation using their large recirculating water channel. The emitting source was releasing fluorescent dye at a constant rate from a narrow tube. The dataset comprises a sequence of 340 frames of instantaneous concentration field measurements in the vertical plane and is sampled at every 10/23 s. The size of a frame is 49×49 pixels, where a pixel corresponds to a square area of 2.935×2.935mm2. As the size of the data is relatively small, we follow the approach used in [24]: upscale each frame by a factor of 3 using bicubic interpolation and place the result in the top left corner of a 500×500 search area. A measurement obtained by a platform is, thus, the integer value of the concentration of the dye taken from the closest spatial and temporal sample from the experimental data.

An example of the search algorithm running on the experimental data is shown in Figure 2. All physical quantities are in arbitrary units (a.u.). The following environmental/sensing parameters were used: D=1, τ=250, U=0, a=1 and t0=1. Algorithm parameters are selected as follows: κ0=3, ϑ0=5.2, number of particles M=252, V=1, O=321,0,1,2,3 degrees per unit of time and T=0.5,1,2,4,8,16,32,64. The number of iterations, both for the exchange of measurement triples and in the consensus algorithm, was fixed to 30. The local search stopping threshold was ϖ=3.

Figure 2.

Experimental dataset: an illustrative run of the decentralised multi-robot search using N= 7 platforms. Graphs (a)–(d) show the positions and trajectories of the platforms at step indices k= 0,12,22 and 32, respectively. The concentration of the plume is represented in grey-scale (darker colours represent higher concentration).

Figure 2 displays the top-down view of the search progress at step indices k=0,12,22,32. The formation consists of N=7 platforms, whose trajectories are shown in different colours. The search algorithm terminated at K=33. Note that the plume size is much smaller than the search area. Panels (a)–(c) of Figure 2 show the particles before resampling: the particles are placed on a regular grid, thus mimicking a grid-based approach, with the value of particle weights indicated by the grey-scale intensity plot (white means a zero weight). This provides a good visual representation of the posterior pr0d1:k. Panel (d) shows the situation after a non-zero concentration measurement was collected by the search team. The positional particles have been resampled at this point of time and moved closer to the true source location.

Using 200 Monte Carlo simulations, the mean search time for the algorithm was 2525 a.u., with a 5th and 95th quantile of 1840 and 3445 a.u., respectively. Note that in all simulations the formation started from the bottom right hand corner indicated in Figure 2(a).


6. Conclusions

The chapter presented a decentralised infotaxic search algorithm for a group of autonomous robotic platforms. The algorithm allows the platforms to search and locate a source of hazardous emissions in a coordinated manner without the need for a centralised fusion and control system. More precisely, this distributed coordination is achieved only by local exchange of measurement data between neighbouring platforms. Similarly, the movement decisions taken by the platforms were reached using a distributed average consensus algorithm over the whole formation. The key aspect is that individual platforms only require knowledge of their neighbours; the global knowledge of the communication network topology is unnecessary. An advantage of adopted distributed framework is that all platforms are treated equally, making the proposed search algorithm scalable and robust to the failure of a single platform. Numerical results using experimental data confirmed the robust performance of the algorithm. The main limitation of the algorithm is that the environmental parameters (such as diffusivity, the average direction and speed of the wind, particle lifetime), must be known. Future work will explore sensitivity to parametrisation and will aim to develop a team of “search and rescue” robots for further experimentation in realistic environments.



This research was supported in part by the Defence Science and Technology Group through its Strategic Research Initiative on Trusted Autonomous Systems.


  1. 1. Hutchinson M, Liu C, Chen W-H. Information-based search for an atmospheric release using a mobile robot: Algorithm and experiments. IEEE Transactions on Control Systems Technology. 2018;99:1-15
  2. 2. Ristic B, Morelande M, Gunatilaka A. Information driven search for point sources of gamma radiation. Signal Processing. 2010;90:1225-1239
  3. 3. Ristic B, Skvortsov A, Gunatilaka A. A study of cognitive strategies for an autonomous search. Information Fusion. 2016;28:1-9
  4. 4. Haley KB, Stone LD, editors. Search Theory and Applications. Nato Conference Series. New York: Plenum Press; 1980
  5. 5. Halford SE. How do site-specific DNA-binding proteins find their targets? Nucleic Acids Research. 2004;32(10):3040-3052
  6. 6. Fauchald P, Tveraa T. Using first-passage time in the analysis of area-restricted search and habitat selection. Ecology. 2003;84(2):282-288
  7. 7. Viswanathan GM, Afanasyev V, Buldyrev SV, Murphy EJ, Prince PA, Satnley HE. Levy flight search patterns of wandering albatrosses. Nature. 1996;381:413-415
  8. 8. Shlesinger MF. Mathematical physics: Search research. Nature. 2006;443(7109):281-282
  9. 9. Bayat B, Crasta N, Crespi A, Pascoal AM, Ijspeert A. Environmental monitoring using autonomous vehicles: A survey of recent searching techniques. Current Opinion in Biotechnology. 2017;45:76-84
  10. 10. Dunbabin M, Marques L. Robots for environmental monitoring: Significant advancements and applications. IEEE Robotics and Automation Magazine. 2012;19(1):24-39
  11. 11. Ishida H, Wada Y, Matsukura H. Chemical sensing in robotic applications: A review. IEEE Sensors Journal. 2012;12(11):3163-3173
  12. 12. Masson J-B, Bailly-Bachet M, Vergassola M. Chasing information to search in random environments. Journal of Physics A: Mathematical and Theoretical. 2009;42:434009
  13. 13. Vergassola M, Villermaux E, Shraiman BI. ‘Infotaxis’ as a strategy for searching without gradients. Nature. 2007;445(25):406-409
  14. 14. Barbieri C, Cocco S, Monasson R. On the trajectories and performance of infotaxis, an information-based greedy search algorithm. Europhysics Letters. 2011;94(2):20005
  15. 15. Fatès N. Collective infotaxis with reactive amoebae: A note on a simple bio-inspired mechanism. In: International Conference on Cellular Automata; 2016. pp. 157-165
  16. 16. Hajieghrary H, Ani Hsieh M, Schwartz IB. Multi-agent search for source localization in a turbulent medium. Physics Letters A. 2016;380(20):1698-1705
  17. 17. Hajieghrary H, Tomas AF, Hsieh MA. An information theoretic source seeking strategy for plume tracking in 3D turbulent fields. In: Proceedings of the IEEE International Symposium on Safety, Security, and Rescue Robotics (SSRR); 2015. pp. 1-8
  18. 18. Hein AM, McKinley SA. Sensing and decision-making in random search. Proceedings of the National Academy of Sciences of the United States of America. 2012;109:12070-12074
  19. 19. Hutchinson M, Oh H, Chen W-H. Entrotaxis as a strategy for autonomous search and source reconstruction in turbulent conditions. Information Fusion. 2018;42:179-189
  20. 20. Masson J-B. Olfactory searches with limited space perception. Proceedings of the National Academy of Sciences of the United States of America. 2013;110(28):11261-11266
  21. 21. Moraud EM, Martinez D. Effectiveness and robustness of robot infotaxis for searching in dilute conditions. Frontiers in Neurorobotics. 2010;4:1-8
  22. 22. Ristic B, Skvortsov A, Walker A. Autonomous search for a diffusive source in an unknown structured environment. Entropy. 2014;16(2):789-813
  23. 23. Voges N, Chaffiol A, Lucas P, Martinez D. Reactive searching and infotaxis in odor source localization. PLoS Computational Biology. 2014;10(10):e1003861
  24. 24. Ristic B, Angley D, Moran B, Palmer JL. Autonomous multi-robot search for a hazardous source in a turbulent environment. Sensors. 2017;17(4):918
  25. 25. Hlinka O, Hlawatsch F, Djuric PM. Distributed particle filtering in agent networks: A survey, classification, and comparison. IEEE Signal Processing Magazine. 2013;30(1):61-81
  26. 26. Doucet A, De Freitas N, Murphy K, Russell S. Rao-Blackwellised particle filtering for dynamic Bayesian networks. In: Proceedings of the Sixteenth Conference on Uncertainty in Artificial Intelligence; 2000. pp. 176-183
  27. 27. Ristic B, Gunatilaka A, Wang Y. Rao–Blackwell dimension reduction applied to hazardous source parameter estimation. Signal Processing. 2017;132:177-182
  28. 28. Gelman A, Carlin JB, Stern HS, Rubin DB. Bayesian Data Analysis. 2nd ed. Boca Raton FL: CRC Press; 2004
  29. 29. Chong EKP, Kreucher C, Hero AO. Chapter 8. POMDP approximation using simulation and heuristics. In: Hero AO, Castanon D, Cochran D, Kastella K, editors. Foundations and Applications of Sensor Management. New York: Springer; 2008
  30. 30. Xiao L, Boyd S. Fast linear iterations for distributed averaging. Systems & Control Letters. 2004;53(1):65-78
  31. 31. Ren W, Beard RW, Atkins EM. Information consensus in multivehicle cooperative control. IEEE Control Systems. 2007;27(2):71-82
  32. 32. Xiao L, Boyd S, Lall S. A scheme for robust distributed sensor fusion based on average consensus. In: Proceedings of the 4th International Symposium on Information Processing in Sensor Networks; 2005. p. 9
  33. 33. Olfati-Saber R, Fax JA, Murray RM. Consensus and cooperation in networked multi-agent systems. Proceedings of the IEEE. 2007;95(1):215-233


  • Robot locations are assumed to be non-coincidental with the source location r0.
  • Because the measurement locations are assumed to be known exactly, they will not be treated as random variables.

Written By

Branko Ristic and Christopher Gilliam

Reviewed: 26 April 2019 Published: 04 June 2019