## Abstract

This chapter addresses autonomous coordinated search by an interconnected group of moving robots for the purpose of finding and localising a source of hazardous emissions (e.g., gas and particles). Dispersion of the emitted substance is assumed to be affected by turbulence, resulting in the absence of concentration gradients. The chapter proposes a search strategy that operates in a completely decentralised manner, as long as the communication network of the moving robots forms a connected graph. By decentralised operation, we mean that each moving robot reasons (i.e., estimates the source location and makes decisions on its own motion) locally. Coordination of the group is achieved by consensus via communication with neighbours only, in a manner that does not require global knowledge of the communication network topology.

### Keywords

- autonomous search
- machine intelligence
- sequential Monte Carlo estimation
- infotaxis

## 1. Introduction

Search strategies for finding targets using appropriate sensing modalities are of great importance in many aspects of life. In the context of national security, there could be a need to find a source of hazardous emissions [1, 2, 3]. Similarly, rescue and recovery missions may be tasked with localising a lost piece of equipment that is emitting weak signals [4]. Biological applications include, for example, a protein searching for its specific target site on DNA [5], or the foraging behaviour of animals in their search for food or a mate [6, 7]. The objective of search research [8] is to develop optimal strategies for localising a target in the shortest time (on average), for a given search volume and sensing characteristics.

The use of autonomous vehicles in dangerous missions, such as finding a source of hazardous emissions, has become widespread [9, 10, 11]. Existing approaches to search and localisation in the context of atmospheric releases can be loosely divided into three categories: up-flow motion methods, concentration gradient-based methods and information gain-based methods, also known as *infotaxis*. Both the up-flow motion methods and the concentration gradient methods are simple, in the sense that they require only a limited level of spatial perception [12]. Their limitations become apparent in turbulent flows, where concentration gradients are absent and the plume typically consists of time-varying disconnected patches. The information gain-based methods [13] have been developed specifically for searching in turbulent flows. In the absence of a smooth distribution of concentration (e.g., due to turbulence), this strategy directs the searching robot(s) towards the highest information gain. As a theoretically principled approach, where the source-parameter estimation is carried out in the Bayesian framework and the searching platform motion control is based on information-theoretic principles, the infotaxic (or cognitive) search strategies have attracted a great deal of interest [3, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23].

This chapter summarises our recent results in the development of an autonomous infotaxic coordinated search strategy for a group of robots searching for an emitting hazardous source in open terrain under turbulent conditions. The assumption is that the search platforms can move and sense. Two types of sensor measurements are collected sequentially: (a) the concentration of the hazardous substance; (b) the platform location within the search domain. Due to the turbulent transport of the emitted substance, the concentration measurements are typically sporadic and fluctuating. The searching platforms form a moving sensor network, thus enabling the exchange of data and cooperative behaviour. Multi-robot infotaxis has already been studied in [16, 17, 20, 24]. However, all of these references assumed an *all-to-all* (i.e., fully connected) communication network with *centralised fusion and control* of the searching group.

We develop an approach where the group of searching robots operates in a fully decentralised coordinated manner. Decentralised operation means that each searching robot performs the computations (i.e., source estimation and path planning) locally and independently of other platforms. Having a common task, however, the robotic platforms must perform in a coordinated manner. This coordination is achieved by exchanging data with immediate neighbours only, in a manner that does not require global knowledge of the communication network topology. For this reason, the proposed approach is scalable in the sense that the complexities of sensing, communication, and computing per sensor platform are independent of the sensor network size. In addition, because all sensor platforms are treated equally (no leader-follower hierarchy), this approach is robust to the failure of any of the searching agents. The only requirement for avoiding the break-up of the searching formation is that the communication graph of the sensor network remains *connected* at all times. Source-parameter estimation is carried out sequentially, and on each platform independently, using a Rao-Blackwellised particle filter. Platform path planning, in the spirit of *infotaxis*, is based on entropy reduction and is also carried out independently on every platform.

## 2. Mathematical models

First, we describe the measurement model. The concentration measurements are modelled using a Lagrange encounters model developed in [13], based on an open field assumption and a two-dimensional geometry. Let *average* wind characteristics be the speed ^{1}

where

The probability that a sensor at location

Parameter

The motion model of a coordinated group of robots is described next. Let the pose vector of the

For each platform

The measurements of concentration are taken at time instants *travel time*) for the

Motion of the

where vector

Here, *known*

A robotic platform can communicate with another platform of the formation, if their mutual distance is smaller than a certain range
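The range-limited communication rule and the connectivity requirement can be illustrated with a minimal sketch. It assumes two-dimensional platform positions and a single common communication range; the names `build_neighbours`, `is_connected` and `comm_range` are hypothetical and do not come from the chapter.

```python
import math
from collections import deque


def build_neighbours(positions, comm_range):
    """Return {platform index: set of neighbour indices}.

    Two platforms are neighbours when their mutual distance is smaller
    than comm_range (assumed identical for all platforms).
    """
    n = len(positions)
    neighbours = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) < comm_range:
                neighbours[i].add(j)
                neighbours[j].add(i)
    return neighbours


def is_connected(neighbours):
    """Breadth-first search: True if every platform is reachable from platform 0."""
    if not neighbours:
        return True
    seen = {0}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        for j in neighbours[i]:
            if j not in seen:
                seen.add(j)
                queue.append(j)
    return len(seen) == len(neighbours)


# Three platforms in a line form a connected chain; a distant fourth does not.
chain = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
chain_ok = is_connected(build_neighbours(chain, comm_range=1.5))
split_ok = is_connected(build_neighbours(chain + [(10.0, 10.0)], comm_range=1.5))
```

Note that this check uses all positions at once and is therefore only a diagnostic for simulation; the platforms themselves are not assumed to know the global topology.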

## 3. Decentralised sequential estimation

Estimation and robot motion control are carried out using the measurement dissemination-based decentralised fusion architecture [25]. Measurement locations^{2} and the corresponding measured concentration values, i.e., the triple

Suppose the posterior density function of the source at discrete-time

where

where

is independent of

where the posterior of source strength

Since the conjugate prior of the Poisson distribution is the Gamma distribution [28], the posterior

The parameters of the prior for source strength,

Next, we turn our attention to the posterior of source position

where

The Rao-Blackwellised particle filter (RBPF) fully describes the posterior

Here,

## 4. Decentralised formation control

In decentralised multi-robot search, each platform autonomously makes a decision at time

### 4.1. Selection of individual control vectors

A robot platform

where

Previous studies of search strategies [3, 20] found that a reward function defined as the *reduction of entropy* results in the most efficient search. Hence, we adopt the expected reward defined as

where

while

where

Given that

In order to compute

where the product

Thus, (17) is approximated with

Pseudo-code of the routine for the computation of control vector on platform

Algorithm 1 Computation of

1: **Input:**

2: Compute

3: Create admissible set

4: **for** every **do**

5: Compute the future platform location

6: Compute

7: Determine

8: Compute

9: Calculate the expected reward

10: **end for**

11: Find

12: **Output:**
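The loop structure of Algorithm 1 can be sketched in simplified form. The sketch makes several assumptions that are not taken from the chapter: the source strength is treated as known (so the measurement likelihood is plain Poisson rather than the marginalised form used by the Rao-Blackwellised filter), controls are plain displacement vectors, and `toy_mean_count` is a placeholder rate function, not the encounter model of [13]. All function names are hypothetical.

```python
import math


def poisson_pmf(z, mu):
    """Poisson probability of count z with mean mu."""
    if mu <= 0.0:
        return 1.0 if z == 0 else 0.0
    return math.exp(-mu + z * math.log(mu) - math.lgamma(z + 1))


def entropy(weights):
    """Shannon entropy of a normalised discrete weight vector."""
    return -sum(w * math.log(w) for w in weights if w > 0.0)


def expected_entropy_reduction(particles, weights, location, mean_count, z_max=30):
    """Current entropy minus expected posterior entropy at a future location.

    The expectation over the future count is truncated at z_max.
    """
    rates = [mean_count(location, p) for p in particles]
    expected_h = 0.0
    for z in range(z_max + 1):
        joint = [w * poisson_pmf(z, mu) for w, mu in zip(weights, rates)]
        p_z = sum(joint)  # predictive probability of observing z
        if p_z > 0.0:
            expected_h += p_z * entropy([j / p_z for j in joint])
    return entropy(weights) - expected_h


def select_control(position, controls, particles, weights, mean_count):
    """Return (best control, its expected reward) over an admissible set."""
    best_u, best_reward = None, -math.inf
    for u in controls:
        future = (position[0] + u[0], position[1] + u[1])
        reward = expected_entropy_reduction(particles, weights, future, mean_count)
        if reward > best_reward:
            best_u, best_reward = u, reward
    return best_u, best_reward


def toy_mean_count(location, source):
    """Placeholder mean count that decays with distance (NOT the model of [13])."""
    return 5.0 * math.exp(-math.dist(location, source))


# Two equally likely source hypotheses; only the last move breaks the symmetry.
particles = [(0.0, 0.0), (4.0, 0.0)]
weights = [0.5, 0.5]
controls = [(0.0, 0.0), (0.0, -3.0), (-2.0, -3.0)]
best_u, best_reward = select_control((2.0, 3.0), controls, particles, weights, toy_mean_count)
```

In this toy setting the first two controls keep the platform equidistant from both hypotheses, so a measurement there cannot discriminate between them, whereas the third control moves the platform onto one hypothesis and yields a clearly positive expected reward.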

### 4.2. Cooperative control through consensus

So far, we have explained how platform

We apply decentralised cooperative control based on the average consensus [30, 31]. In a network of collaborating agents, consensus is an iterative protocol designed to reach an agreement regarding a certain quantity of interest. Suppose that every platform, as a node in the communication network, initially has an individual scalar value. The goal of average consensus is for every node in the network to compute the average of initial scalar values, in a completely decentralised manner: by communicating only with the neighbours in the communication graph (without knowing the topology of the communication graph).

In the problem we consider, there is not only a single individual scalar value, but six of them. They include three motion control parameters, i.e., for platform

Let us denote the scalar value of interest by

Ideally, we want every platform in the formation to compute the mean value

Average consensus is an iterative algorithm. At iteration

where *maximum degree weights* [32]. Other weights can be also used. It can be shown that if the communication graph is connected, the values
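The iteration can be illustrated with a minimal sketch of average consensus on an undirected graph. It assumes a constant edge weight of 1/N, with N the number of platforms, which is one common form of maximum degree weights; the exact weights intended in (21) may differ, and the function and variable names are hypothetical.

```python
def average_consensus(values, neighbours, iterations):
    """Run synchronous average-consensus iterations.

    values      initial scalar held by each platform
    neighbours  {platform index: set of neighbour indices}, undirected
    Each platform only uses its own value and those of its neighbours.
    The edge weight 1/N is an assumption (N = number of platforms).
    """
    n = len(values)
    weight = 1.0 / n
    x = list(values)
    for _ in range(iterations):
        # All platforms update simultaneously from the previous iterate.
        x = [
            x[i] + weight * sum(x[j] - x[i] for j in neighbours[i])
            for i in range(n)
        ]
    return x


# Four platforms connected in a line; the true average of the values is 4.0.
line_graph = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
initial = [1.0, 2.0, 3.0, 10.0]
after_few = average_consensus(initial, line_graph, iterations=5)
after_many = average_consensus(initial, line_graph, iterations=200)
```

With symmetric weights the sum of the values is preserved at every iteration, so the platforms can only agree on the true average. After a small number of iterations the local values still differ, which is the practical situation in which platforms may drift apart.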

The search continues until the global stopping criterion is satisfied. The local stopping criterion is calculated on each platform independently based on the spread of the local positional particles

The global stopping criterion is computed on each platform using the average consensus algorithm, using (21), but with

We point out that both estimation and control are based on the consensus algorithm. While the cooperative control uses the *average* consensus (21), the decentralised measurement dissemination of Section 3 achieves the consensus on the set of measurements at time *time-invariant*. Furthermore, assuming bidirectional communication between the robots in formation, the network topology can be represented by an undirected graph. The convergence of the consensus algorithm for a time-invariant undirected communication topology is guaranteed if the graph is connected [31, 32, 33]. Note that this theoretical result is valid for an infinite number of iterations. In practice, if the communication graph is not connected at some point in time, or if an insufficient number of consensus iterations is performed, one or more robots may be lost (they could re-join the formation only by coincidence). This event, however, does not mean that the search mission has failed: the emitting source will be found eventually, albeit by a smaller formation and possibly over a longer interval of time.

## 5. Numerical results

The proposed search algorithm has been applied to an experimental dataset, collected by COANDA Research & Development Corporation using their large recirculating water channel. The emitting source was releasing fluorescent dye at a constant rate from a narrow tube. The dataset comprises a sequence of 340 frames of instantaneous concentration field measurements in the vertical plane, sampled every 10/23 s. The size of a frame is

An example of the search algorithm running on the experimental data is shown in Figure 2. All physical quantities are in arbitrary units (a.u.). The following environmental/sensing parameters were used:

Figure 2 displays the top-down view of the search progress at step indices **(a)–(c)** of Figure 2 show the particles before resampling: the particles are placed on a regular grid, thus mimicking a grid-based approach, with the value of particle weights indicated by the grey-scale intensity plot (white means a zero weight). This provides a good visual representation of the posterior **(d)** shows the situation after a non-zero concentration measurement was collected by the search team. The positional particles have been resampled at this point of time and moved closer to the true source location.

Over 200 Monte Carlo simulations, the mean search time for the algorithm was 2525 a.u., with 5th and 95th percentiles of 1840 and 3445 a.u., respectively. Note that in all simulations the formation started from the bottom right-hand corner indicated in Figure 2(a).

## 6. Conclusions

The chapter presented a decentralised infotaxic search algorithm for a group of autonomous robotic platforms. The algorithm allows the platforms to search for and locate a source of hazardous emissions in a coordinated manner without the need for a centralised fusion and control system. More precisely, this distributed coordination is achieved only by local exchange of measurement data between neighbouring platforms. Similarly, the movement decisions taken by the platforms are reached using a distributed average consensus algorithm over the whole formation. The key aspect is that individual platforms only require knowledge of their neighbours; global knowledge of the communication network topology is unnecessary. An advantage of the adopted distributed framework is that all platforms are treated equally, making the proposed search algorithm scalable and robust to the failure of a single platform. Numerical results using experimental data confirmed the robust performance of the algorithm. The main limitation of the algorithm is that the environmental parameters (such as diffusivity, the average direction and speed of the wind, and particle lifetime) must be known. Future work will explore sensitivity to parametrisation and will aim to develop a team of “search and rescue” robots for further experimentation in realistic environments.

## Acknowledgments

This research was supported in part by the Defence Science and Technology Group through its Strategic Research Initiative on Trusted Autonomous Systems.

## Notes

- Robot locations are assumed to be non-coincidental with the source location r₀.
- Because the measurement locations are assumed to be known exactly, they will not be treated as random variables.