Open access peer-reviewed chapter

Bayesian Networks for Decision Support in Emergency Response: A Model for Missing Person Investigations

Written By

Denis Reilly

Submitted: 18 October 2021 Reviewed: 25 April 2022 Published: 28 June 2022

DOI: 10.5772/intechopen.105047

From the Edited Volume

Contemporary Issues in Information Systems - A Global Perspective

Edited by Denis Reilly


Abstract

The successful operation of Emergency services (Police, Fire, Medical Emergency) relies heavily upon Information Systems, particularly Decision Support Systems. Missing person cases place a further burden on the already overstretched resources of Police Forces. Such cases predominantly involve at-risk groups such as children in care, people suffering from depression, or elderly people suffering from dementia. This chapter reviews current practices used for missing person cases and describes a decision support model based on Bayesian networks.

Keywords

  • Bayesian networks
  • algorithms
  • missing persons
  • decision support
  • probability

1. Introduction

Emergency services face increasing demands in their efforts to keep the public safe and healthy. They typically utilize a variety of information systems to store data relating to previous incidents and recall data to assist with new incidents and tasks such as capacity planning. One particular class of information systems that plays a vital role in emergency service operations is decision support systems. Such systems combine data and logic with rules and heuristics to allow operational decisions to be made based on the domain knowledge embodied within the system. Typical examples of such systems are utilized by Police Forces in the development of search strategies for locating missing persons.

1.1 Missing persons

According to National Guidelines set out for UK Police Forces, a missing person is defined as:

‘Anyone whose whereabouts cannot be established and where the circumstances are out of character or the context suggests the person may be subject of crime or at risk of harm to themselves or another’.

When someone is categorized as missing, the police will investigate their disappearance and try to find and safeguard them.

In 2013 the Guidelines introduced a second absent person category, defined as:

‘A person not at a place where they are expected or required to be’ and perceived to be ‘not at any apparent risk’.

When someone is categorized as absent, no police response is required except to monitor and review the situation.

Typically absent cases involve individuals who go missing frequently (often referred to as frequent fliers). They are likely to be designated a missing person for the first few times that they are missing, but, if they return unharmed, thereafter they may be designated absent.

Missing person (misper) cases are both time-consuming and resource-intensive, particularly in urban areas. Figure 1 highlights the scale of misper cases facing UK Police Forces and the volume of calls generated from dealing with such cases.

Figure 1.

Key statistics for missing persons in the UK, 2016–2017.

Mispers come from a spectrum of the population. Many are children (teenagers) who go missing from care homes; others are adults with mental illness or depression. Cases also include elderly people suffering from dementia-related conditions. Murder (homicide) cases, manslaughter cases and death by misadventure often start out as misper cases until a body is located. Current practice relies on heuristics and localized domain knowledge. Social scientists will often interview mispers in the hope of eliciting knowledge in relation to mispers’ intentions while they were missing. Typically, Police rely heavily on historical data and behavioral patterns. For example, many teenagers who go missing are found in local parks, where teenagers are known to congregate. Elderly people suffering from dementia may travel to a location associated with their past.

1.2 Bayesian networks

Many problems in machine learning are solved using supervised learning techniques, in which labeled training patterns are presented to the model. Supervised learning is often the preferred approach, and powerful models such as Neural Networks and Support Vector Machines (SVMs) are available to implement it. However, there are many cases where supervised learning is not applicable, typically when there is not one target variable of interest but many, or when different variables might be available or missing for each data point. Examples include diagnosis in medicine, with many different types of diseases, symptoms, and context information available for a given patient. Predicting the location of a missing person, or the distance that they may have traveled, poses similar problems to those found in medicine.

Bayesian networks (BNs) can deal with such challenges. BNs are a popular choice for probabilistic reasoning and machine learning problems that are difficult to address with supervised learning techniques. BNs are undergoing a renaissance amongst the machine learning community as an effective probabilistic model that can be used to assist decision support. Implemented as a graphical network and supported by libraries in Python and R (e.g. bnlearn and pomegranate), they allow probabilistic inference to answer ‘what if?’ and ‘which is best?’ questions. In addition to medicine, BNs have been successfully applied to genetics, search and rescue (SAR) and general classification problems.

The particular strengths of BNs may be summarized as:

  • They provide a natural way to handle missing data

  • Suitable for small and incomplete datasets

  • Combine different sources of knowledge

  • Explicit treatment of uncertainty and support for decision analysis

  • Fast response to queries

The main drawback of BNs is their difficulty in dealing with continuous data, which generally needs to be discretized. Section 3 outlines the underlying BN theory and Section 4 describes the application of BNs to develop a model to assist the decision-making process for misper cases. A more detailed consideration of the approach is described in [1].
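The discretization step mentioned above can be sketched in a few lines of Python. This is a minimal illustration only: the band edges, the labels and the function name `discretize` are hypothetical and are not taken from iFIND, the Grampian Study or the Misper-Bayes model.

```python
from bisect import bisect_right

# Hypothetical band edges in kilometres (placeholders, not case data).
EDGES_KM = [1.0, 5.0, 20.0]
LABELS = ["<1 km", "1-5 km", "5-20 km", ">=20 km"]

def discretize(distance_km: float) -> str:
    """Map a continuous distance onto one of the discrete bands."""
    # bisect_right counts how many edges are <= the value, which is the band index.
    return LABELS[bisect_right(EDGES_KM, distance_km)]

bands = [discretize(d) for d in (0.4, 1.0, 3.2, 48.0)]
print(bands)
```

Each continuous measurement is replaced by the label of the band it falls into, so a node such as distance traveled can then be described by an ordinary conditional probability table. The choice of band edges affects the model and would normally be guided by domain knowledge.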


2. Current approaches for missing person searches

The sections below review the main research approaches for dealing with missing person cases, which range from empirical techniques to formalized approaches.

2.1 Bayesian networks for search and rescue

There is some notable research concerned with the use of BNs for Search and Rescue (SAR), in relation to people and objects that have either become lost or gone missing by accident. The distinction, of course, is that at-risk misper groups have gone missing intentionally, whilst SAR cases deal with entities that have become lost unintentionally. The most notable use of Bayesian inference for search was the search for Air France Flight AF 447, which crashed into the Atlantic on 1st June 2009 [2]. After 2 years of unsuccessful searching, the team used a Bayesian procedure developed for search planning to produce the posterior target location distribution. The distribution was used to guide the search and the wreckage was located within a week.

Reference [3] describes a Bayesian approach to modeling lost person behaviors based on terrain features in Wilderness Search and Rescue. The approach uses a first-order Markov transition matrix for generating a temporal, posterior predictive probability distribution map. The approach also uses a Bayesian χ² test for goodness-of-fit and goes on to show that the model closely fits a synthetic dataset. Reference [4] provides a study of missing person behavior in Australia, conducted by Victoria Police. The study, which is part of the SARBayes project, considers a large dataset of parameters, some of which have more significance than others. Terrain plays an important role, and the range of activities relating to the missing person is also considered (e.g. climbing, canoeing, hunting).

2.2 Machine learning and formal approaches

There are also several other machine learning-related approaches for dealing with missing person cases. For example, [5] compares the use of neural networks and rule-based systems for missing person cases in Australia. Later work [6] considers the use of J48, which is based on the popular C4.5 decision tree generator, to derive rules.

In the author’s previous work [7], a missing person model was developed based on Situation Calculus. The approach represented the state changes that take place over time while a person is missing. The formalisms help to provide a consistent means to represent the uncertainty present in such investigations.

2.3 Empirical approaches

Two widely accepted empirical approaches are those of the UK booklet ‘Missing Persons: Understanding, Planning and Responding’ (colloquially referred to as the Grampian Study) [8] and the iFIND System [9], which is currently used by a number of UK Police forces.

The Grampian Study considers a similar set of at-risk groups to that of the author’s work. For each group the study provides a number of tables, which portray useful information, such as likely time periods of missing, distance traveled and likely places where a misper could be found. The Grampian Study also translates data into useful search ranges that can be superimposed on a map (Figure 2).

Figure 2.

Grampian search profiles for 5–8 year olds.

iFIND follows a similar structure but is based on more recent data to provide more thorough coverage. iFIND provides more detail in terms of possible locations. Both Grampian and iFIND place emphasis on Time, Distance and Likely Location, and these parameters also feature prominently in the author’s work. Figure 3 shows a typical excerpt from iFIND in the form of a table, which highlights the places where mispers for the category were located. The majority are found outside locally, with a smaller proportion either returning home or being found at a friend’s house.

Figure 3.

iFIND table of likely locations for 5–8 year olds.

2.4 Geospatial approaches

Other research conducted by the author led to the development of the CASPER System (Computer Assisted Search Prioritization and Environmental Response) [10] to study the Geographies of Missing Persons. CASPER used primary and applied research and secondary data analysis to develop a Google map application to assist investigative and strategic decision-making. CASPER was developed to a prototype stage and demonstrated to several Police forces as a viable alternative to the existing case management systems COMPACT [11] and NICHE [12].

CASPER (Figure 4) was rich in terms of the geospatial information it provided, being able to display heatmaps, places of interest and even live CCTV footage. CASPER allowed the search team to overlay a range of different layers onto a map region of interest. For example, the team could choose to overlay information on ATMs if it was known that a misper was short of money. Alternatively, suicide hotspots could be overlaid (from precompiled suicide data) when dealing with a potential suicide case.

Figure 4.

CASPER missing persons prototype.

However, the algorithms used in CASPER were largely rule-based, developed from ‘people like you’ approaches, based on if-elseif-else structures.


3. Bayesian network theory

BNs are directed graphical models that have been used extensively in the fields of cognitive science and artificial intelligence throughout the latter half of the 20th and early 21st centuries. The models are based on the theorem of Thomas Bayes [13], which allows probabilities to be updated in light of new evidence. BNs have been used for some time within the AI community and more recently amongst the machine learning community. Reference [14] provides an excellent account of how probability theory and decision theory began to attract the attention of the AI community in the late 1980s, which, when combined with graph theory, led to what we refer to today as Bayesian Networks.
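For reference, the theorem of Bayes mentioned above can be stated in its standard form as follows. The symbols $H$ (a hypothesis) and $O$ (an observation) are introduced here purely for illustration and are not part of the chapter's notation.

$$P(H \mid O) = \frac{P(O \mid H)\, P(H)}{P(O)}$$

Here $P(H)$ is the prior belief in the hypothesis, $P(O \mid H)$ is the likelihood of the observation under that hypothesis, and $P(H \mid O)$ is the updated (posterior) belief once the observation is known.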

Formally, for a set of discrete random variables $X = \{X_1, \ldots, X_n\}$, a BN is an annotated directed acyclic graph, which encodes a joint probability distribution (JPD) over $X$. A BN can be expressed as the pair $N = \langle G, \Theta \rangle$. The first element of $N$ is a directed acyclic graph, $G = (V, E)$, in which $V$ denotes the random variables in $X$ and $E$ denotes the edges, which represent direct dependencies between the variables. The second element, $\Theta$, denotes the set of parameters that quantify the network via conditional probability tables. Each node is annotated with a conditional probability distribution, $P(X_i \mid Pa(X_i))$, representing the conditional probability of the node $X_i$ given its parents $Pa(X_i)$ in $G$. The network $N$ defines a unique JPD over $X$ given by:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i)) \tag{1}$$

In a BN, a conditional probability P(X | Y) is the probability of an event X occurring given that Y occurs. A marginal probability is effectively an unconditional probability: it is the distribution over a subset of the variables, formed by summing a larger probability distribution over the remaining variables. For example, given a JPD P(X, Y), the marginal P(X) is obtained by summing the entries of the joint table over all values of Y, separately for X = False and X = True. When a node is queried in a BN, the result is often referred to as the marginal for that node.
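The marginalization described above can be illustrated with a small joint table. The sketch below assumes two Boolean variables X and Y; the probabilities and the function name `marginal_x` are hypothetical and chosen only so that the table sums to one.

```python
# Hypothetical joint distribution P(X, Y) over two Boolean variables.
joint = {
    (True, True): 0.30,
    (True, False): 0.10,
    (False, True): 0.20,
    (False, False): 0.40,
}

def marginal_x(joint_table):
    """Sum out Y to obtain the marginal distribution P(X)."""
    p_x = {}
    for (x, _y), p in joint_table.items():
        p_x[x] = p_x.get(x, 0.0) + p
    return p_x

p_x = marginal_x(joint)

# Conditional P(X = True | Y = True): joint entry divided by the marginal of Y.
p_y_true = joint[(True, True)] + joint[(False, True)]
p_x_given_y = joint[(True, True)] / p_y_true
print(p_x, p_x_given_y)
```

The last two lines also show how a conditional probability follows from the same table: the relevant joint entry is divided by the marginal probability of the conditioning event.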

For BNs, inference is the computational method for deriving answers to queries given a probability model expressed as a BN. Inference in BNs can take several different forms [15, 16]; broadly speaking, it may be exact or approximate, depending upon the structure of the graph. Exact inference is not always feasible when the number of combinations and paths is excessively large. However, it is often possible to refactor a BN graph (i.e. alter the graph structure) before resorting to approximate inference.

Let $U$ be the set of random variables. Let $U_e \subseteq U$ be the set of known (evidence) variables. Let $X_q \subseteq U \setminus U_e$ be the variables of interest (queries) and let $U_r = U \setminus (U_e \cup X_q)$ be the set of remaining variables.

The joint distribution of the query variables and the evidence variables can be calculated via marginalization as:

$$P(X_q, U_e) = \sum_{U_r} P(X_1, \ldots, X_n) \tag{2}$$

The normalization may be calculated as:

$$P(U_e) = \sum_{X_q} P(X_q, U_e) \tag{3}$$

Then conditional probabilities may be calculated as:

$$P(X_q \mid U_e) = \frac{P(X_q, U_e)}{P(U_e)} \tag{4}$$
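The three steps above (marginalize, normalize, divide) amount to inference by enumeration. The following is a minimal sketch on a hypothetical three-node chain A -> B -> C with Boolean variables. The network, its probabilities and the function names are assumptions made for illustration; enumeration of this kind is only practical for very small networks.

```python
from itertools import product

# Hypothetical chain A -> B -> C; all numbers are placeholders.
P_A = {True: 0.2, False: 0.8}
P_B_GIVEN_A = {True: {True: 0.9, False: 0.1}, False: {True: 0.3, False: 0.7}}
P_C_GIVEN_B = {True: {True: 0.6, False: 0.4}, False: {True: 0.1, False: 0.9}}
VARIABLES = ("A", "B", "C")

def joint(a, b, c):
    """Eq. (1): product of each node's probability given its parents."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]

def query(target, evidence):
    """Return P(target | evidence) by enumerating the full joint table."""
    unnormalized = {True: 0.0, False: 0.0}
    for values in product((True, False), repeat=len(VARIABLES)):
        world = dict(zip(VARIABLES, values))
        if all(world[name] == value for name, value in evidence.items()):
            # Eq. (2): adding up consistent rows sums out the remaining variables.
            unnormalized[world[target]] += joint(world["A"], world["B"], world["C"])
    total = sum(unnormalized.values())  # Eq. (3): probability of the evidence
    return {value: p / total for value, p in unnormalized.items()}  # Eq. (4)

posterior_a = query("A", {"C": True})
print(posterior_a)
```

The call `query("A", {"C": True})` treats C as the evidence variable, A as the query variable and B as the remaining variable that is summed out.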

A range of different questions can be asked, via inference, in relation to the probability distribution, such as:

  • Diagnosis: P(X = cause | U = symptom)

  • Prediction: P(X = symptom | U = cause)

  • Classification: max over class of P(X = class | U = data)

  • Decision-making (given a cost function)
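The four kinds of question listed above can be contrasted on a toy cause/symptom model. The sketch below is illustrative only: the causes `c1` to `c3`, the actions `a1` and `a2`, and every probability and cost are hypothetical placeholders rather than values from any policing dataset.

```python
# Hypothetical prior over three causes and the likelihood of one symptom
# under each cause; the numbers are placeholders, not case data.
P_CAUSE = {"c1": 0.5, "c2": 0.3, "c3": 0.2}
P_SYMPTOM_GIVEN_CAUSE = {"c1": 0.1, "c2": 0.6, "c3": 0.8}

# Hypothetical cost of taking each action when each cause is the true one.
COST = {
    "a1": {"c1": 1.0, "c2": 1.0, "c3": 1.0},
    "a2": {"c1": 0.0, "c2": 0.0, "c3": 10.0},
}

def predict(cause):
    """Prediction: P(symptom | cause), read directly from the table."""
    return P_SYMPTOM_GIVEN_CAUSE[cause]

def diagnose():
    """Diagnosis: P(cause | symptom observed), obtained via Bayes' theorem."""
    unnormalized = {c: P_SYMPTOM_GIVEN_CAUSE[c] * P_CAUSE[c] for c in P_CAUSE}
    evidence = sum(unnormalized.values())
    return {c: p / evidence for c, p in unnormalized.items()}

def classify():
    """Classification: the cause with the highest posterior probability."""
    posterior = diagnose()
    return max(posterior, key=posterior.get)

def decide():
    """Decision-making: the action with the lowest expected cost."""
    posterior = diagnose()
    expected = {
        action: sum(posterior[c] * cost for c, cost in costs.items())
        for action, costs in COST.items()
    }
    return min(expected, key=expected.get)

print(predict("c2"), diagnose(), classify(), decide())
```

With these placeholder numbers the most probable cause is `c2`, yet the lowest-cost action is `a1`, because the expensive outcome under `c3` still carries substantial posterior probability. This is the sense in which decision-making goes beyond classification.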

The different questions may be considered in relation to how classical statistical models help to estimate either the value of something (a regression framework) or the state of something (a classification framework). Prediction and classification would be as per the classical statistical model framework. Diagnosis adds to this by defining the outcomes from the model in organizational terms (e.g. Policing terms, such as high priority cases). Decision-making adds further to this by using the diagnosis to guide Policing activities. Decision-making effectively translates the statistical model into operational terminology for operational decision-making.

Essentially, a BN defines a unique JPD over X and computationally the JPD takes the form of a large table, constructed from the tables defined at individual nodes, in accordance with the graph links. So computationally, inference is the process of scanning the joint table to find a value (or values) corresponding to the evidence, possibly summing values along the way. Often, the table will take the form of a sparse matrix and this property can be exploited to make inference tractable, even when the number of parameters is very large. There are certain legal rearrangements of the JPD table in which certain parameters can be marginalized. Such rearrangements allow queries to be satisfied by linear-time methods that identify a subgraph of the original graph relevant to the query [17].
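One simple way to see how a query can be restricted to a relevant subgraph is barren-node pruning: any node that is neither a query node, an evidence node, nor an ancestor of one can be dropped without changing the answer. The sketch below illustrates that idea only; it is a simpler relative of the d-separation algorithms in [17], not a reproduction of them, and the five-node graph and the function name are hypothetical.

```python
def relevant_nodes(parents, query, evidence):
    """Return the query and evidence nodes together with all of their ancestors."""
    keep = set()
    stack = list(query) + list(evidence)
    while stack:
        node = stack.pop()
        if node not in keep:
            keep.add(node)
            stack.extend(parents[node])  # walk upwards through the parents
    return keep

# Hypothetical DAG given as node -> list of parents.
PARENTS = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"], "E": ["D"]}

kept = relevant_nodes(PARENTS, query=["B"], evidence=["C"])
pruned = set(PARENTS) - kept
print(sorted(kept), sorted(pruned))
```

Each node and each parent link is visited at most once, so the traversal is linear in the size of the graph. In this example D and E are pruned because they lie downstream of both the query and the evidence.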


4. Misper-Bayes model for missing person investigations

The Misper-Bayes model was developed from previous research conducted by the author [1]. The model represents the different categories of at-risk missing persons and their breakdown in terms of sex type and age. The model also represents the likely times that a person may be missing, the distance traveled and the likely locations where they may be found. Data from iFIND was used to determine the male/female split, and the same set of at-risk categories as iFIND was adopted. All of the iFIND data was examined and a set of network parameters was compiled.

The Misper-Bayes model is shown below (Figure 5) in terms of a digraph and associated conditional probability tables. The nodes in the graph represent the random variables, which are linked through the conditional probability tables. Most of the tables are fairly self-explanatory, with a couple of exceptions: the Cat(x) table reflects the different categories of at-risk mispers and the Loc(x) table reflects the different locations where mispers are likely to be found. Note that there is no edge connecting Cat(x) and Time(x) (although there was an edge in an earlier version of the model). It was found that Age(x) provides a better predictor of the time spent missing than Cat(x). For example, the age of a young child or an elderly subject has a direct bearing on the time that they are missing. Other variables were included in earlier versions of the model, such as race or ethnicity, but these were seen to have a lesser effect than the variables shown in Figure 5.

Figure 5.

Misper-Bayes model.

Recalling Eq. (1), the JPD over X is given as:

$$P(X_1, \ldots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$$

The Misper-Bayes graphical model can be written as:

$$P(L, D, C, T, S, A) = P(L \mid D, C)\, P(D \mid T)\, P(C \mid S, A)\, P(T \mid A)\, P(S)\, P(A) \tag{5}$$

where:

  1. P(A) is the probability of the different age groups.

  2. P(S) is the probability of the sex types male and female.

  3. P(T | A) is the conditional probability of time missing, based on age.

  4. P(C | S, A) is the conditional probability of the different categories, based on sex type and age group.

  5. P(D | T) is the conditional probability of distance traveled, based on time missing.

  6. P(L | D, C) is the conditional probability of the likely location, based on the different categories and the distance traveled.
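The factorization of Eq. (5) can be written out directly in Python. The sketch below keeps the parent structure listed above but makes strong simplifying assumptions: every variable is reduced to two hypothetical states, and every probability is an invented placeholder. None of the numbers or state names come from iFIND or from the published Misper-Bayes tables.

```python
from itertools import product

# Hypothetical two-state versions of the six variables.
STATES = {
    "L": ("home", "outside"), "D": ("near", "far"), "C": ("cat1", "cat2"),
    "T": ("short", "long"), "S": ("male", "female"), "A": ("child", "adult"),
}
ORDER = ("L", "D", "C", "T", "S", "A")

# Placeholder tables; each row sums to one.
P_A = {"child": 0.6, "adult": 0.4}
P_S = {"male": 0.5, "female": 0.5}
P_T = {"child": {"short": 0.8, "long": 0.2}, "adult": {"short": 0.5, "long": 0.5}}
P_C = {  # P(C | S, A), keyed by (sex, age)
    ("male", "child"): {"cat1": 0.7, "cat2": 0.3},
    ("female", "child"): {"cat1": 0.6, "cat2": 0.4},
    ("male", "adult"): {"cat1": 0.2, "cat2": 0.8},
    ("female", "adult"): {"cat1": 0.3, "cat2": 0.7},
}
P_D = {"short": {"near": 0.9, "far": 0.1}, "long": {"near": 0.4, "far": 0.6}}
P_L = {  # P(L | D, C), keyed by (distance, category)
    ("near", "cat1"): {"home": 0.5, "outside": 0.5},
    ("near", "cat2"): {"home": 0.7, "outside": 0.3},
    ("far", "cat1"): {"home": 0.1, "outside": 0.9},
    ("far", "cat2"): {"home": 0.2, "outside": 0.8},
}

def joint(l, d, c, t, s, a):
    """Eq. (5): P(L|D,C) P(D|T) P(C|S,A) P(T|A) P(S) P(A)."""
    return P_L[(d, c)][l] * P_D[t][d] * P_C[(s, a)][c] * P_T[a][t] * P_S[s] * P_A[a]

def posterior(target, evidence):
    """P(target | evidence) by enumerating the joint table."""
    scores = {state: 0.0 for state in STATES[target]}
    for values in product(*(STATES[v] for v in ORDER)):
        world = dict(zip(ORDER, values))
        if all(world[k] == v for k, v in evidence.items()):
            scores[world[target]] += joint(*values)
    total = sum(scores.values())
    return {state: p / total for state, p in scores.items()}

total = sum(joint(*values) for values in product(*(STATES[v] for v in ORDER)))
location_given_child = posterior("L", {"A": "child"})
print(total, location_given_child)
```

A query such as `posterior("L", {"A": "child"})` mirrors the intended use of the model: known case details are entered as evidence and the distribution over likely locations is read off. A library such as bnlearn or pomegranate would normally replace the brute-force enumeration used here.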

Software libraries that can be used to implement BNs are available for common programming languages. Two popular libraries for the Python programming language are pomegranate [18] and bnlearn [19]. bnlearn is a library that can be used to learn the structure of a BN and estimate its parameters from a dataset [20]. bnlearn was used in this instance to learn the network structure based on data from iFIND. After several iterations and variable eliminations, the development process arrived at a graph similar to Figure 5. bnlearn starts with an empty network structure of all variables, then proceeds by adding, removing and reversing edges between nodes to maximize the goodness of fit of the model. The structure learned by bnlearn contained an excessive number of edges, likely due to overfitting (i.e. noise within the data had been represented in the model itself). The unnecessary edges were thinned out, based on the interpretation of the causal relationships between variables, to deliver the final structure of Figure 5. Finally, bnlearn was used to learn the parameters using iFIND data compiled from the summary table (Figure 3).
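The idea behind parameter learning can be illustrated without any library: once the structure is fixed, a conditional probability table such as P(T | A) can be estimated by relative frequency, which is the maximum-likelihood estimate. The sketch below is not bnlearn's implementation, and the records and the function name `estimate_cpt` are hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical (age group, time missing) records; not real case data.
RECORDS = [
    ("child", "short"), ("child", "short"), ("child", "long"),
    ("adult", "long"), ("adult", "short"), ("adult", "long"), ("adult", "long"),
]

def estimate_cpt(records):
    """Estimate P(child value | parent value) by relative frequency."""
    counts = defaultdict(Counter)
    for parent_value, child_value in records:
        counts[parent_value][child_value] += 1
    return {
        parent_value: {
            child_value: n / sum(child_counts.values())
            for child_value, n in child_counts.items()
        }
        for parent_value, child_counts in counts.items()
    }

cpt = estimate_cpt(RECORDS)
print(cpt)
```

With small datasets, plain relative frequencies can assign zero probability to combinations that simply were not observed, which is one reason practical tools often offer smoothed or Bayesian estimators as an alternative.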

The model was trialed using a series of realistic misper cases and the results were promising (approx. 90% agreement) in relation to those of iFIND. A full set of the results is available in [1].


5. Conclusions

The chapter has described the design and implementation of the Misper-Bayes model, which can be used to assist Police forces in determining the whereabouts of a missing person. Misper-Bayes provides a powerful tool for narrowing down the likely locations where the missing person may be found. The Misper-Bayes model was evaluated using a series of queries with a set of misper cases. For each query, the results of the model were cross-checked against the results of the iFIND system and the agreement was approx. 90%. The strength of the model lies in its simplicity and versatility. When combined with a geospatial front-end (e.g. CASPER) [21, 22, 23], the Misper-Bayes model can be used to very good effect to assist Police Officers with the prioritization of their search strategy. The approach has also demonstrated the scope of BNs to support evidence-based policing beyond that of missing person cases.

References

  1. Reilly D, Taylor M, Fergus P, Chalmers C, Thompson S. Misper-Bayes: A Bayesian network model for missing person investigations. IEEE Access. 2021;9:49990-50000
  2. Stone LD, Keller CM, Kratzke TM, Strumpfer JP. Search for the wreckage of Air France flight AF 447. Statistical Science. 2014;29(1):69-80
  3. Lin L, Goodrich MA. A Bayesian approach to modeling lost person behaviors based on terrain features in wilderness search and rescue. Computational and Mathematical Organization Theory. 2010;16:300-323
  4. Sava E, Twardy C, Koester R, Sonwalkar M. Evaluating lost person behavior models. Transactions in GIS. 2016;20(1):38-53
  5. Blackmore K, Bossomaier T, Foy S, Thomson D. Data mining of missing persons data. In: Proc. 1st International Conference on Fuzzy Systems and Knowledge Discovery: Computational Intelligence for the E-Age, Singapore. Berlin, Heidelberg: Springer; 2002
  6. Blackmore K, Bossomaier T. Comparison of See5 and J48: PART algorithms for missing person profiling. In: Proc. 1st International Conference on Information Technology and Applications, ICITA 1–6, Australia. 2002
  7. Taylor MJ, Reilly D. Knowledge representation for missing persons investigations. Journal of Systems and Information Technology. 2017;19(2):138-150
  8. Gibb G, Woolnough P. Missing Persons: Understanding, Planning, Responding – A Guide for Police Officers. Aberdeen: Grampian Police; 2017. Available from: http://www.searchresearch.org.uk/downloads/ukmpbs/GGIbb_missing_person_report.pdf
  9. Eales N. iFIND. National Crime Agency (NCA), London, UK [Online]. 2015. Available from: https://missingpersons.police.uk/en-gb/resources/downloads/iFIND
  10. Reilly D, Wren C, Giles S, Cunnigham L, Hargreaves P. CASPER: Computer assisted search prioritisation and environmental response application. In: Proc. Sixth International Conference on Developments in E-Systems Engineering (DeSE’06), Abu Dhabi, UAE. IEEE; 2013. pp. 225-230
  11. WPC Software. COMPACT. Available from: https://www.wpcsoft.com/business-areas/compact
  12. NicheRMS. [Online]. Available from: https://nicherms.com/products
  13. Stone JV. Bayes Rule: A Tutorial Introduction to Bayesian Analysis. Sebtel Press; 2013. ISBN 978-0-9563728-4-0
  14. D’Ambrosio B. Inference in Bayesian networks. AI Magazine. 1999;20(2):21
  15. Darwiche A. Modelling and Reasoning with Bayesian Networks. Cambridge: Cambridge University Press; 2009
  16. Larranaga P, Karshenas H, Bielza C, Santana R. A review on evolutionary algorithms in Bayesian network learning and inference tasks. Elsevier Information Sciences. 2013;233:109-125
  17. Geiger D, Verma T, Pearl J. d-separation: From theorems to algorithms. In: Proc. Fifth Workshop Uncertainty in Artificial Intelligence, Ontario, Canada. 1989. pp. 118-125
  18. Schreiber J. Pomegranate: Fast and flexible probabilistic modeling in python. Journal of Machine Learning Research. 2018;18(164):1-6
  19. bnlearn – Graphical structure of Bayesian networks [Online]. Available from: https://pypi.org/project/bnlearn
  20. National Crime Agency. 2017. Missing Person Data Report 2015/2016. [Online]. Available from: http://missingpersons.police.uk/en-gb/resources/downloads/missing-person-statistical-bulletins
  21. Uusitalo L. Advantages and challenges of Bayesian networks in environmental modelling. Ecological Modelling. 2007;203:312-318
  22. Kyrimi E, Mossadegh S, Tai N, Marsh W. An incremental explanation of inference in Bayesian networks for increasing model trustworthiness and supporting clinical decision making. Artificial Intelligence in Medicine. 2020;103. ISSN 0933-3657
  23. McLachlan S, Dube K, Hitman GA, Fenton NE, Kyrimi E. Bayesian networks in healthcare: Distribution by medical condition. Artificial Intelligence in Medicine. 2020;107. ISSN 0933-3657
