Open access peer-reviewed chapter

Archetypes of Wildfire Arsonists: An Approach by Using Bayesian Networks

By Rosario Delgado, José-Luis González, Andrés Sotoca and Xavier-Andoni Tibau

Submitted: July 11th 2017. Reviewed: November 21st 2017. Published: May 9th 2018.

DOI: 10.5772/intechopen.72615

Abstract

Wildfires are a phenomenon of great importance because of their environmental and economic consequences, as well as the human losses they cause. The resolution rate of arson-caused wildfires is extremely low compared with that of other criminal activities, which highlights the importance of developing methodologies to assist investigators in criminal profiling. To that end, we propose the use of Bayesian networks (BNs), a methodology from the field of machine learning. BNs are probabilistic models that have only recently been applied to criminal profiling. We learn a BN model from real data on solved arson-caused wildfires in Spain and, after validation, use it to construct archetypes of forest fires/arsonists, with the aim of better understanding this phenomenon and helping to identify the culprits. We characterize five archetypes built around the author's motivation from a quantitative and objective point of view; they correspond to the modes of operation in criminal activities of Shye.

Keywords

  • provoked wildfire
  • arsonist
  • archetype
  • profiling
  • Bayesian networks

1. Introduction

According to the Food and Agriculture Organization of the United Nations (FAO) survey [1], “[…] every year, wildfires destroy millions of hectares of forests, woodlands and other vegetation, causing the loss of many human and animal lives and immense economic damage, both in terms of resources destroyed and the costs of suppression. There are also impacts on society and the environment […]”. Mediterranean countries are especially sensitive to this phenomenon due to the characteristics of their vegetation, land use, and climate. On average, 50,000 fires burn 400,000 hectares every year in these regions (San-Miguel-Ayanz, Moreno and Camia [2]), and the situation is worsening due to the effect of climate change (Turco, Llasat, von Hardenberg and Provenzale [3]). According to the Ministry of Agriculture and Fishery, Food and Environment of Spain [4], in the period 2006–2015 a yearly average of 13,126 forest fires burned 133,060 hectares. As a consequence, this phenomenon is one of the major environmental problems in Spain.

In this work, we are interested in the arson-caused wildfire, understood as “the uncontrolled fire on forest land caused by humans that spreads quickly out of control over woodland or brush, affecting vegetation that was not destined to burn” (this definition does not include the burning of stubble, grass, or scrub for the removal of forest residues, unless it is carried out where it is prohibited).

From a quantitative point of view, wildfires have been studied mainly in terms of risk assessment. To mention just a few studies, Thompson, Scott Helmbrechet and Calvin [5] present an integrated and systematic risk assessment framework to better manage wildfires and to mitigate losses to highly valued resources and assets, with application to an area in Montana, United States; Penman, Bradstock and Price [6] study the patterns of wildfires in south-eastern Australia in relation to risk of ignition; and Adab, Kanniah and Solaimani [7] consider different fire risk indices in northeastern Iran. In the criminological context, Cozens and Christensen [8] analyze how environmental criminology can help to prevent arson-caused wildfires in Australia, where this phenomenon also represents a serious problem.

Although arson is a potential cause of many fires, the clarification rate of arson-caused wildfires is extremely low compared with that of other criminal activities. According to the interim report of the Ministry of Agriculture and Fishery, Food and Environment of Spain [9], 11,928 wildfires were recorded in Spain in 2015, for which 429 offenders were identified. Since the estimated percentage of wildfires in Spain deemed arson in 2015 ranges from 55 to 60%, this represents a resolution rate of 6–6.5%. This fact highlights the difficulty of identifying the authors of provoked forest fires. Therefore, any help in developing methodologies that can aid investigators in better understanding the motivation of arsonists, in order to solve and, if possible, prevent these crimes, is welcome. In this sense, our main aim is to find predictive relationships between different typologies of forest fire and the characteristics of the perpetrators, by constructing archetypes that take into account both author features (behavioral, criminological, socio-demographic, and of personality) and evidence obtained from the fire, in order to assist those with responsibilities in the judicial investigation and so increase the clarification rate of crimes and misdemeanors. Our work is part of a project led by the Prosecution Office of Environment and Urbanism of Spain and carried out by a team that includes members of the Crime Behavior Analysis Section of the Technical Unit of the Judicial Police of the Civil Guard.
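The 6–6.5% figure can be checked with a few lines of arithmetic. The sketch below is only illustrative: it assumes that each identified offender corresponds to one solved arson-caused wildfire, which the text does not state explicitly, and the function and variable names are ours.

```python
# Figures quoted in the text for Spain, 2015.
wildfires_2015 = 11_928
identified_offenders = 429
arson_share_low, arson_share_high = 0.55, 0.60  # estimated share of arson-caused wildfires


def resolution_rate(offenders, wildfires, arson_share):
    """Identified offenders divided by the estimated number of arson-caused wildfires.

    Assumption (ours): one identified offender per solved wildfire.
    """
    return offenders / (wildfires * arson_share)


# A larger arson share means more arson-caused fires, hence a lower rate.
rate_low = resolution_rate(identified_offenders, wildfires_2015, arson_share_high)
rate_high = resolution_rate(identified_offenders, wildfires_2015, arson_share_low)
print(f"{rate_low:.1%} to {rate_high:.1%}")
```

Under that assumption the two bounds land at roughly 6% and 6.5%, in agreement with the range given above.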

Apart from a few descriptive studies such as Soeiro and Guerra [10], to our knowledge the only quantitative approaches to this question stem from the works of Sotoca, González, Fernández, Kessel, Montesinos and Ruz [11] and Delgado, González, Sotoca and Tibau [12]. More specifically, the approach followed in Sotoca et al. [11] consists in applying different techniques of multivariate statistical analysis (mainly cluster analysis) to criminal profiling, based on the premise that the crime scene contains clues that, if properly collected and interpreted, could say something about the person who set the fire. In Delgado et al. [12], by contrast, the methodology of Bayesian networks (from now on, BNs) was applied for the first time to the profiling of wildfire arsonists. BNs had only recently been applied to criminal profiling (see, for instance, Baumgartner, Ferrari and Palermo [13] and Baumgartner, Ferrari and Salfati [14]) and, as far as we know, never before to the profiling of any kind of arsonist.

The unpredictability of human behavior adds a component of randomness to all our activities, criminal ones among them. BNs are an increasingly popular methodology in the field of machine learning for modeling uncertainty in complex domains and, in the opinion of many Artificial Intelligence researchers, the most significant contribution in this area in recent years (Korb and Nicholson [15]). Indeed, BNs are among the most effective machine learning techniques and belong to the field of supervised learning, along with other techniques such as support vector machines, kernel methods, or neural networks.

The ideas underlying BNs go back to the path analysis of the 1920s, and the formalism was consolidated in the 1980s as a probabilistic tool to model the relationships among different variables. The usefulness of this methodology has been shown in many decision-making procedures and in different areas. In particular, it has been used with great success in risk analysis in ecology (Ticehurst, Newham, Rissik, Letcher and Jakeman [16]), economics (Adusei-Poku [17]), emerging diseases (Walshe and Burgman [18]), environmental sciences (Borsuk, Stow and Reckhow [19] and Pollino, Woodberry, Nicholson, Korb and Hart [20]), medicine (Spiegelhalter [21], and Cruz-Ramírez, Acosta-Mesa, Carrillo-Calvet, Alonso Nava-Fernández and Barrientos-Martínez [22]), and nuclear waste accidents (Lee and Lee [23]). With respect to criminology, for example, BNs have been introduced as a novel methodology for assessing the risk of recidivism of sex offenders in Delgado and Tibau [24].

Regarding wildfires, Papakosta and Straub [25] study a system for assessing wildfire damage to buildings, constructed from a BN, and apply it to spatial datasets from the Mediterranean island of Cyprus. Dlamini [26] develops a BN model from satellite and geographic information system (GIS) data, with biotic, abiotic, and human variables, in order to determine the factors that influence wildfire activity in Swaziland (see also Dlamini [27]). As mentioned above, Delgado et al. [12] is the only previous study on the use of BNs for profiling the author of a forest fire. The authors also implement this methodology for criminal profiling in an Internet computer application to be used by the Prosecution Office of Environment and Urbanism.1

In this chapter, we set two objectives. First, we intend to introduce BNs and explain their application to the study of profiles of forest arsonists. Second, we go beyond Delgado et al. [12] in the use of this methodology for a better understanding of the motivation of wildfire arsonists, constructing archetypes that will help to identify the culprits. To this end, we learn a BN model from the updated available data provided by the Spanish government, and use it to study motivation and to construct archetypes from the characteristics of an arson-caused wildfire and the features of the offender. Roughly speaking, we construct the most probable BN given the observed cases (learning procedure), and this model provides information on the relationships between the variables considered, which are both fire features and author characteristics, allowing us to predict some of them (query variables) from others (evidence).

The organization of the chapter is as follows. In Section 2, we introduce the research methods we use, starting with an introduction to the theoretical framework that supports profiling and archetypes, a description of the dataset on which we rely to construct our BN model, and a description of the model itself. Complementary and more technical information of the latter topic can be found in Appendix A. In Section 3, we apply the previously constructed BN to develop archetypes for forest fires/arsonists based on motivation. The chapter finishes with a conclusion section.

2. Research methods

2.1. Theoretical framework

In their comprehensive literature review, Dowden, Bennell and Bloomfield [28] showed that most criminal profiling publications do not provide any clear theoretical framework for the rationale of the profiling process, and that only a few articles reported the use of statistical techniques (most of them multivariate). For this reason, some authors criticize the use of profiling and call it a “pseudoscientific practice”, such as Snook, Cullen, Bennell, Taylor and Gendreau [29], while police officers view it with some skepticism (Snook, Haines, Taylor and Bennell [30]) and mental health professionals in the forensic environment also express doubts about it (Torres, Boccaccini and Miller [31]).

In the United Kingdom, however, scientific literature that overcomes these criticisms has been available for more than 20 years, and has led to a new methodological approach to profiling known as “Behavioral Investigative Advice”. This approach takes into account evidence-based knowledge to aid decision-making by the police investigator, and includes many other tasks such as crime scene assessment, case-link analysis, suspect prioritization matrices, counseling in the police interview, etc. (see Alison and Rainbow [32]). This new perspective originated with the studies of Canter, in which multidimensional scaling was applied to datasets of solved crimes in order to obtain clusters or profiles, first of the crimes themselves and later of the authors, and finally to calculate the statistical correlation between the two. In this way, depending on how the crime was committed, it could be assigned to a profile, which would automatically report the characteristics of the author who most often commits this type of crime. In addition, Canter offered a theoretical model that helped interpret the results: Shye’s model of action system (Shye [33]). This methodology was applied to the elaboration of profiles of arsonists (Canter and Fritzon [34]; Fritzon, Canter and Wilton [35]) and was continued in other works, such as Fritzon [36], in which it was applied to study the relationship between the distance traveled by arsonists and their motivation; Kocsis and Cooksey [37], which focuses on serial arsonists; and Wachi, Watanabe, Yokota, Suzuki, Hoshino, Sato and Fujita [38], in which female arsonists in Japan are studied.

However, in spite of so many antecedents, all these authors address arson in general, not forest fires in particular. The only forest-specific work prior to the studies carried out in Spain is Viegas and Soeiro [39] where, taking into account the model of action system and using multiple correspondence analysis, four profiles of forest arsonists in Portugal were proposed, named “expressive with clinical history”, “expressive with attraction by the fire”, “vengeful instrumental”, and “instrumental to obtain profit”. Each of these profiles involves a series of identifying characteristics of its authors and a distinctive way of committing forest fires, depending on whether the main motivation was revenge, psychiatric problems, pathological attraction to fire, or obtaining an economic profit.

The work carried out in Spain (Sotoca et al. [11]) is inspired by the aforementioned Portuguese study and explores other data analysis methodologies, specifically techniques of multivariate statistical analysis, to establish an a priori classification of forest fires according to their cause or motivation. This results in the following basic archetypes: “negligence” as opposed to “intentional”, the two being mutually exclusive. Intentional fires were grouped into four subtypes, also mutually exclusive: “profit”, “revenge”, “impulsive”, and “inadequate traditional practice”. This classification is consistent with the four modes of operation of Shye’s theoretical framework of criminal activities, and the correspondence between them is shown in Table 1.

| Former forest fire classification | Mode of operation |
|---|---|
| Negligence | Adaptive |
| Inadequate traditional practice | Adaptive |
| Impulsive | Integrative |
| Profit | Expressive |
| Revenge | Conservative |

Table 1.

Equivalence between former classification given in Sotoca et al. [11] and mode of operation in Shye [33].

As in Delgado et al. [12], in this chapter we consider a slight modification of the archetypes constructed in Sotoca et al. [11]: we merge “negligence” and “inadequate traditional practice” into “negligence”, since in both cases the fire occurs as a consequence of recklessness, but we distinguish between “slight negligence” and “gross negligence”, depending on whether the perpetrator remains at the scene and helps the firefighting services (slight) or not (gross). The rest of the archetypes have not been modified. The list of updated archetypes and their correspondence with the modes of operation is given in Table 2. This is in line with the proposal of five main “operational” profiles of forest fire, each one with its own author profile, found in previous years and confirmed by the most recent statistical analysis carried out by the team working on this project. It is important to note that “impulsive”, “profit”, and especially “revenge” are uncommon compared to the rest. Motivation has been recorded in 1,463 of the 1,597 solved cases in our database, and Table 2 shows the percentage of each motivation type.

| Present forest fire classification | % | Mode of operation |
|---|---|---|
| Slight negligence | 47.64 | Adaptive |
| Gross negligence | 31.30 | Adaptive |
| Impulsive | 10.05 | Integrative |
| Profit | 7.59 | Expressive |
| Revenge | 3.42 | Conservative |
| Total | 100.00 | |

Table 2.

Equivalence between present classification and mode of operation in Shye [33].

In Delgado et al. [12] and in this chapter, the use of BNs is proposed as an alternative to the analysis used in Sotoca et al. [11], since BNs make it possible not only to determine whether the way a forest fire is committed is associated with some characteristic of the author, but also to quantify this association, which gives the fire investigator far more accurate information. BNs are a machine learning methodology that learns from data and can be used successfully in the social sciences, where efforts to find scientific laws of human behavior often fail to establish a conceptual framework to guide empirical observation and a method of analysis corresponding to that framework.

As mentioned in Section 1, our aim is to present BN as a methodology to improve understanding of the different types of motivations from a quantitative and objective point of view, helping in the construction of archetypes.

2.2. The dataset

Statistical information on forest fires has been collected in Spain since 1968, generating one of the most complete databases in Europe and a pioneering one worldwide. This information is currently managed by the General Directorate of Natural Environment and Forestry Policy of the Ministry of Agriculture and Fishery, Food and Environment of Spain. Our database, however, consists of arson-caused wildfires clarified by the police (that is, those for which the alleged offenders have been identified). It has been fed since 2008 by the Secretary of State for Security throughout the entire Spanish territory, under the leadership of the Prosecution Office of Environment and Urbanism of the Spanish state, and contains information obtained from a specific questionnaire concerning authors who have been arrested or charged.

As mentioned above, when certain and presumed causes are added together, the percentage of wildfires in Spain that were intentional appears to range from 55 to 60% (close to that of other countries such as Australia; see Cozens and Christensen [8]), while it was only possible to identify 6–6.5% of the arsonists. Given these numbers, the intentional forest fire can be said to be a criminal activity with a very low clarification rate, which explains the interest of the authorities involved, and of society in general, in increasing it.

This subset constitutes our dataset, which contains 1,597 solved cases. Following expert knowledge, $n=25$ variables were chosen from the initial set of 32 because of their usefulness and predictive relevance. The choice is the result of a balance between the benefits of having a high number of variables (a more realistic model with higher accuracy) and the drawbacks arising from the corresponding increase in complexity (which implies the need for more data to learn the model properly). The chosen variables refer to the crime ($C_1,\dots,C_{10}$) and to the arsonist ($A_1,\dots,A_{15}$), and are described in Table 3, where their possible outcomes are also shown. The arsonist variables $A_1,\dots,A_{15}$ correspond to aspects that are easily observable and have some police relevance, which is very convenient since they are intended to guide police activity to clarify the crime. We use exclusively categorical variables, discretizing the (few) continuous variables in the original database. Approximately 78% of cases have missing values in at least one of the variables, mostly in the author variables, which are the ones with the most missing values. Because this percentage is very high, instead of omitting cases containing at least one missing value, which is standard practice, we replace missing values by a new value different from the rest of the outcomes (a “blank”, in our case), treating missing values as a single value and not mapping them onto any other. In this way we do not lose information. Once the predictions for each query variable have been obtained, the “blank” value is removed from the prediction and its probability is divided proportionally among the remaining outcomes.
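Dividing the probability of “blank” proportionally among the remaining outcomes is the same as dropping “blank” and renormalizing. The following minimal sketch illustrates this; the function name and the example distribution are hypothetical and are not taken from the dataset.

```python
def drop_blank(distribution, blank="blank"):
    """Remove the 'blank' outcome and renormalize the remaining probabilities.

    Each remaining outcome keeps its relative weight, which is equivalent to
    sharing out the probability of 'blank' proportionally.
    """
    kept = {outcome: p for outcome, p in distribution.items() if outcome != blank}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("all probability mass is on the blank outcome")
    return {outcome: p / total for outcome, p in kept.items()}


# Hypothetical predicted distribution for A8 = prior criminal record.
raw = {"yes": 0.15, "no": 0.60, "blank": 0.25}
clean = drop_blank(raw)
print(clean)
```

In this made-up example, “yes” and “no” keep their 1:4 ratio after the blank mass has been redistributed.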

| Variable | Outcomes |
|---|---|
| C1 = season | Spring / winter / summer / autumn |
| C2 = risk level | High / medium / low |
| C3 = start time | Morning / afternoon / evening |
| C4 = starting point | Pathway / road / houses / crops / interior / forest track / others |
| C5 = main use of burned surface | Agricultural / forestry / livestock / interface / recreational |
| C6 = number of seats | One / more |
| C7 = related offense | Yes / no |
| C8 = pattern | Yes / no |
| C9 = traces | Yes / no |
| C10 = who denounces | Guard / particular / vigilance |
| A1 = age | ≤34 / 35–45 / 46–60 / >60 |
| A2 = way of living | Parents / in couple / single / others |
| A3 = kind of job | Handwork / qualified |
| A4 = employment status | Employee / unemployed / sporadic / retired |
| A5 = educational level | Illiterate / elementary / middle / upper |
| A6 = income level | High / medium / low / without incomes |
| A7 = sociability | Yes / no |
| A8 = prior criminal record | Yes / no |
| A9 = history of substance abuse | Yes / no |
| A10 = history of psychological problems | Yes / no |
| A11 = stays in the scene | No / remains there / remains and gives aid |
| A12 = distance home-scene | Short / medium / long / very long |
| A13 = displacement means | On foot / by car / all terrain / others |
| A14 = residence type | Village / house / city / town |
| A15 = motivation | Slight negligence / gross negligence / impulsive / profit / revenge |

Table 3.

Outcomes of the variables in the dataset.

2.3. Constructing the BN

BNs are graphical structures for representing the probabilistic relationships among the variables describing a random phenomenon (in our setting, provoked forest fires) and for performing probabilistic inference with them. Given a set of random variables $V=\{X_1,\dots,X_n\}$, a BN is a model that represents the joint probability distribution $P$ over those variables. In our case, $V=\{C_1,\dots,C_{10},A_1,\dots,A_{15}\}$ and $n=25$. The graphical representation of the BN consists of a directed acyclic graph (DAG), whose $n$ nodes represent the random variables (from now on, we identify a node with the variable it represents). The directed arcs among the nodes represent conditional dependencies between variables. Figure 1 shows the DAG corresponding to the BN that has been constructed (learned from data).

Figure 1.

Learned structure (DAG) of the BN from the dataset.

We can use the BN to help characterize a provoked wildfire in terms of the relationships between different variables. These relationships are expressed in a very simple way in the BN, through the absence or presence of directed arcs in its DAG, taking into account the Markov condition, which states the following: “given the values taken by its parents, which are the nodes sending a directed arc to it in the DAG, any variable is independent of any other variable that is neither a parent nor a descendant of it (a “descendant” of a node is any other node that can be reached from it by following a path of directed arcs)”. For example, from Figure 1 we can see that, once the value of variable $A_{15}$ is known, $C_4$ is independent of any other variable except $C_5$, since $C_5$ is its unique descendant. To mention another example, if we know the outcome of variable $A_8$, then $A_{12}$ is independent of the rest of the variables except $A_{13}$ and $A_{14}$.
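The first example can be reproduced on a small fragment of the DAG. The sketch below encodes only the arcs that the text of this chapter states explicitly ($A_{11}$ is the parent of $A_{15}$ and $C_{10}$; $C_3$, $C_4$, $C_6$, $C_7$, $C_8$, $C_9$ are children of $A_{15}$; $C_5$ is a child of $C_4$). It is not the full learned structure of Figure 1, and the helper names are ours.

```python
# Fragment of the DAG, restricted to arcs stated explicitly in the text.
# The full learned structure (Figure 1) contains more nodes and arcs.
children = {
    "A11": ["A15", "C10"],
    "A15": ["C3", "C4", "C6", "C7", "C8", "C9"],
    "C4": ["C5"],
}


def parents(node):
    """Nodes sending a directed arc to `node`."""
    return {p for p, kids in children.items() if node in kids}


def descendants(node):
    """Nodes reachable from `node` by following directed arcs."""
    seen, stack = set(), list(children.get(node, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(children.get(current, []))
    return seen


# Given its parents, C4 is independent of every node outside this set.
print(parents("C4"), descendants("C4"))
```

On this fragment, the only parent of $C_4$ is $A_{15}$ and its only descendant is $C_5$, which matches the statement above.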

Once the BN model has been learned from the dataset, both the structure (DAG) and the parameters (the probability distribution of each variable conditioned on its parents), we can use it to compute any a posteriori probability we are interested in: we can consider evidence concerning some variables of the model and use the BN to update the (a priori) probability distribution of any of the remaining variables, given the evidence. More specifically, from evidence of the form $E=\{X_{i_1}=x_{i_1},\dots,X_{i_\ell}=x_{i_\ell}\}$, where $\{X_{i_1},\dots,X_{i_\ell}\}\subset V$ are the evidence variables, we could be interested in computing the a posteriori (conditional) probability $P(X_{j_1}=x_{j_1},\dots,X_{j_s}=x_{j_s}\mid E)$, with $\{X_{j_1},\dots,X_{j_s}\}\subset V\setminus\{X_{i_1},\dots,X_{i_\ell}\}$ the set of query variables. This probability is the update, once an additional piece of knowledge has been taken into account, of the corresponding a priori probability, which would be the same but without conditioning on the evidence $E$. Given evidence $E$, the prediction of the query variable $X$ is chosen to be the instantiation of $X$ that maximizes the a posteriori probability. More formally, if $x_1,\dots,x_r$ are the possible instantiations of $X$, then $x^*=\arg\max_{k=1,\dots,r}P(X=x_k\mid E)$ is the prediction for $X$ given evidence $E$, and $P(X=x^*\mid E)$ is said to be the confidence level (CL) of the prediction. We will apply this procedure to our setting in the following way: given evidence in terms of the crime (evidence) variables for a given provoked forest fire, we will predict the values of the arsonist features (query variables), which form the predicted profile of the arsonist. Interested readers can find technical details about the construction and validation of the BN in Appendix A.
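As a toy illustration of the update, prediction and confidence level, consider a two-node reduction $A_{15}\to C_6$ of the model. This is a sketch under explicit assumptions: the prior on $A_{15}$ is taken to be the percentages of Table 2 (the marginal in the learned BN may differ), the likelihoods are the rounded values of the “more” row of Table 7 later in this chapter, and the real inference is carried out by probability propagation over the whole network rather than by this hand-written Bayes rule.

```python
# Assumed prior for A15: the motivation percentages of Table 2.
prior = {
    "slight negligence": 0.4764,
    "gross negligence": 0.3130,
    "impulsive": 0.1005,
    "profit": 0.0759,
    "revenge": 0.0342,
}
# P(C6 = "more" | A15), rounded values from Table 7.
p_more_given_motivation = {
    "slight negligence": 0.06,
    "gross negligence": 0.10,
    "impulsive": 0.33,
    "profit": 0.41,
    "revenge": 0.37,
}


def posterior(prior, likelihood):
    """Bayes' rule: P(query | evidence) is proportional to P(evidence | query) * P(query)."""
    joint = {k: prior[k] * likelihood[k] for k in prior}
    evidence_probability = sum(joint.values())
    return {k: v / evidence_probability for k, v in joint.items()}


def predict(distribution):
    """Return the most probable instantiation and its confidence level (CL)."""
    value = max(distribution, key=distribution.get)
    return value, distribution[value]


post = posterior(prior, p_more_given_motivation)
prediction, confidence_level = predict(post)
print(prediction, round(confidence_level, 3))
```

Under these assumptions, observing more than one seat of fire shifts the most probable motivation from “slight negligence” (a priori) to “impulsive”, although with a low confidence level, since several motivations remain almost equally likely.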

All calculations, as well as the process of model construction, validation, and inference, have been carried out with R, which is “GNU S”, a freely available language and environment for statistical computing and graphics that provides a wide variety of statistical and graphical techniques. It can be obtained from the CRAN site, https://cran.r-project.org/. Different R packages have been used:

  • bnlearn: Bayesian Network Structure Learning, Parameter Learning and Inference, by Marco Scutari and Robert Ness, http://www.bnlearn.com/

    We use this package for Bayesian network structure learning and parameter learning, using the score-based Hill-Climbing structure learning algorithm and maximum likelihood parameter estimation, respectively.

  • gRain: Graphical Independence Networks, by Søren Højsgaard, http://people.math.aau.dk/ sorenh/software/

    We use this package for making inference by probability propagation with the BN learned by using the bnlearn package.

  • sna: Tools for Social Network Analysis, by Carter T. Butts, http://www.statnet.org.

    From this package, we use some social network analysis measures in Section 3.1.
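To give an idea of what maximum likelihood parameter estimation means for a discrete BN, the following sketch estimates a conditional probability table from relative frequencies. It is written in plain Python for illustration only and does not reproduce the API of bnlearn; the records are hypothetical toy data, and the hill-climbing structure search is not shown.

```python
from collections import Counter, defaultdict


def fit_cpt(records, child, parent):
    """Maximum likelihood estimate of P(child | parent) from relative frequencies."""
    counts = defaultdict(Counter)
    for record in records:
        counts[record[parent]][record[child]] += 1
    return {
        parent_value: {
            child_value: n / sum(child_counts.values())
            for child_value, n in child_counts.items()
        }
        for parent_value, child_counts in counts.items()
    }


# Hypothetical toy records (not taken from the real dataset).
records = (
    [{"A15": "revenge", "C6": "one"}] * 5
    + [{"A15": "revenge", "C6": "more"}] * 3
    + [{"A15": "slight negligence", "C6": "one"}] * 9
    + [{"A15": "slight negligence", "C6": "more"}] * 1
)
cpt = fit_cpt(records, child="C6", parent="A15")
print(cpt["revenge"])
```

For a node with several parents, the same counting would be done for each joint configuration of the parents.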

3. Archetypes

In this section we use the BN model learned from the dataset and described in Section 2, to construct forest fire archetypes related to arsonist motivation.

First of all, note that the author variables $A_8$ = “prior criminal record”, $A_9$ = “history of substance abuse”, and $A_{10}$ = “history of psychological problems” are operative variables of practical use, since they help investigators identify the author of a provoked fire. Fortunately, the BN model predicts these variables with good accuracy, higher than 80%. See Table 4, which reports both the individual predictive accuracy for each author variable (IPA) and the overall predictive accuracy (OPA).

3.1. Why motivation?

We use motivation ($A_{15}$) as the cornerstone from which to construct the archetypes for two reasons: (1) from the viewpoint of the theoretical framework, motivation plays a key role in criminological investigations (see Collin [40]) and, as explained in Section 2.1, in order to match Shye’s classification of criminal activities, motivation should be taken as the classification criterion; and (2) $A_{15}$ is the author variable with the most central role in the model and the greatest capacity to explain fire characteristics.

Indeed, 8 of the 10 nodes representing crime variables are directly related to it. More concretely, 7 of them are descendants ($C_3$ to $C_9$, all of them “sons” of $A_{15}$ except $C_5$, which is a “grandson”), and one, $C_{10}$, is a “brother”, that is, a son of the father of $A_{15}$, which is $A_{11}$ (see Figure 1). The main role of $A_{15}$ in the model can be quantified by using centrality and/or betweenness measures borrowed from the area of Network Analysis. In Graph Theory and Network Analysis, indicators of centrality identify the most important nodes within a graph. Here, “importance” is conceived as involvement in the cohesiveness of the network. Applications of centrality include identifying the most influential person(s) in a social network, key infrastructure nodes in the Internet or urban networks, and super-spreaders of a disease. Concretely, for each author variable we computed two measures, which are shown in Table 5, both normalized so that they sum to 100:

  1. Freeman’s degree of centrality (Freeman [41]), which counts the directed arcs that arrive at or depart from each node. Table 5 singles out $A_{15}$ as the author variable with the most central role, with twice the value of the next variables in the ranking.

  2. Borgatti and Everett’s betweenness measure (Borgatti and Everett [42]). Betweenness quantifies the number of times a node acts as a “bridge” along the shortest path between two other nodes (which we will call a “geodesic” from now on). Nodes that have a high probability of occurring on a randomly chosen geodesic between two randomly chosen nodes have a high betweenness. Borgatti and Everett’s betweenness is a modification of a basic standard betweenness measure, which is defined for a node $v$ as $\sum_{i,j\,\in\,\text{nodes}} g_{ivj}/g_{ij}$ (with the convention $0/0=0$), where $g_{ij}$ is the number of geodesics from $i$ to $j$ in the graph, and $g_{ivj}$ is the number of those geodesics that pass through $v$. The modification proposed by Borgatti and Everett is as follows:

    $$\sum_{i,j\,\in\,\text{nodes}} \frac{1}{d_{ij}}\,\frac{g_{ivj}}{g_{ij}} \tag{E1}$$

where $d_{ij}$ is the geodesic distance from $i$ to $j$ (that is, the number of directed arcs that compose any geodesic from $i$ to $j$). Conceptually, under the basic standard betweenness measure, high-betweenness nodes lie on a large number of non-redundant geodesics between other nodes; they can thus be thought of as “bridges”. Borgatti and Everett’s betweenness adjusts the basic standard measure by down-weighting long geodesics, and according to it we see in Table 5 that $A_{15}$ is the second most important author variable, very close behind $A_9$.
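Both measures can be sketched in a few lines. The code below computes the (unnormalized) degree and the length-scaled betweenness on the fragment of the DAG whose arcs are stated explicitly in the text; it is not the full network of Figure 1, so the values do not reproduce Table 5, and it is a from-scratch illustration rather than the routine of the sna package.

```python
from collections import deque

# Arcs stated explicitly in the text; the full DAG of Figure 1 has more.
arcs = [
    ("A11", "A15"), ("A11", "C10"),
    ("A15", "C3"), ("A15", "C4"), ("A15", "C6"),
    ("A15", "C7"), ("A15", "C8"), ("A15", "C9"),
    ("C4", "C5"),
]
nodes = sorted({n for arc in arcs for n in arc})
successors = {n: [] for n in nodes}
for tail, head in arcs:
    successors[tail].append(head)


def degree(node):
    """Number of arcs arriving at or departing from `node`."""
    return sum(node in arc for arc in arcs)


def shortest_paths_from(source):
    """BFS giving geodesic distances and geodesic counts from `source`."""
    dist, count = {source: 0}, {source: 1}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in successors[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                count[w] = 0
                queue.append(w)
            if dist[w] == dist[u] + 1:
                count[w] += count[u]
    return dist, count


paths = {n: shortest_paths_from(n) for n in nodes}


def length_scaled_betweenness(v):
    """Sum over ordered pairs (i, j) of (1 / d_ij) * g_ivj / g_ij."""
    dist_v, count_v = paths[v]
    total = 0.0
    for i in nodes:
        dist_i, count_i = paths[i]
        if i == v or v not in dist_i:
            continue
        for j in nodes:
            if j in (i, v) or j not in dist_i or j not in dist_v:
                continue
            if dist_i[v] + dist_v[j] == dist_i[j]:  # v lies on a geodesic i -> j
                g_ivj = count_i[v] * count_v[j]
                total += g_ivj / count_i[j] / dist_i[j]
    return total


print(degree("A15"), length_scaled_betweenness("A15"))
```

Even on this fragment, $A_{15}$ has the highest degree and is the main bridge between $A_{11}$ and the crime variables.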

| Author variable | IPA (%) |
|---|---|
| A1 = age | 33.13 |
| A2 = way of living | 60.40 |
| A3 = kind of job | 72.28 |
| A4 = employment status | 44.62 |
| A5 = educational level | 46.24 |
| A6 = income level | 46.96 |
| A7 = sociability | 97.02 |
| A8 = prior criminal record | 80.19 |
| A9 = history of substance abuse | 90.58 |
| A10 = history of psychological problems | 89.56 |
| A11 = stays in the scene | 60.49 |
| A12 = distance home-scene | 45.13 |
| A13 = displacement means | 34.38 |
| A14 = residence type | 47.12 |
| A15 = motivation | 56.36 |
| Total OPA (%) | 58.12 |

Table 4.

Individual predictive accuracy (IPA) and overall predictive accuracy (OPA).

| Author variable | Freeman’s centrality (%) | Borgatti and Everett’s betweenness (%) |
|---|---|---|
| A1 = age | 6.67 | 9.36 |
| A2 = way of living | 4.44 | 8.17 |
| A3 = kind of job | 4.44 | 0.00 |
| A4 = employment status | 6.67 | 2.90 |
| A5 = educational level | 8.88 | 9.74 |
| A6 = income level | 4.44 | 3.07 |
| A7 = sociability | 8.88 | 11.36 |
| A8 = prior criminal record | 6.67 | 12.68 |
| A9 = history of substance abuse | 8.88 | 14.94 |
| A10 = history of psychological problems | 6.67 | 3.33 |
| A11 = stays in the scene | 4.44 | 0.00 |
| A12 = distance home-scene | 6.67 | 9.72 |
| A13 = displacement means | 2.23 | 0.00 |
| A14 = residence type | 2.23 | 0.00 |
| A15 = motivation | 17.77 | 14.68 |
| Total | 100.00 | 100.00 |

Table 5.

(Normalized) Freeman’s degree of centrality and Borgatti and Everett’s betweenness measure of the author variables.

3.2. Constructing archetypes

What has been explained above justifies the decision to base our archetypes of provoked forest fires on $A_{15}$. Therefore, we construct archetypes around motivation and, comparing them with those in Sotoca et al. [11], we see that they are consistent. To carry this out, we predict the query variables $C_1,\dots,C_{10},A_1,\dots,A_{14}$ by introducing as evidence each of the possible outcomes of variable $A_{15}$. Some of the crime variables and most of the author variables are insensitive, that is, their predicted values coincide for the five possible criminal motivations; these common values are collected in Table 6.

| Variable | Predicted value |
|---|---|
| C1 = season | Spring |
| C2 = risk level | High |
| C6 = number of seats | One |
| C7 = related offense | No |
| C9 = traces | No |
| C10 = who denounces | Particular |
| A1 = age | 46–60 |
| A2 = way of living | In couple |
| A3 = kind of job | Handwork |
| A4 = employment status | Employee |
| A5 = educational level | Elementary |
| A6 = income level | Medium |
| A7 = sociability | Yes |
| A8 = prior criminal record | No |
| A10 = history of psychological problems | No |
| A12 = distance home-scene | Medium |
| A14 = residence type | Town |

Table 6.

Common predicted values for all the five archetypes.

In the case of $C_1$ and $C_2$, this is not surprising since, as can be seen in Figure 1, they are related neither to $A_{15}$ nor to any other variable in our model. In agreement with common sense, for each of these two variables the most probable value is chosen, independently of the evidence variable $A_{15}$.

The explanation for each of the variables appearing in Table 6 that are sons of $A_{15}$, namely $C_6$, $C_7$, and $C_9$, is straightforward: we just have to look at its conditional probability table (CPT from now on), whose values are parameters of the BN model learned from data when constructing it, and observe that, conditioned on the different outcomes of $A_{15}$, the most probable value of each of them does not vary. Simply to illustrate, Table 7 is the conditional probability table of variable $C_6$ conditioned on $A_{15}$. The maximum probability corresponds to the same row in every column, that is, conditioned on any of the possible outcomes of $A_{15}$, the prediction of our model for $C_6$ is always “one”.

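| C6 \ A15 | Slight negligence | Gross negligence | Impulsive | Profit | Revenge |
|---|---|---|---|---|---|
| More | 0.06 | 0.10 | 0.33 | 0.41 | 0.37 |
| One | 0.94 | 0.90 | 0.67 | 0.59 | 0.63 |
| Total | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |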

Table 7.

Conditional probability table (CPT) of C6 conditioned on A15. For example, if A15 were “slight negligence”, the estimated probability of C6 = “one” is 0.94, that is, P(C6 = one | A15 = slight negligence) = 0.94, which is the maximum value of its column; hence “one” is the prediction for C6 conditioned on A15 = “slight negligence”.
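
Reading a prediction off a CPT, as done with Table 7, amounts to taking the row with the largest probability in the column selected by the evidence. The following is a minimal Python sketch of that step using the Table 7 values; the names `cpt_c6` and `predict` are illustrative assumptions and are not part of the authors' software.

```python
# CPT of C6 conditioned on A15, copied from Table 7.
# Outer key: outcome of A15 (the evidence); inner key: outcome of C6.
cpt_c6 = {
    "slight negligence": {"more": 0.06, "one": 0.94},
    "gross negligence": {"more": 0.10, "one": 0.90},
    "impulsive": {"more": 0.33, "one": 0.67},
    "profit": {"more": 0.41, "one": 0.59},
    "revenge": {"more": 0.37, "one": 0.63},
}


def predict(cpt, evidence):
    """Return the most probable child outcome for the given parent outcome."""
    column = cpt[evidence]
    return max(column, key=column.get)


# One prediction of C6 per possible motivation.
predictions = {a15: predict(cpt_c6, a15) for a15 in cpt_c6}
print(predictions)
```

Because the largest entry of every column lies in the same row, the sketch returns the same outcome for all five motivations, which is why C6 is listed among the common values.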

For the rest of the variables in Table 6, intuition is no longer reliable, since their relation with A15 is modeled through a chain of oriented arcs (a path). In general, the longer the path linking two nodes, the weaker their mutual influence, which would explain the presence of the author variables in Table 6.

Variables not appearing in Table 6 take different values depending on motivation, as Table 8 shows, and they are the ones from which we will describe our archetypes. Of the crime variables, C3, C4, and C8 are children of A15, while C5 is a grandchild. The CPTs of C3, C4, and C8 conditioned on A15, whose values are model parameters learned from data, give a straightforward prediction for each of these variables: the most likely value conditioned on each motivation type. With C5 and A13 we have to be more cautious. Readers interested in this aspect are referred to Appendix B.

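| Variables \ Archetypes | (1) Slight negli. (negligence) | (2) Gross negli. (negligence) | (3) Impulsive (intentional) | (4) Profit (intentional) | (5) Revenge (intentional) |
|---|---|---|---|---|---|
| C3 = start time | Afternoon | Afternoon | Afternoon | Afternoon | Evening |
| C4 = starting point | Crops | Crops | Pathway | Pathway | Pathway |
| C5 = main use surface | Agricultural | Agricultural | Forestry | Forestry* | Forestry |
| C8 = pattern | No | No | Yes | Yes | No |
| A9 = history subst. abuse | No | No | No | No | Yes |
| A11 = stays in the scene | Gives aid | No | No | No | No |
| A13 = displacement means | By car | By car | On foot | By car | On foot |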

Table 8.

Specific predicted values for each of the five archetypes (extended version).

* The second most likely outcome, “agricultural”, has a probability very close to that of “forestry”, as can be seen in Table 15, Appendix B.


3.3. Checking, improving, and reducing archetypes

It is worth checking the archetypes constructed in Table 8, which we do as follows. We ask whether, using as evidence the values of the variables in Table 8 for each archetype and motivation as query variable, the model predicts the concordant archetype. If so, the archetype is strengthened and, in a certain sense, validated. But this may not happen, because conditioning C4 on A15, for example, does not give the same probabilities as conditioning A15 on C4. To exemplify this, we set specific values for these variables, say “pathway” and “impulsive”, respectively, and see that

$$P(C_4=\text{pathway}\mid A_{15}=\text{impulsive}) \neq P(A_{15}=\text{impulsive}\mid C_4=\text{pathway}). \tag{2}$$

The reason becomes clear when we relate these two probabilities by Bayes’ theorem:

$$P(A_{15}=\text{impulsive}\mid C_4=\text{pathway}) = \frac{P(C_4=\text{pathway}\mid A_{15}=\text{impulsive})\,P(A_{15}=\text{impulsive})}{P(C_4=\text{pathway})}, \tag{3}$$

That is, these probabilities are related by means of the multiplicative factor

$$\frac{P(A_{15}=\text{impulsive})}{P(C_4=\text{pathway})} = \frac{0.1005}{0.1490} \approx 0.6745 \tag{4}$$

in this way:

$$P(A_{15}=\text{impulsive}\mid C_4=\text{pathway}) = 0.6745\times P(C_4=\text{pathway}\mid A_{15}=\text{impulsive}). \tag{5}$$
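
As a quick numerical check of the factor in (4) and of how it rescales a conditional probability, here is a small Python sketch. The two marginals are the ones quoted in (4); the value 0.31 for P(C4 = pathway | A15 = impulsive) is the two-decimal entry of Table 13 in Appendix B, so the resulting posterior is only approximate. Variable names are illustrative.

```python
# Marginal probabilities quoted in Eq. (4).
p_a15_impulsive = 0.1005   # P(A15 = impulsive)
p_c4_pathway = 0.1490      # P(C4 = pathway)

# P(C4 = pathway | A15 = impulsive): rounded entry taken from Table 13.
p_c4_given_a15 = 0.31

# Multiplicative factor relating the two conditional probabilities.
factor = p_a15_impulsive / p_c4_pathway

# Bayes' theorem: P(A15 = impulsive | C4 = pathway).
p_a15_given_c4 = factor * p_c4_given_a15

print(round(factor, 4), round(p_a15_given_c4, 4))
```

Under these rounded inputs the reversed conditional probability is noticeably smaller than 0.31, which illustrates why evidence on C4 alone need not point back to the motivation that predicted it.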

Table 9 shows the distribution of A15 conditioned on the evidence given by the values of the variables in Table 8. The predicted (most likely) value for A15 appears in boldface.

| Archetype | Evidence variables | Value | Conditioned distrib. of A15 to evidence |
|---|---|---|---|
| (1) | C3 = wildfire start time | Afternoon | Profit (0.25%) |
| | C4 = starting point | Crops | Gross negligence (0.00%) |
| | C5 = main use of surface | Agricultural | **Slight negligence (99.72%)** |
| | C8 = pattern | No | Impulsive (0.03%) |
| | A9 = history subst. abuse | No | Revenge (0.00%) |
| | A11 = stay in the scene | Gives aid | |
| | A13 = displacement means | By car | |
| (2) | C3 = wildfire start time | Afternoon | Profit (2.79%) |
| | C4 = starting point | Crops | **Gross negligence (96.75%)** |
| | C5 = main use of surface | Agricultural | Slight negligence (0.00%) |
| | C8 = pattern | No | Impulsive (0.31%) |
| | A9 = history subst. abuse | No | Revenge (0.15%) |
| | A11 = stay in the scene | No | |
| | A13 = displacement means | By car | |
| (3) | C3 = wildfire start time | Afternoon | **Profit (39.39%)** |
| | C4 = starting point | Pathway | Gross negligence (22.02%) |
| | C5 = main use of surface | Forestry | Slight negligence (0.00%) |
| | C8 = pattern | Yes | Impulsive (32.85%) |
| | A9 = history subst. abuse | No | Revenge (5.74%) |
| | A11 = stay in the scene | No | |
| | A13 = displacement means | On foot | (It should be Impulsive) |
| (4) | C3 = wildfire start time | Afternoon | **Profit (39.39%)** |
| | C4 = starting point | Pathway | Gross negligence (22.02%) |
| | C5 = main use of surface | Forestry/agricultural | Slight negligence (0.00%) |
| | C8 = pattern | Yes | Impulsive (32.85%) |
| | A9 = history subst. abuse | No | Revenge (5.74%) |
| | A11 = stay in the scene | No | |
| | A13 = displacement means | By car | |
| (5) | C3 = wildfire start time | Evening | Profit (4.56%) |
| | C4 = starting point | Pathway | Gross negligence (5.15%) |
| | C5 = main use of surface | Forestry | Slight negligence (0.00%) |
| | C8 = pattern | No | Impulsive (24.52%) |
| | A9 = history subst. abuse | Yes | **Revenge (65.77%)** |
| | A11 = stay in the scene | No | |
| | A13 = displacement means | On foot | |

Table 9.

Checking archetypes given in Table 8.

Looking at Table 8, we note that the only difference between the archetypes impulsive and profit is given by A13. Does this difference propagate to A15? Table 9 says no, since the conditional distributions of A15 for the corresponding evidence coincide, and we see that impulsive is the only archetype in Table 8 that is not confirmed by Table 9. Could we modify this archetype so that it fits the data better and yields an improved version? Indeed we can.

Let us go back for a moment to Table 8. Given evidence such as A15 = “profit”, we predict the query variables appearing in the table (and the rest as well) as if they were independent. This assumption makes the calculations for the predictions feasible; without it, they would be so large that they would easily exceed the computing capacity of a personal computer. But is it realistic? By the Markov condition, once A15 is known, independence among the variables in Table 8 can be assumed (approximately in the case of A9 and A13: although A13 is a descendant of A9, the length of the geodesic connecting them weakens the dependency), with one exception: C4 and C5. Fortunately, it is feasible to compute the joint probability distribution of C4 and C5 conditioned on A15 and to predict both jointly (that is, to take the pair of values that maximizes this joint distribution). This prediction improves on the one made separately under an independence assumption that is far from certain. For example, conditioned on A15 = “impulsive”, the combination of values of C4 and C5 that maximizes the joint probability distribution is C4 = “road” and C5 = “forestry”. Replacing C4 = “pathway” by C4 = “road” in archetype (3) of Table 9, we obtain the distribution of A15 conditioned on the evidence variables shown in Table 10.

| Modified archetype | Evidence variables | Value | Conditioned distrib. of A15 to evidence |
|---|---|---|---|
| (3) | C3 = wildfire start time | Afternoon | Profit (25.62%) |
| | C4 = starting point | Road | Gross negligence (13.22%) |
| | C5 = main use of surface | Forestry | Slight negligence (0.00%) |
| | C8 = pattern | Yes | **Impulsive (54.32%)** |
| | A9 = history subst. abuse | No | Revenge (6.84%) |
| | A11 = stay in the scene | No | |
| | A13 = displacement means | On foot | |

Table 10.

Checking modified archetype impulsive.

For the rest of the archetypes, the joint predictions of C4 and C5 are exactly the same as the separate ones made assuming independence, except for revenge. In this case, the joint prediction is C4 = “forest track” and C5 = “forestry”. If we substitute C4 = “forest track” for C4 = “pathway” while keeping C5 = “forestry” in archetype (5) of Table 9, the probability of predicting revenge increases from 65.77% to 76.45%.
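
The joint prediction of C4 and C5 can be sketched in Python as follows. The sketch assumes, as described in Appendix B, that C5 depends on A15 only through C4, so that P(C4, C5 | A15) = P(C5 | C4) × P(C4 | A15). It uses the two-decimal values of Tables 13 and 14, so it is only an approximation of what the full model computes, and it is run only for the two motivations discussed above. All names are illustrative.

```python
c4_levels = ["pathway", "road", "houses", "crops",
             "interior", "forest track", "others"]
c5_levels = ["agricultural", "forestry", "livestock",
             "interface", "recreational"]

# P(C4 | A15) for two motivations (columns of Table 13), ordered as c4_levels.
p_c4_given_a15 = {
    "impulsive": [0.31, 0.29, 0.02, 0.03, 0.09, 0.14, 0.12],
    "revenge": [0.33, 0.22, 0.04, 0.02, 0.04, 0.27, 0.08],
}

# P(C5 | C4) (Table 14): one row per C5 level, one column per C4 level.
p_c5_given_c4 = [
    [0.38, 0.18, 0.30, 0.75, 0.20, 0.11, 0.22],
    [0.21, 0.45, 0.17, 0.12, 0.48, 0.51, 0.30],
    [0.29, 0.18, 0.09, 0.11, 0.26, 0.28, 0.27],
    [0.07, 0.18, 0.34, 0.01, 0.03, 0.01, 0.12],
    [0.05, 0.01, 0.10, 0.01, 0.03, 0.09, 0.09],
]


def joint_prediction(motivation):
    """Return the (C4, C5) pair maximizing P(C4, C5 | A15 = motivation)."""
    best_pair, best_p = None, -1.0
    for j, c4 in enumerate(c4_levels):
        for i, c5 in enumerate(c5_levels):
            # Chain rule under the assumption A15 -> C4 -> C5.
            p = p_c5_given_c4[i][j] * p_c4_given_a15[motivation][j]
            if p > best_p:
                best_pair, best_p = (c4, c5), p
    return best_pair, best_p


for motivation in p_c4_given_a15:
    print(motivation, joint_prediction(motivation))
```

With these rounded tables the sketch selects “road” with “forestry” for impulsive and “forest track” with “forestry” for revenge, in agreement with the joint predictions reported in the text.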

Finally, for each archetype we can eliminate some of the variables without great loss: those that are superfluous in the sense that, if we do not include them as part of the evidence, the conditional probability of A15 does not change much and the prediction (the value that maximizes the probability) stays the same. The improved and reduced versions of the archetypes are given in Table 11. Naturally, the archetypes with the highest confidence level (CL) are those corresponding to the two types of negligence, which are the most frequently recorded motivations in the dataset. We summarize the main distinctive features of each archetype:

  • Negligence is characterized by the fire starting in crops and by the main use of the burned surface being agricultural. The only difference between slight and gross negligence is that in the first case the arsonist stays at the scene and gives aid, while in the second he does not. This is consistent with intuition, given that this type of fire is mainly caused accidentally by farmers.

  • Impulsive is characterized by the starting point of the fire, which is a road, and by the main use of the burned surface, which is forestry. As with profit, there is a pattern of action in the arsonist's criminal activity. In this case, the arson has no specific objective beyond the arsonist's impulse, so it is usually forest, and not other types of surface, that is burned. A road as starting point is characteristic of this archetype because it offers a fast escape route after setting the fire.

  • Profit is mainly characterized by the fire starting at a pathway and by the arsonist having no history of substance abuse, which is logical given that, unlike the previous archetypes, wildfires of this type are premeditated. The existence of a pattern of action is shared with impulsive.

  • Revenge is the only archetype in which the wildfire start time matters: it occurs in the evening. Moreover, it is just the opposite of profit, in the sense that for this archetype there is no pattern of action but the author does have a history of substance abuse. This suggests that this type of provoked forest fire is usually not the consequence of deliberate action; rather, it is carried out by a person under the effects of drugs who could be swayed by an impulsive feeling of rage.

| Archetype | Evidence variables | Value | Conditioned distr. of A15 | CL |
|---|---|---|---|---|
| (1) Slight negl. | | | Profit (0.87%) | 98.92% |
| | C4 = starting point | Crops | Gross negligence (0.00%) | |
| | C5 = main use of surface | Agricultural | **Slight negligence (98.92%)** | |
| | A11 = stay in the scene | Gives aid | Impulsive (0.19%) | |
| | | | Revenge (0.02%) | |
| (2) Gross negl. | | | Profit (6.78%) | 90.98% |
| | C4 = starting point | Crops | **Gross negligence (90.98%)** | |
| | C5 = main use of surface | Agricultural | Slight negligence (0.00%) | |
| | A11 = stay in the scene | No | Impulsive (1.69%) | |
| | | | Revenge (0.56%) | |
| (3) Impulsive | | | Profit (16.39%) | 59.32% |
| | C4 = starting point | Road | Gross negligence (5.60%) | |
| | C5 = main use of surface | Forestry | Slight negligence (9.23%) | |
| | C8 = pattern | Yes | **Impulsive (59.32%)** | |
| | | | Revenge (9.46%) | |
| (4) Profit | | | **Profit (32.63%)** | 32.63% |
| | C4 = starting point | Pathway | Gross negligence (10.75%) | |
| | C8 = pattern | Yes | Slight negligence (19.49%) | |
| | A9 = history subst. abuse | No | Impulsive (31.79%) | |
| | | | Revenge (5.34%) | |
| (5) Revenge | | | Profit (5.22%) | 51.81% |
| | C3 = wildfire start time | Evening | Gross negligence (11.03%) | |
| | C8 = pattern | No | Slight negligence (4.10%) | |
| | A9 = history subst. abuse | Yes | Impulsive (27.84%) | |
| | | | **Revenge (51.81%)** | |

Table 11.

Final archetypes: an improved and reduced version. The confidence level (CL) for each archetype is the probability of the outcome predicted for A15.

4. Conclusion

By using an ad hoc BN model learned from a dataset, we construct five archetypes for provoked forest fires. These archetypes are structured around arsonist motivation, which is the most central author variable in the model and plays an important role in psychological criminology, in accordance with the modes of operation in criminal activities of Shye’s model of action system [33]. We see that the model constructed from the dataset of solved provoked Spanish forest fires conforms to this theoretical model. Two archetypes correspond to the adaptive mode of operation: slight negligence and gross negligence, which are distinguished in that in the first the author stays at the crime scene and helps the firefighting teams, whereas in the second he does not. The remaining archetypes are impulsive, profit, and revenge, which correspond respectively to the integrative, expressive, and conservative modes of operation.

In addition, we obtain a ratification, in general terms, of the five archetypes introduced in Sotoca et al. [11], with some specificities obtained thanks to the great potential of the methodology used. Indeed, the constructed BN models the dependency relationships among the different variables (features of the wildfire and characteristics of the arsonist, including motivation), and it is precisely the understanding of these dependencies that makes it possible to obtain predictions about some variables (queries) from others (evidence) without giving up the complex relations that exist among them. In fact, the BN model captures this complexity and uses it efficiently.

The specificities of each archetype are given by the values of a reduced set of variables that characterize it, as stated in Table 11, which also records the confidence level of each archetype, that is, the probability of the prediction given the corresponding set of evidence. As expected, the best results in terms of the predictive capacity of the model correspond to the two types of negligence, which are the most commonly recorded motivations in the dataset, far ahead of the other three archetypes, which are much less frequent.

With this work we hope to highlight the usefulness of BN as an objective and quantitative methodology to obtain valuable information from the dataset, and its applicability in the study of criminal motivation and behavior in general and, in particular, of forest arsonists, helping to identify the authors and to study this phenomenon, so complex and with such serious consequences for the environment.

Acknowledgments

The authors wish to thank the Prosecution Office of Environment and Urbanism and the Secretary of State for Security of the Spanish state for providing data and promoting research.

R. Delgado and X.A. Tibau are supported by Ministerio de Economía y Competitividad, Gobierno de España, project ref. MTM2015 67802-P (MINECO/FEDER, UE).

A.1. Learning the BN

For the learning process of the BN we adopt a score-based structure learning method (greedy search-and-score), an algorithm that attempts to find the structure maximizing a score function. As usual, we choose the Bayesian Information Criterion (BIC) as score function. It is intuitively appealing because it contains a term measuring how well the model predicts the observed data when the parameters are set to their maximum likelihood estimates (the log-likelihood), and a term that penalizes model complexity. The algorithm searches through the space of possible network structures; at each step it considers the addition, removal, or reversal of an arc, given the structure of the previous step (with the constraint that the resulting graph be acyclic), and greedily chooses the option that maximizes the score function, stopping when no increase is possible. To compute the score of the model at each step, the algorithm only needs to recompute a few scores from the previous step (local score updating), which is a huge computational advantage. The problem with this algorithm is that the solution obtained may be a local (but not global) maximum of the score function. For this reason we use the “iterated hill-climbing” algorithm, which carries out a local search until a local maximum is reached, then randomly perturbs it and repeats the process. Finally, the maximum over the local maxima is used as a better approximation of the global maximum.
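
To make the score concrete, the following Python sketch computes a BIC-type score for a discrete BN structure as the maximized log-likelihood minus (log N)/2 times the number of free parameters. This is a minimal illustration under stated assumptions, not the authors' implementation (which, according to Appendix B, relies on R): the function name `bic_score`, the data layout, and the toy dataset are all hypothetical, and the levels of each variable are taken to be those observed in the data.

```python
import math
from collections import Counter


def bic_score(data, parents):
    """BIC of a discrete BN: log-likelihood at the MLE minus a complexity penalty.

    data: list of dicts mapping variable name -> observed value.
    parents: dict mapping each variable -> tuple of its parent variables.
    """
    n = len(data)
    # Assumption: the levels of a variable are the values seen in the data.
    levels = {v: {row[v] for row in data} for v in parents}
    loglik = 0.0
    n_params = 0
    for var, pa in parents.items():
        # Counts of (parent configuration, value) and of parent configuration.
        joint = Counter((tuple(row[p] for p in pa), row[var]) for row in data)
        marg = Counter(tuple(row[p] for p in pa) for row in data)
        for (cfg, _), count in joint.items():
            loglik += count * math.log(count / marg[cfg])
        # Free parameters: (levels - 1) per parent configuration.
        n_configs = 1
        for p in pa:
            n_configs *= len(levels[p])
        n_params += (len(levels[var]) - 1) * n_configs
    return loglik - 0.5 * math.log(n) * n_params


# Toy data (made up): Y always equals X.
toy = [{"X": 0, "Y": 0}] * 4 + [{"X": 1, "Y": 1}] * 4
with_arc = bic_score(toy, {"X": (), "Y": ("X",)})
without_arc = bic_score(toy, {"X": (), "Y": ()})
print(with_arc, without_arc)
```

In the toy example the structure with the arc X → Y scores higher than the empty graph, because the gain in log-likelihood outweighs the penalty for the extra parameter; a greedy search step would therefore add that arc.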

A.2. Validation

We perform a cross-validation procedure, a technique for assessing how well the BN model predicts a query variable (an author variable) from evidence given by the variables of an independent (future) wildfire. That is, we want to estimate the predictive accuracy of our model in practice. Concretely, we use leave-one-out cross-validation. Each round involves choosing a case (a different one every time), learning the BN model from the training set formed by the rest of the cases in the dataset, and validating that model on the chosen case. For that case, we use as evidence the values of the crime variables C1, …, C10 to predict each of the query variables A1, …, A15, and record the matches between the predictions and the real values of these variables. We thus perform N = 1597 rounds of cross-validation, one for each case in the dataset. We combine the matches over the N rounds to estimate the predictive accuracy for each author variable individually (“IPA”, Individual Predictive Accuracy) and globally (“OPA”, Overall Predictive Accuracy).

For each query variable, the IPA value is obtained by dividing the number of correct predictions by the total number of predictions (excluding blanks). The OPA value is obtained by dividing the total number of matches (10,543) by the total number of predictions (excluding blanks), which is 18,141. The result is an OPA of 58.12%, that is, we correctly predict an offender characteristic 58.12% of the time. Note that the total number of predictions is n × N = 15 × 1,597 = 23,955 (the number of predicted variables multiplied by the number of cases in the dataset), but only 18,141 of them are recorded, namely those for which the true outcome of the corresponding author variable was not missing. Of these, 10,543 match and the rest do not. Both the IPA and OPA values are recorded in Table 4.
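
The accuracy computation can be sketched in a few lines of Python. The helper `predictive_accuracy` and the toy lists are illustrative assumptions (missing true values are encoded here as `None`); the totals are the ones reported above.

```python
def predictive_accuracy(predicted, actual):
    """Share of correct predictions, skipping cases whose true value is missing."""
    pairs = [(p, a) for p, a in zip(predicted, actual) if a is not None]
    matches = sum(p == a for p, a in pairs)
    return matches / len(pairs)


# Toy check with made-up values: 2 of the 3 non-missing cases match.
toy = predictive_accuracy(["no", "yes", "no", "no"],
                          ["no", "no", None, "no"])

# Totals reported in the text.
n_variables, n_cases = 15, 1597
recorded, matches = 18141, 10543

total_predictions = n_variables * n_cases   # all (variable, case) pairs
blanks = total_predictions - recorded       # pairs with a missing true value
opa = 100 * matches / recorded              # Overall Predictive Accuracy, in %

print(total_predictions, blanks, round(opa, 2))
```

Applying the same helper to the predictions of a single author variable gives its IPA, while pooling all author variables gives the OPA.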

From this table we can see which wildfire arsonist characteristics are typically predicted correctly (IPA ≥ 70%): A3, A7, A8, A9, and A10. Note that all the author variables are predicted correctly more often than by chance alone, taking into account the number of levels of each one. They can therefore be used to narrow the list of suspects in an unsolved wildfire. It should also be borne in mind that, when making predictions with our model, we choose as prediction for a variable the outcome that maximizes the probability. This causes prediction failures when the second most likely outcome has a probability close to that of the first, which does happen with some of the variables and makes the accuracy lower than would be desirable.

Finally, we also compute the “DIPA” (Disincorporate Individual Predictive Accuracy), which is the percentage of correct predictions for each author variable, broken down by the prediction made for it from the evidence given by the crime variables. For example, for A15 the IPA (accuracy rate) is 56.36%. When the prediction for A15 is “slight negligence”, which happens 60.38% of the time, the accuracy rate is 61.29%, as recorded in Table 12, while when the prediction for A15 is “revenge”, which happens only 0.75% of the time, this rate plummets to 20.00%. We note that the most frequent prediction for A15 is “slight negligence”, which is also the motivation predicted most accurately. At the opposite end, the least frequent prediction is “revenge”, which is the motivation predicted least accurately.

| If prediction for A15 were… | % Accuracy in predicting A15 (DIPA) |
|---|---|
| Slight negligence (60.38%) | 61.29 |
| Gross negligence (22.51%) | 47.67 |
| Impulsive (9.90%) | 53.33 |
| Profit (6.46%) | 41.05 |
| Revenge (0.75%) | 20.00 |

Table 12.

Disincorporate Individual Predictive Accuracy (DIPA) for A15. For each outcome of A15, the percentage of times that the prediction for A15 is that value is given in parentheses.

A.3. The final model

The final BN model is the one obtained by learning from the whole dataset of N = 1597 cases, after the validation process. The corresponding structure is given by the DAG in Figure 1.

It is known that the performance of the algorithms used for learning BNs is unsatisfactory if the dataset does not have a sufficiently high number of cases. When can we say that the number of cases is large enough? It depends on the number of nodes and on the size of their domain, that is, the set of different possible instantiations of the set formed by all the nodes. Both the number of nodes and the size of their domain are known in practice. But the sufficiency of the number of cases also depends on the underlying probability distribution, which is usually unknown a priori.

Are our N = 1597 cases sufficient to learn the BN model? To study this issue, we generate random subsamples of size ranging from m = 25 to m = N in increments of 5, learn the model from each one, and compute the BIC score. We then plot the BIC score as a function of the subsample size (see Figure 2). In this case, a saturation point is reached before attaining N (at approximately 1250), beyond which the BIC score does not improve significantly as the subsample size increases. Consequently, the number of cases in the dataset does seem large enough to learn the BN.

Figure 2.

Evolution of the BIC score function as the number of training cases, m, increases to N.

In Section 3.2, we discussed the main idea behind constructing archetypes, illustrating it with a simple example. There we mentioned that it is very important to be cautious when applying intuition, since otherwise we could naively make the following erroneous argument: since the prediction for C4 is “crops” if A15 is either type of negligence and “pathway” for the rest of the values of A15, as can be seen in Table 13, and since the prediction for C5 is “agricultural” in both cases (Table 14), the prediction for C5 would be the same, “agricultural”, regardless of the motivation. This is not so. Indeed, since the geodesic joining A15 and C5 has length 2, passing through the single intermediate node C4, we can easily compute the probability of each value of C5 conditioned on A15 from the CPT of C5 conditioned on C4 (Table 14) and that of C4 conditioned on A15 (Table 13).

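| C4 \ A15 | Slight negligence | Gross negligence | Impulsive | Profit | Revenge |
|---|---|---|---|---|---|
| Pathway | 0.10 | 0.11 | 0.31 | 0.29 | 0.33 |
| Road | 0.04 | 0.04 | 0.29 | 0.11 | 0.22 |
| Houses | 0.07 | 0.05 | 0.02 | 0.01 | 0.04 |
| Crops | 0.38 | 0.35 | 0.03 | 0.14 | 0.02 |
| Interior | 0.15 | 0.16 | 0.09 | 0.16 | 0.04 |
| Forest track | 0.05 | 0.07 | 0.14 | 0.16 | 0.27 |
| Others | 0.21 | 0.22 | 0.12 | 0.13 | 0.08 |
| Total | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |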

Table 13.

Conditional probability table (CPT) of C4 conditioned on A15.

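| C5 \ C4 | Pathway | Road | Houses | Crops | Interior | Forest track | Others |
|---|---|---|---|---|---|---|---|
| Agricultural | 0.38 | 0.18 | 0.30 | 0.75 | 0.20 | 0.11 | 0.22 |
| Forestry | 0.21 | 0.45 | 0.17 | 0.12 | 0.48 | 0.51 | 0.30 |
| Livestock | 0.29 | 0.18 | 0.09 | 0.11 | 0.26 | 0.28 | 0.27 |
| Interface | 0.07 | 0.18 | 0.34 | 0.01 | 0.03 | 0.01 | 0.12 |
| Recreational | 0.05 | 0.01 | 0.10 | 0.01 | 0.03 | 0.09 | 0.09 |
| Total | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |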

Table 14.

Conditional probability table (CPT) of C5 conditioned on C4.

In this simple case it is possible to show the calculations, and we do them “by hand” to exemplify the procedure. For example, we can compute P(C5 = agricultural | A15 = slight negligence) by using the conditional law of total probability, conditioning on all the possible outcomes of C4:

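$$
\begin{aligned}
P(C_5=\text{agricultural}\mid A_{15}=\text{slight negligence})
={}& P(C_5=\text{agricultural}\mid C_4=\text{pathway})\,P(C_4=\text{pathway}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{road})\,P(C_4=\text{road}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{houses})\,P(C_4=\text{houses}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{crops})\,P(C_4=\text{crops}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{interior})\,P(C_4=\text{interior}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{forest track})\,P(C_4=\text{forest track}\mid A_{15}=\text{slight negligence}) \\
&+ P(C_5=\text{agricultural}\mid C_4=\text{others})\,P(C_4=\text{others}\mid A_{15}=\text{slight negligence}) \\
={}& 0.38\times 0.10+0.18\times 0.04+0.30\times 0.07+0.75\times 0.38 \\
&+0.20\times 0.15+0.11\times 0.05+0.22\times 0.21 \approx 0.43
\end{aligned}
\tag{6}
$$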

Similarly, we can find the rest of the conditional probabilities and write the CPT of C5 conditioned on A15 (Table 15), which coincides with the product of the matrices given by Tables 14 and 13, in this order. The highest probability in each column is in boldface and corresponds to the prediction for C5 given each evidence in terms of A15, as stated in Table 8. We can see that the prediction for C5 is “agricultural” only if the motivation is negligence (either slight or gross), and “forestry” otherwise.

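| C5 \ A15 | Slight negligence | Gross negligence | Impulsive | Profit | Revenge |
|---|---|---|---|---|---|
| Agricultural | **0.43** | **0.42** | 0.26 | 0.32 | 0.25 |
| Forestry | 0.26 | 0.27 | **0.35** | **0.33** | **0.36** |
| Livestock | 0.19 | 0.20 | 0.24 | 0.24 | 0.25 |
| Interface | 0.07 | 0.07 | 0.10 | 0.06 | 0.09 |
| Recreational | 0.05 | 0.04 | 0.05 | 0.05 | 0.05 |
| Total | 1.00 | 1.00 | 1.00 | 1.00 | 1.00 |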

Table 15.

Conditional probability table (CPT) of C5 conditioned on A15, computed by using the conditional law of total probability.
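
The matrix product behind Table 15 can be reproduced with a short Python sketch. It multiplies the two-decimal entries of Table 14 by those of Table 13, so individual entries may differ from Table 15 in the last digit (the published values presumably come from unrounded model parameters); the predicted outcome for each motivation is nevertheless the same. Names such as `t13`, `t14`, and `matmul` are illustrative.

```python
motivations = ["slight negligence", "gross negligence",
               "impulsive", "profit", "revenge"]
c5_levels = ["agricultural", "forestry", "livestock",
             "interface", "recreational"]

# Table 14: P(C5 | C4). Rows: C5 levels; columns: pathway, road, houses,
# crops, interior, forest track, others.
t14 = [
    [0.38, 0.18, 0.30, 0.75, 0.20, 0.11, 0.22],
    [0.21, 0.45, 0.17, 0.12, 0.48, 0.51, 0.30],
    [0.29, 0.18, 0.09, 0.11, 0.26, 0.28, 0.27],
    [0.07, 0.18, 0.34, 0.01, 0.03, 0.01, 0.12],
    [0.05, 0.01, 0.10, 0.01, 0.03, 0.09, 0.09],
]

# Table 13: P(C4 | A15). Rows: C4 levels (same order); columns: motivations.
t13 = [
    [0.10, 0.11, 0.31, 0.29, 0.33],
    [0.04, 0.04, 0.29, 0.11, 0.22],
    [0.07, 0.05, 0.02, 0.01, 0.04],
    [0.38, 0.35, 0.03, 0.14, 0.02],
    [0.15, 0.16, 0.09, 0.16, 0.04],
    [0.05, 0.07, 0.14, 0.16, 0.27],
    [0.21, 0.22, 0.12, 0.13, 0.08],
]


def matmul(a, b):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


# P(C5 | A15) by the conditional law of total probability.
t15 = matmul(t14, t13)

# Most probable C5 outcome for each motivation (column-wise argmax).
predictions = {
    m: c5_levels[max(range(len(c5_levels)), key=lambda i: t15[i][j])]
    for j, m in enumerate(motivations)
}
print(round(t15[0][0], 2), predictions)
```

The first entry reproduces the hand calculation in (6), and the column-wise maxima give “agricultural” for the two negligence motivations and “forestry” for the other three.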

On the other hand, for A13 the chain of dependencies is more subtle and much harder to follow by hand, so we do not attempt it and only carry out predictions by using the BN model with R.

Notes

  • Delgado R, Tibau XA. “PerfilNet.Pyros: Expert System based on Bayesian networks for the prediction of criminal profiles in forest fires”. Register on June 10, 2016 of authorship at the “Benelux Office for Intellectual Property” (BOIP), with reference number i-depot number: 088029.

© 2018 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Rosario Delgado, José-Luis González, Andrés Sotoca and Xavier-Andoni Tibau (May 9th 2018). Archetypes of Wildfire Arsonists: An Approach by Using Bayesian Networks, Forest Fire, Janusz Szmyt, IntechOpen, DOI: 10.5772/intechopen.72615. Available from:
