Although the study of complex systems has increased significantly in recent years, it remains a great challenge for science, both theoretically and practically. The complexity is based on the fundamental idea that a system is different from the sum of its parts. In the real world, we observe complex phenomena that greatly influence society and/or the environment. As a result, several approaches, more and more sophisticated, were designed to model and simulate complex systems. We need to find ways to understand these phenomena, especially if we have to take actions in order to limit their damage or increase their benefits. Indeed, modelling and computer simulation are used to virtually reproduce one or several phenomena in order to study them. A computer simulation consists in designing models, implementing these models and analyzing the results of their execution (Fishwick, 1995). In addition, the simulation can be used to develop tools for decision support. We are particularly interested in using modelling and computer simulation to help public health policy makers to better understand the spread of infectious diseases. These diseases are the result of the transmission of a pathogen (e.g. virus, bacteria) from an infected individual ("host": human or animal) to a healthy individual. Moreover, the expansion of some zoonoses (diseases transmitted from animal to human) such as the West Nile virus (WNV) forced public health authorities to develop monitoring systems. These systems brought together field data on human and animal infection (Gosselin et al., 2005). While these monitoring activities were undertaken to better understand the epidemiology of the disease and the level of risk it can represent for the human population, they do not allow for forecasts of the probable propagation of the zoonosis on the territory. 
Such a forecast, if it proved to be reliable, would allow public health authorities to initiate preventive actions at the right time and places and at the appropriate level of expected risk. However, it remains difficult to determine the at-risk areas on a scientific basis and the efficacy of such measures has been challenged (Ruiz et al., 2004), not to mention their high cost and environmental impacts. The identification of vulnerable zones and risk levels in due time remains a significant challenge for public health management due to the complexity of the phenomena related to the disease transmission.
Several approaches have been proposed to model and simulate the spread of infectious diseases. However, approaches such as mathematical modelling, cellular automata and traditional multi-agent systems show weaknesses when used to model and simulate the influence of geographic and climatic features on disease spread and the spatio-temporal interactions of various kinds of actors (i.e. mosquitoes, birds, mammals and humans in the WNV case). Indeed, simulation based on mathematical models, which generally use differential equations (Bowman et al., 2005), does not take into consideration the geographical space in which populations operate, except in certain cases such as patchy models (Liu et al., 2006). Although a simulation based on cellular automata models the evolution of the spatial characteristics of a geographic area affected by the disease, it does not represent individuals and their mobility (White et al., 2009). Traditional agent-based simulations of epidemics, on the other hand, represent the disease vectors (e.g. animals) as agents, but usually do not take advantage of data provided by Geographic Information Systems (GIS) to properly locate the agents in the geographic space (Emrich et al., 2007). Besides, to be useful for practical decision-making, a system simulating an epidemic should let the user specify various scenarios in the context of a “what-if” analysis (Haddad & Moulin, 2008) in order to explore, for instance, the influence of climate change and of various intervention strategies. Hence, there is a need for a simulation approach capable of modelling: 1) the various actors involved in an epidemic; 2) their locations in space based on accurate GIS data; 3) their interactions in space and time. Moreover, such simulations also need to deal with large (or very large) populations of various species (including humans in certain cases) and their biological cycles. 
The multi-agent geosimulation approach (see Section 2.2) can be used to address these needs (Moulin et al., 2003; Benenson & Torrens, 2004; Hu et al., 2008). However, this approach has some limitations since it does not integrate the different levels of granularity at which the phenomenon can be observed by policymakers. Indeed, a multi-level system can help us broaden our understanding of a complex phenomenon. Besides, new properties of this phenomenon can appear when the level of granularity changes, especially if data are available to do so. Moreover, the selection and specification of these levels of granularity influence the results of the simulation. In this context, we recommend using a multi-level geosimulation approach to remedy the shortcomings of current methods. We acquired some experience with the development of a public health management tool that simulates in a plausible way the behaviours and interactions of populations of indicator birds and of mosquitoes involved in the propagation and transmission of the WNV. Our approach takes into account the characteristics of the geographic environment and enables the user to explore various climatic scenarios and regimens of larvicide treatments. We are currently exploring avenues to produce a generic solution which can be applied to other zoonoses such as Lyme disease. To this end, we are reengineering our tool and approach in order to produce more realistic simulations at different levels of granularity. We present in the next section an overview of complex systems and the various approaches used to model and simulate such systems. In Section 3 we present an overview of the spread of infectious diseases and the particular approaches used to model and simulate zoonosis propagation. In Section 4 we present the multi-level geosimulation approach that we propose. 
In Section 5 we explain how such an approach has been used to take into account the peculiarities of the animal populations involved in the WNV propagation. In Section 6 we present our current work including the reengineering of our system and how we plan to develop a generic solution which might be applied to other zoonoses such as Lyme disease. We conclude this chapter with some recommendations.
2. Modelling and Simulation of Complex Systems
In this section, we present an overview of complex systems in order to understand and characterize them. We also try to explain the concept of complexity, which is a source of controversy among scientists. We subsequently present the main approaches and methods used to model and simulate this kind of system. The figures that we present in this section and the next sections are our proposals of synthetic views of phenomena of interest.
2.1. Overview of Complex Systems
The real world offers a large variety of complex systems ranging from the infinitely small to the infinitely large. In addition, the expansion of new technologies and the emergence of intelligent machines encourage us to make more sophisticated systems (Axelrod & Cohen, 1999). Therefore, the study of complex systems has become a large discipline. However, there is no clear definition of complex systems, since authors do not fully agree on the notion of complexity. They do seem to agree on the difference between a complex system and a complicated one. Indeed, a system is not necessarily complex just because we do not understand the processes or factors that are involved in it. It may simply be complicated, depending on the degree of understanding of the observer or of the user of the system. Etymologically speaking, the word "complicated" (from the Latin cum plicare, to fold together) means that it takes time and talent to understand the object of study, while the word "complex" (from the Latin cum plexus, woven together) means that there are many intricacies, that "everything is connected" and that we cannot study a small part of the system in isolation. Thus, complex systems are usually complicated, but the opposite is not necessarily true. Some authors describe a complex system by the following three properties: (1) it has many components, (2) its behaviour is not immediately foreseeable, and (3) it exhibits emergent self-organized properties (Murray, 1995). Thus, a complex system has several characteristics. Among the most important ones are self-organization and the emergence of coherent structures, such as the appearance of certain motifs at a higher level (Parrott, 2002). To better understand such phenomena, we can mention the example of a bowl filled with rice and raisins. If the container is shaken, we can easily notice that the raisins will gather together to form a group over the rice. 
Thus, this group emerged as a result of the interactions of the system components.
2.2. Approaches to Model and Simulate Complex Systems
Designers use modelling and computer simulation to virtually reproduce one or several phenomena in a computer for analysis purposes. A real phenomenon can be represented by one or several complex systems, since it can be modelled according to several points of view and according to the vision of the observer of the real world. It can also be modelled using different techniques and/or different approaches (Figure 1). Besides, modelling a complex phenomenon is the first step before simulating it in a computer in order to understand it and analyze it. Indeed, computer simulation can be defined as a technique used to mimic the behaviour of a system. This process consists of three main interrelated steps. The first step is the design or selection of models that can represent the studied phenomenon. This includes identifying and collecting data that will be used to feed the system to simulate. Some data are obtained using specific sensors or through human collection. Other data are obtained by interviewing experts of the domain or by applying knowledge acquisition techniques. The design of models is therefore based on these data and on knowledge gained from previous experiences with similar systems. The second step is the implementation of these models in a computer. Finally, the last step is the analysis of the results from the simulation of these models. Tests are done on the data generated by the above mentioned models, using for example statistical analysis. The most basic analysis would be to just observe the data and derive conclusions (Fishwick, 1995).
Besides, several approaches have been proposed to simulate complex phenomena. Among the approaches to simulate nonlinear continuous systems, we can mention mathematical models and system dynamics. The simulation based on mathematical models is schematically carried out in five steps. (1) We start by defining the physical problem to be simulated. (2) We then describe this problem using a system of differential equations and a properly chosen set of boundary conditions. (3) We replace the differential equations by algebraic equations. The numerical resolution of these equations can provide solutions that adequately describe the physical reality of the system. (4) We solve the algebraic equations using numerical algorithms chosen according to their computational efficiency. (5) Finally, we test the numerical model in order to confirm that the selected algorithms converge towards a satisfactory solution (Farge, 1988). On the other hand, system dynamics is an approach which deals with internal feedback loops, stocks, flows and time delays that affect the behaviour of the entire system. In order to use this approach, we have to begin by identifying all the elements of the problem that can be represented as system variables. This is the step of causal analysis, which aims at obtaining a simple qualitative model representing the system by some feedback loops. Then, we have to identify which of the system variables appear to be accumulating. These are the state variables, also called "levels" by reference to the level of liquid in a container. We also have to identify the flows that empty or fill each level. In addition, we have to identify the variables that influence these flows, which are typically information or decision variables. We then go through a stage of formalization and quantification using differential equations that can represent the system dynamics as continuous change. 
Finally, we have to validate and calibrate the model (Kirkwood, 1998).
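As a minimal sketch of the ideas above, the following Python fragment integrates a single "level" that is filled by a constant inflow and emptied by an outflow proportional to the level itself. The differential equation d(level)/dt = inflow - outflow_rate * level is replaced by an algebraic update (explicit Euler), in the spirit of steps (3) and (4). The function name, the parameters and the numerical values are illustrative assumptions and are not taken from the cited works.

```python
def simulate_stock(initial_level, inflow, outflow_rate, dt, steps):
    """Explicit Euler integration of d(level)/dt = inflow - outflow_rate * level.

    All names and values are hypothetical; they only illustrate one stock
    (level) with one filling flow and one emptying flow.
    """
    level = initial_level
    history = [level]
    for _ in range(steps):
        # Net flow into the level at the current time step.
        net_flow = inflow - outflow_rate * level
        # Algebraic update replacing the differential equation.
        level += net_flow * dt
        history.append(level)
    return history


# Example run: the level should approach inflow / outflow_rate = 20.
levels = simulate_stock(initial_level=0.0, inflow=10.0,
                        outflow_rate=0.5, dt=0.1, steps=200)
print(round(levels[-1], 2))
```

Checking that the computed level settles near the analytical equilibrium (inflow divided by outflow rate) is a simple instance of step (5), the convergence test.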
Moreover, cellular automata are considered a standard approach to study complex systems. Indeed, a cellular automaton represents a grid of "cells" that can each take a "state" among a finite set. The state of a cell ci at time t + 1 depends on the state at time t of a finite number of cells called the "neighborhood" of ci. The advantage of cellular automata compared to the above mentioned approaches (mathematical models and system dynamics) is that they add a spatial component to the simulation. However, there are two limits to the use of cellular automata. The first is that the grid is usually artificial (not related to the studied phenomenon). This drawback has been circumvented by implementing cellular automata on irregular grids such as the Voronoi diagram (Shi & Pang, 2000). The second limit is that cellular automata cannot manage individuals and their mobility in the geographic environment. This is an important constraint when considering social phenomena in which individuals’ mobility needs to be simulated. The traditional agent-based approach tries to solve this problem by simulating the individuals as agents. Thus, the advantage of multi-agent systems compared to cellular automata is that they explicitly take into account the trajectories of each individual or group of individuals in a virtual geographic environment (VGE). In such an approach, agents are able to navigate and explore in the VGE, because their spatial behaviour is not constrained by a grid of cells (Badariotti & Weber, 2002).
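The update rule of a cellular automaton described above (the state of a cell at time t + 1 depends on the states of its neighborhood at time t) can be sketched as follows. This is only an illustration under stated assumptions: a regular square grid, two states (0 for susceptible, 1 for infected), a four-cell von Neumann neighborhood and a deterministic contagion rule. None of these choices comes from the cited works.

```python
def step(grid):
    """One synchronous update of a two-state cellular automaton.

    Assumed rule (hypothetical): a susceptible cell (0) becomes infected (1)
    if at least one of its four von Neumann neighbours was infected at time t.
    """
    rows, cols = len(grid), len(grid[0])
    # Copy the grid so that every cell is updated from the states at time t.
    new_grid = [row[:] for row in grid]
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] == 0:
                neighbours = [(r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)]
                if any(0 <= i < rows and 0 <= j < cols and grid[i][j] == 1
                       for i, j in neighbours):
                    new_grid[r][c] = 1
    return new_grid


grid = [[0, 0, 0],
        [0, 1, 0],
        [0, 0, 0]]
after_one = step(grid)
after_two = step(after_one)
```

Note that the cells only change state in place; nothing in this structure represents an individual moving from one cell to another, which is the second limit mentioned above.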
Besides, multi-agent geosimulation (MAGS) provides a new kind of simulation based on a combination of various techniques and theories (cellular automata, multi-agent systems, etc.) and might offer a unique perspective that is lacking in traditional simulations. Moreover, the MAGS approach emerged in response to deficiencies of traditional multi-agent systems. In fact, multi-agent geosimulation has the advantage of structuring the spatial knowledge of the environment using data provided by geographic information systems. In addition, MAGS extends the scope of traditional simulations that aim to predict results from a set of hypotheses by allowing the user to specify various scenarios and to assess and compare their outcomes. Thus, multi-agent geosimulation becomes a tool for decision support (Moulin et al., 2003; Benenson & Torrens, 2004). Let us mention here the MAGS platform which has been developed by our research group (Moulin et al., 2003). It can simulate thousands of software agents interacting in a virtual geographic environment. The agents have spatial and cognitive abilities such as perception, memory and navigation. Although one of the first applications of MAGS was the simulation of crowd behaviours in urban environments, it is a generic platform allowing the simulation of several types of behaviours in geo-referenced virtual environments. It has been used for example to simulate the behaviour of consumers visiting a shopping center, road traffic or the propagation of forest fires (Sahli et al., 2004; Moulin, 2008). Besides, the MAGS system is composed of several modules performing various tasks, including a module used to simulate particle systems (Reeves, 1983). This module was added as part of a former work (Bouden, 2004) to simulate irregular shapes such as smoke or gas spreading through the simulation environment. 
Although particle systems were initially used to simulate tear gas for crowd simulation, their scope has been greatly expanded in MAGS system. Indeed, particle systems can be used to simulate animal behaviours such as flight of birds, moving herds and fish schools (Reynolds, 1987). It is precisely one of the reasons that led us to use MAGS system, and its particle system to simulate the behaviour of birds involved in the transmission of the WNV (see Section 5.3).
However, the majority of current simulation approaches, such as those presented in this section, ignore the multi-level aspect of complex phenomena and thus, in many cases, are not able to capture some important aspects of these phenomena. Indeed, multi-level systems can simulate phenomena at different levels of granularity. Each level provides a different degree of precision to model, simulate and analyse the phenomenon (An et al., 2005). Depending on the expectations of the users, a phenomenon can be studied at three different levels of detail (Macro, Meso and Micro): (1) Macroscopic models represent the phenomenon at the highest abstraction level. They are often used to assess the global dynamics of aggregates of individuals located on a vast territory. (2) Mesoscopic models are used to simulate groups of individuals through aggregated behaviours. (3) Microscopic models are used to simulate individuals characterized by relatively detailed behaviours and able to interact with and perceive the environment. Moreover, the level of granularity may be related to different spatial scales when simulating a phenomenon at a macro, a meso or a micro level. It can also be linked to changes in the time scale. For example, people’s change of residence occurs over years, while the variation of traffic on a road section occurs every minute (Jakovljevic & Basch, 2004). Multi-level systems and their different levels of granularity can be modeled using the holonic approach. The term "Holon" comes from the Greek word "Holos" meaning "whole" and the suffix "-on" which means "part". Indeed, a holon can be thought of as a fractal structure that is stable, coherent, and consists of several holons as sub-structures, each holon being part of another larger holon. Most agent-based systems consider agents' interactions from a micro level perspective. However, a group or a population of agents, at a certain level of abstraction, can behave as a separate entity. 
Many approaches have attempted to model the concept of agents composed of agents as collective agents or meta-agents. However, considering such agents as holons is a promising approach that has continued to evolve, especially to model complex systems (Rodriguez et al., 2006).
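To make the recursive nature of a holon more concrete, the sketch below represents a holon as a structure that may contain other holons, so that the same entity can be read at a micro level (individuals), a meso level (groups) or a macro level (population). The class, its fields and the example names are hypothetical and are given only for illustration; they do not reproduce any published holonic framework.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Holon:
    """A holon is both a whole (it has parts) and a part of a larger holon."""
    name: str
    size: int = 0  # individuals represented directly at this level
    parts: List["Holon"] = field(default_factory=list)

    def total_size(self) -> int:
        # Aggregate view: individuals held here plus those of all sub-holons.
        return self.size + sum(part.total_size() for part in self.parts)

    def depth(self) -> int:
        # Number of nested levels, counting this holon as one level.
        return 1 + max((part.depth() for part in self.parts), default=0)


# Hypothetical example: one group is detailed down to individuals,
# the other is kept as an aggregate of 40 individuals.
group_a = Holon("group A", parts=[Holon("individual 1", size=1),
                                  Holon("individual 2", size=1)])
group_b = Holon("group B", size=40)
population = Holon("population", parts=[group_a, group_b])

print(population.total_size(), population.depth())
```

The example mixes levels of detail on purpose: a simulation could refine one group into individual agents while keeping another one aggregated, depending on the available data.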
3. Spread of Infectious Diseases
In this section, we provide an overview of phenomena related to the spread of infectious diseases, by trying to define and characterize them. We also present two types of diseases that will be used for illustrative purposes. Then, we present the main approaches and tools that are currently used to simulate the spread of these diseases.
3.1. Overview of Infectious Diseases
Infectious diseases are among the leading causes of death on the planet, and their proliferation may be aggravated by global warming. Indeed, millions of people die each year worldwide as a result of an infection. The list of such diseases that punctuate the history of human health is long. Some diseases recently resurfaced with the proliferation of international trade and travel or with the increase of resistance to antibiotics. Others appeared with the emergence of previously unknown infectious agents. Major diseases such as AIDS, tuberculosis, malaria and measles continue to weigh heavily on economies and societies in the world, especially in developing countries. For several of these diseases, there is still no drug, vaccine or other effective treatment. As already mentioned in the introduction, infectious diseases result from the transmission of a pathogenic micro-organism (bacterium, virus, fungus or parasite) from an infected individual (host: human or animal) to a healthy individual. We are particularly interested in zoonotic diseases that can be transmitted from animals to humans via a vector, which is responsible for the spread of the disease. Such a vector is most often an arthropod (e.g. insect, tick, etc.).
As an example of a zoonosis of interest, WNV is a flavivirus which was isolated for the first time in 1937. Its name comes from the district of West Nile in Uganda. It was detected in humans, birds and mosquitoes in Egypt in the early fifties, and has since been found in various European countries. WNV was detected on the American continent in 1999, more specifically in New York (Nash et al., 2001). In Canada, WNV reached southern Ontario in 2001, while the first human cases were detected in August 2002. WNV made its appearance in Quebec in July 2002 (Gosselin et al., 2005). There are mainly two populations involved in the transmission of the WNV: the population of mosquitoes (Culex sp.) and the population of birds (we mainly consider the Corvidae family and more specifically crows, which have been chosen by public health authorities as indicator birds for the WNV). The transmission of the WNV occurs mainly when mosquitoes bite birds. An infected mosquito can infect a bird, which can in turn infect healthy mosquitoes that subsequently bite the infected bird before its death (Bouden et al., 2008).
Another example of a zoonosis of interest is Lyme disease, a borreliosis caused by a bacterium (Borrelia burgdorferi) that is carried and transmitted to humans by ticks (Ixodes scapularis). The first description of this disease was made in the United States in 1977 in the town of Lyme, Connecticut. Ticks generally live in wooded areas or tall grass. Small rodents and certain types of birds (especially migratory species) are considered the natural reservoirs of the bacterium. Moreover, the white-tailed deer (Odocoileus virginianus) is the most common host for the adult stage of ticks (Ogden et al., 2005; Ogden et al., 2008).
3.2. Approaches and Tools to Simulate Zoonosis Propagation
We have already presented in Section 2.2 the basics of the main approaches that are used to model and simulate complex systems. We present in Figure 2 our synthetic view of the approaches that are currently used to model and simulate the spread of infectious diseases. Indeed, mathematical models are frequently used to study the propagation of zoonoses. We can mention three kinds of mathematical models: (1) Compartment models are the basis of mathematical modelling in epidemiology. For example, the two-compartment model (SI) considers only the susceptible and infected individuals. This is the simplest model, but there are other more complex models involving several parameters such as the SIS, SEI, SEIS, SEIR and SEIRS models. The compartment "E" represents exposed individuals which are not yet contagious because the pathogen needs an incubation period. The compartment "R" represents recovered individuals which, in some instances, develop some immunity to the infection (Noël, 2007). (2) Patchy models attempt to simulate "patterns of spatial dispersion" of disease spread. These patterns reflect the presence of the disease in areas not necessarily contiguous in space. The patches are characterized, in most cases, by administrative regions, whose number and boundaries are selected based on the availability of data to feed the models (Liu et al., 2006). (3) Metapopulation models, in epidemiology, are graphs in which each vertex is associated with a system of differential equations. Vertices are also called "patches". Indeed, a patch is a unit within which the population is considered homogeneous. Such a patch may also represent a geographic location (e.g. city, region, country, etc.). The patches may or may not overlap. They can be contiguous or separated in space. They are normally connected by the movement of species between patches, represented by arcs connecting the vertices of the graph. 
Therefore, each vertex (or each patch) contains a number of sub-populations of species (Arino, 2009).
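As an illustration of the compartment models mentioned above, the following sketch integrates a standard SEIR model with an explicit Euler scheme. It assumes a closed, homogeneously mixed population; the parameter names (beta for the transmission rate, sigma for the inverse of the incubation period, gamma for the recovery rate) follow common textbook usage, and the numerical values are arbitrary assumptions that do not describe WNV or Lyme disease.

```python
def seir_step(s, e, i, r, beta, sigma, gamma, dt):
    """Advance the four compartments by one explicit Euler step."""
    n = s + e + i + r
    new_exposed = beta * s * i / n * dt   # S -> E (contact with infected)
    new_infectious = sigma * e * dt       # E -> I (end of incubation)
    new_recovered = gamma * i * dt        # I -> R (recovery)
    return (s - new_exposed,
            e + new_exposed - new_infectious,
            i + new_infectious - new_recovered,
            r + new_recovered)


def simulate_seir(s0, e0, i0, r0, beta, sigma, gamma, dt, steps):
    """Return the list of (S, E, I, R) states, including the initial one."""
    state = (s0, e0, i0, r0)
    trajectory = [state]
    for _ in range(steps):
        state = seir_step(*state, beta, sigma, gamma, dt)
        trajectory.append(state)
    return trajectory


# Hypothetical scenario: 1000 individuals, 10 of them initially infected.
trajectory = simulate_seir(990.0, 0.0, 10.0, 0.0,
                           beta=0.5, sigma=0.2, gamma=0.1,
                           dt=0.1, steps=1000)
```

A metapopulation model could be sketched in the same spirit by attaching one such system to each patch and adding migration terms along the arcs of the graph; this extension is not shown here.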
As an alternative approach, cellular automata are often used to simulate the spread of infectious diseases (Fu & Milne, 2003; Beauchemin et al., 2005; Liu et al., 2006b). Let us mention, for example, the recent work of White et al. (2009) who used a two-dimensional cellular automaton to simulate the spread of a generic infectious disease. Although promising, this work lacks a careful calibration of the models (White et al., 2009). Moreover, other studies have simulated the spread of infectious diseases using traditional multi-agent systems (Emrich et al., 2007; Deng et al., 2008; Bauer et al., 2009). Let us mention, for example, the work of Liu et al. (2008) who proposed an agent-based model to simulate the spatio-temporal transmission process of an epidemic. These authors used four groups of agents: (1) susceptible agents, (2) exposed agents, (3) infected agents and (4) recovered agents (Liu et al., 2008). However, this kind of simulation cannot be used to simulate a population with a large number of individuals.
Furthermore, authors who developed mathematical models in order to simulate the spread of infectious diseases typically use tools such as Stella, Powersim, Vensim or AnyLogic. They often use a system dynamics approach (presented in Section 2.2) in order to represent and simulate their models. For example, Ogden et al. (2005) used Stella in order to model the influence of temperature on the evolution of tick populations which are responsible for the spread of Lyme disease. Alternatively, other authors develop new tools or new components for existing tools such as STEM (Spatiotemporal Epidemiological Modeler: www.eclipse.org/stem) or SELES (Spatially Explicit Landscape Event Simulator: www.seles.info). Although these tools are interesting, they still present some constraints since their use requires technical skills and their execution is usually very slow. Moreover, these tools cannot be used to model phenomena at different levels of granularity. The generic approach that we present in the next section aims to remedy the shortcomings of current methods and tools.
4. Presentation of our Approach: Multi-Level Geosimulation
In this section, we present an overview of our multi-level geosimulation approach. Indeed, we propose a multi-model approach which can simulate the propagation of infectious disease at different levels of granularity. This new approach aims at overcoming the drawbacks of existing methods when used alone and benefiting from their advantages when used together. We also present the model that we propose to simulate the spatio-temporal interactions of actors of various types, including those representing populations containing a large number of individuals.
4.1. Overview of our Approach
Before presenting our approach, we would like to explain the usefulness of a tool for decision support since it is one of the main goals of our work. Indeed, we showed in Section 3.2 (Figure 2) that policymakers observe the spread of infectious diseases, using monitoring systems, before they can decide how, when and where to act in order to intervene on the spread phenomenon. However, it is not easy to make informed decisions in order to establish a strategic, tactical or operational plan if decision makers rely only on the observation of the phenomenon. Thus, there is a need for decision support tools which are able to simulate the phenomenon under various alternative scenarios of intervention. Using such tools, decision makers may specify different scenarios and carry out simulations in order to understand the phenomenon and analyze the simulation results (Figure 3).
In Figure 4, we present an overview of various simulation approaches that might be used to develop a support tool allowing decision makers to create action plans at different levels of abstraction (strategic, tactical and/or operational). Indeed, the simulation based on mathematical models can only give results at a macro level. It may help to establish guidelines for actions at the strategic level (e.g. political decisions). As examples of such mathematical models, we have already mentioned compartment models, patchy models and metapopulation models (Section 3.2). The last two types of models use an aggregated space that is not based on GIS data. On the other hand, the multi-level geosimulation uses a geo-referenced virtual geographical environment generated from GIS data. Moreover, it is characterized by several aspects (e.g. using scenarios, analyzing the simulation results, using mathematical models and data to feed them, etc.). In addition, given that this approach should produce simulations at different levels of granularity (e.g., Macro, Meso and/or Micro), it will not only help policymakers to establish guidelines for action at the strategic level, but also help tactical or operational decision makers to develop plans for intervention. Besides, surveillance systems cannot make predictions of the probable spread of an infectious disease in order to initiate preventive action at the right time. However, these systems are essential to the multi-level geosimulation since they are used to calibrate the data feeding the mathematical models (Figure 4).
Moreover, we suggest that in a multi-level approach the choice of levels depends on three main factors: (1) the users’ needs in relation to their understanding of the phenomenon, (2) the availability of models representing the actors and their behaviours and (3) the availability of data to feed these models. Furthermore, these levels may vary with: (1) the spatial scales of the geo-referenced virtual geographic environment, (2) the temporal scales characterizing the steps of the simulation and (3) the different categories of actors (individuals, groups or populations) involved in the phenomenon. For example, the temporal scale can be used differently depending on the simulated disease. Indeed, in the case of WNV, using a day or at most a week for the simulation steps appears satisfactory for the needs of public health decision makers, since the Culex mosquitoes that are involved in the spread of WNV have a relatively rapid life cycle (a few weeks). However, the case of Lyme disease is different. The life cycle of the ticks responsible for spreading the disease is much longer (2.5 years on average). Hence, a simulation using for example a month as a simulation step should be considered so that decision makers can quickly grasp the evolution of the populations of ticks.
Thus, an infectious disease can be simulated using one or several levels of granularity. We present in Figure 5, different combinations of levels of granularity using three axes representing three dimensions: Spatial Scale (SS), Time Scale (TS) and level of granularity of actors' categories (LGAC). We know that the different levels (Macro, Meso and Micro) belonging to the same dimension are determined by the three main factors that have been already mentioned (the user's needs, availability of models and data feeding these models). Moreover, given a simulation carried out at a particular dimension (e.g. Spatial Scale at Macro level), we can use different levels of another dimension (e.g. Temporal Scale at Macro, Meso and Micro level). For example, the spread of disease can be simulated in a large area such as the province of Quebec using multiple time scales representing different simulation steps (months, weeks and days). Besides, in addition to the choice of the levels of granularity, we have to choose models that are used to represent these different levels. Indeed, different models can provide different levels of abstraction. For example, the spread of a zoonosis can be simulated using a model showing only the propagation flows of the infection. This spread can also be simulated using a model providing more details and thus can generate simulations at a finer level of abstraction (Figure 5).
4.2. Multi-Actor Spatio-Temporal Interaction Model (MASTIM)
In Figure 6, we present a new theoretical model (called MASTIM: Multi-Actor Spatio-Temporal Interaction Model) to simulate the interactions of various types of actors, including those representing populations containing a large number of individuals. The large number of individuals in some populations involved in the spread of infectious diseases is a major modelling problem. Existing approaches such as traditional agent-based systems are not able to simulate this kind of population. Given the limited computational resources of computers and the lack of data, we cannot represent each individual by an agent, especially if we have to simulate a population composed of millions or even billions of individuals. This is the case of the mosquito populations involved in the transmission of the WNV. In this context, we propose our MASTIM model, which can be used to simulate huge populations.
We use a qualitative classification to distinguish different types of populations according to two characteristics: the quantity and the mobility of their individuals. We first consider populations with a large number of individuals, for which it is often unnecessary and generally impossible to represent individuals or even groups of individuals. We propose to model this kind of population by associating it with what we call an "occupied area": the population is described by the density of individuals located in this area. For example, mosquitoes do not travel much and they are present almost everywhere in the territory. Hence, mosquitoes can be considered as a feature of the simulation environment. Among the populations containing a large number of individuals, we distinguish two types: slow moving populations with a large number of individuals (SMP-LNI), such as mosquitoes, and fast moving populations with a large number of individuals (FMP-LNI), such as locusts. We propose to model an SMP-LNI by associating it with what we call a "static occupied area", and an FMP-LNI by associating it with what we call a "dynamic occupied area". A dynamic occupied area can be modelled in the same way as the spread of a gas cloud.
We also consider populations with a small number of individuals, which can be modelled by decomposing them into groups of individuals or even into individuals, depending on the desired level of granularity. Here again we distinguish two types: fast moving populations with a small number of individuals (FMP-SNI), such as deer (fast moving relative to ticks), and slow moving populations with a small number of individuals (SMP-SNI), such as rodents. We propose to model an FMP-SNI by associating it with what we call a "deployment area" in order to represent the mobility of the individuals or groups belonging to such a population. Moreover, we propose to model an SMP-SNI by associating it with either a deployment area or a static occupied area, depending on the desired level of granularity.
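The classification above can be summarized as a small lookup from the two qualitative characteristics (quantity and mobility) to the spatial representation proposed for each population type. The following Python sketch is only an illustration: the names `Representation` and `candidate_representations` are assumptions introduced here and are not part of MASTIM or of the MAGS platform.

```python
from enum import Enum
from typing import List


class Representation(Enum):
    """Spatial representations named in the text."""
    STATIC_OCCUPIED_AREA = "static occupied area"
    DYNAMIC_OCCUPIED_AREA = "dynamic occupied area"
    DEPLOYMENT_AREA = "deployment area"


def candidate_representations(large_number: bool, fast_moving: bool) -> List[Representation]:
    """Return the representation(s) proposed in the text for one population type.

    Hypothetical helper: the two boolean flags stand for the qualitative
    judgements (quantity, mobility) made by the modeller.
    """
    if large_number:
        if fast_moving:
            # FMP-LNI, e.g. locusts
            return [Representation.DYNAMIC_OCCUPIED_AREA]
        # SMP-LNI, e.g. mosquitoes
        return [Representation.STATIC_OCCUPIED_AREA]
    if fast_moving:
        # FMP-SNI, e.g. deer
        return [Representation.DEPLOYMENT_AREA]
    # SMP-SNI, e.g. rodents: the choice depends on the desired granularity
    return [Representation.DEPLOYMENT_AREA, Representation.STATIC_OCCUPIED_AREA]


if __name__ == "__main__":
    for large, fast in [(True, False), (True, True), (False, True), (False, False)]:
        names = [r.value for r in candidate_representations(large, fast)]
        print(f"large={large}, fast={fast}: {names}")
```

For the SMP-SNI case the function returns both options, since the text leaves the choice to the desired level of granularity.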
The different possibilities of interactions between populations, groups and individuals are illustrated in Figure 6. Interactions are modelled by combining elementary interactions. Depending on the desired level of granularity, the different categories of actors (individuals, groups, populations) can be represented by agents. We will then have n agents in the simulation. Each of these agents may be either a source or a target of an elementary interaction. Hence, an interaction occurs between a source entity and a target entity. An entity can be an agent or a group of agents. It may also be a particle representing a group of individuals, or a group of particles. In addition, an elementary interaction can modify the source entity, the target entity or both. The action which represents the interaction between the source and the target can be carried out through what we call a "vector of interactions" (Figure 6), which transfers the effect of the action from the source entity to the target entity. This transmission can be done in a discrete manner (e.g. a pathogen vector) or in a continuous manner (e.g. energy flow). The MASTIM model is an original contribution of our work, since classical models of interactions, such as the influences and reaction model of Ferber & Müller (1996), are only able to model interactions between agents. Such models cannot be used to simulate large populations.
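To make the notion of an elementary interaction more concrete, the following Python sketch represents a source entity, a target entity and a "vector of interactions" that carries an effect from one to the other. It is a minimal illustration under stated assumptions: the classes `Entity` and `ElementaryInteraction`, the `bite_vector` function and the coefficient 0.1 are hypothetical, and for simplicity the sketch modifies only the target, whereas the model also allows the source, or both entities, to be modified.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Entity:
    """An agent, a group of agents, a particle or a group of particles."""
    name: str
    size: float      # number of individuals represented by the entity
    infected: float  # number of infected individuals among them


@dataclass
class ElementaryInteraction:
    source: Entity
    target: Entity
    # The "vector of interactions": computes the effect transferred
    # from the source entity to the target entity.
    vector: Callable[[Entity, Entity], float]

    def apply(self) -> float:
        """Transfer the effect to the target and return the amount applied."""
        effect = self.vector(self.source, self.target)
        # Never infect more individuals than are still healthy.
        effect = min(effect, self.target.size - self.target.infected)
        self.target.infected += effect
        return effect


def bite_vector(source: Entity, target: Entity) -> float:
    """Hypothetical discrete transmission: new infections grow with the
    infected fraction of the source and the healthy part of the target."""
    healthy = target.size - target.infected
    return 0.1 * (source.infected / source.size) * healthy


if __name__ == "__main__":
    culex = Entity("Culex in one cell", size=1000.0, infected=50.0)
    crows = Entity("group of crows", size=200.0, infected=0.0)
    new_cases = ElementaryInteraction(culex, crows, bite_vector).apply()
    print(f"newly infected crows: {new_cases:.2f}")
```

More complex interactions would be obtained by combining several such elementary interactions, for example one from mosquitoes to birds and one from birds to mosquitoes.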
5. A Method and Tool for the Geosimulation of Large Populations
Using the MAGS platform, we developed the WNV-MAGS system to simulate the interactions between large populations of mosquitoes and birds which are involved in the spread of the WNV. To do so, we applied an 'Agile' (Ambler, 2002) analysis and design method which favours the collaboration with domain specialists and users, as well as quick adaptations of the software under development. But before developing WNV-MAGS, we had to collect data from various heterogeneous sources in order to create plausible populations of mosquitoes and birds. Thus, we present in this section how we collected such data. Then, we briefly present the mathematical model which is used by WNV-MAGS. We also present the results of the geosimulation of the WNV propagation and the calibration of the system.
5.1. Collecting Information and Data Preparation
We applied classical knowledge engineering techniques (Plant et al., 2002) in order to acquire domain knowledge from the specialized literature and, over many work sessions, from domain experts (entomologists and ornithologists). We then went through an exploration phase, collecting all available information in order to understand the phenomena related to the spread of WNV. However, given the enormous complexity involved in representing such phenomena and the lack of detailed data, we had to make a number of reasonable simplifying hypotheses with regard to the species of interest, the factors influencing the evolution of the populations, the geographical region selected for the analysis, the period of simulation and the space-time scale. Then, we designed a conceptual model representing a synthetic view of the phenomena of interest while taking into account the above-mentioned simplifying hypotheses. For example, we considered only Culex (pipiens/restuans) and crows as the two main populations of mosquitoes and corvid birds involved in the transmission of the WNV. Another useful simplification concerned the displacements of crows: we only considered the period of the year when crows regroup in roosts in order to spend the night in large gatherings (Caccamise et al., 1997). This social behaviour takes place during the July to September season, when the mosquitoes are most active, most numerous and most likely to transmit the WNV. Based on our conceptual model, we designed the system architecture, which helped us to implement the WNV-MAGS tool.
In order to create the virtual geographic environment representing the area of interest, we used GIS data to generate the various spatial data layers needed by the system. We used the Geomedia GIS software to handle the geo-referenced data of DMTI Spatial (CanMap Streetfiles), the digital maps of the Institut national de santé publique du Québec (INSPQ), and the census shapefiles. Using this data, we created the bitmap from which the MAGS platform generates the simulation environment. This bitmap contains polygons representing either municipalities or census tracts, depending on the area of interest: municipalities are used to cover large areas such as the southern part of the province of Quebec, whereas census tracts are used to characterize smaller areas such as the Ottawa metropolitan area. In addition, we had to pre-process all the data needed to create the two populations (Culex and crows) involved in the WNV spread. We first estimated the initial number of individuals of each population at the beginning of the simulation, which starts at the end of June (Figure 7). For the Culex population, we estimated the number of adults that emerge from the larvae laid in sumps (which we assumed to be the main reservoirs of mosquitoes in urban and suburban areas). To this end, we developed a Visual Basic application to query the geo-referenced databases in Geomedia and to compute the total length of roads for each polygon (municipality or census tract). We then computed the number of sumps in each polygon from the total length of roads. For the population of crows, we used the SAS statistical software and the MapInfo GIS to compute a specific density of birds per region (number of individuals per square kilometer).
This was done by averaging the sightings reported by professional and amateur ornithologists, using the ÉPOQ database (Étude des populations d'oiseaux du Québec: www.oiseauxqc.org/epoq.jsp) for the southern part of the province of Quebec and the eBird database (www.ebird.org) for the Ottawa metropolitan area. After the data preparation, we implemented the WNV-MAGS system using the MAGS platform, which is developed in C++.
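The arithmetic behind the initial population estimates can be sketched as follows. This Python fragment is only an illustration of the chain described above (road length, then number of sumps, then emerging adults, and a bird density multiplied by an area). The function names and all numeric values are hypothetical: the text does not give the number of sumps per kilometer of road, the proportion of sumps containing larvae or the number of adults emerging per sump.

```python
def initial_culex_adults(road_length_km: float,
                         sumps_per_km: float,
                         fraction_with_larvae: float,
                         emerged_per_sump: float) -> float:
    """Estimate the adult Culex emerging in one polygon.

    All four inputs are assumptions of this sketch; in the study, road
    lengths came from the GIS and the other quantities were simulation
    settings tuned during calibration.
    """
    sumps = road_length_km * sumps_per_km
    return sumps * fraction_with_larvae * emerged_per_sump


def initial_crows(density_per_km2: float, area_km2: float) -> float:
    """Estimate the number of crows in one region from a sighting-based density."""
    return density_per_km2 * area_km2


if __name__ == "__main__":
    # Hypothetical polygon: 120 km of roads, 20 sumps per km,
    # half of the sumps containing larvae, 100 adults emerging per sump.
    print(initial_culex_adults(120.0, 20.0, 0.5, 100.0))
    # Hypothetical region: 4 crows per square kilometer over 250 square kilometers.
    print(initial_crows(4.0, 250.0))
```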
5.2. Using a Compartment Mathematical Model
We used a compartment mathematical model (Wonham et al., 2004) in order to compute the dynamics of the two populations. This model is based on 8 differential equations which compute the evolution over time of the different types of individuals: susceptible, infected, recovered and dead birds, the larvae of mosquitoes, and the susceptible, exposed and infected adult mosquitoes. We proposed some modifications in order to correct some discrepancies that we found in the model. We also included climate effects in the model. This was a difficult task because, once the effects of temperature variations are considered, the model is no longer in equilibrium. Hence, we had to modify the differential equations (Noël, 2007). The adjusted model gives qualitatively satisfactory results (e.g. for the distribution of the mosquito generations): according to domain experts, the shape of the resulting curves reflects the biological behaviours of the studied species. However, the quantitative results provided by our initial simulations of the evolution of the mosquito and crow populations (e.g. the number of larvae, eggs, emerged Culex, dead crows, etc.) were not completely satisfactory. We corrected this problem by calibrating the system (see Section 5.4).
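To illustrate what such a compartment model looks like in practice, the following Python sketch integrates eight coupled compartments (larvae; susceptible, exposed and infected mosquitoes; susceptible, infected, recovered and dead birds) with a forward Euler scheme. It is a generic sketch under stated assumptions: the equations are a simplified form chosen for illustration, not the equations of Wonham et al. (2004) nor the adjusted equations used in WNV-MAGS, there is no climate effect, and every rate in `Params` is a hypothetical value.

```python
from dataclasses import dataclass


@dataclass
class State:
    larvae: float
    m_susceptible: float
    m_exposed: float
    m_infected: float
    b_susceptible: float
    b_infected: float
    b_recovered: float
    b_dead: float


@dataclass
class Params:
    # Hypothetical daily rates, chosen only so that the sketch runs.
    birth: float = 0.5               # larvae produced per adult mosquito
    maturation: float = 0.07         # larvae becoming adults
    larval_death: float = 0.2
    adult_death: float = 0.03
    incubation: float = 0.1          # exposed mosquitoes becoming infectious
    bite_rate: float = 0.09          # bites per mosquito
    p_bird_to_mosquito: float = 0.16
    p_mosquito_to_bird: float = 0.88
    bird_recovery: float = 0.05
    bird_death: float = 0.14         # disease-induced death of infected birds


def euler_step(s: State, p: Params, dt: float) -> State:
    """Advance all eight compartments by one forward Euler step of length dt."""
    adults = s.m_susceptible + s.m_exposed + s.m_infected
    live_birds = s.b_susceptible + s.b_infected + s.b_recovered
    if live_birds > 0:
        # New infections per day caused by bites, in each direction.
        mosq_inf = (p.bite_rate * p.p_bird_to_mosquito
                    * s.m_susceptible * s.b_infected / live_birds)
        bird_inf = (p.bite_rate * p.p_mosquito_to_bird
                    * s.m_infected * s.b_susceptible / live_birds)
    else:
        mosq_inf = bird_inf = 0.0
    return State(
        larvae=s.larvae + dt * (p.birth * adults
                                - (p.maturation + p.larval_death) * s.larvae),
        m_susceptible=s.m_susceptible + dt * (p.maturation * s.larvae - mosq_inf
                                              - p.adult_death * s.m_susceptible),
        m_exposed=s.m_exposed + dt * (mosq_inf
                                      - (p.incubation + p.adult_death) * s.m_exposed),
        m_infected=s.m_infected + dt * (p.incubation * s.m_exposed
                                        - p.adult_death * s.m_infected),
        b_susceptible=s.b_susceptible - dt * bird_inf,
        b_infected=s.b_infected + dt * (bird_inf
                                        - (p.bird_recovery + p.bird_death) * s.b_infected),
        b_recovered=s.b_recovered + dt * p.bird_recovery * s.b_infected,
        b_dead=s.b_dead + dt * p.bird_death * s.b_infected,
    )


def simulate(s: State, p: Params, days: float, dt: float = 0.1) -> State:
    """Integrate over the given number of days with a fixed step dt (in days)."""
    for _ in range(round(days / dt)):
        s = euler_step(s, p, dt)
    return s


if __name__ == "__main__":
    start = State(1000.0, 990.0, 0.0, 10.0, 1000.0, 0.0, 0.0, 0.0)
    print(simulate(start, Params(), days=10))
```

Forward Euler is used here only to keep the sketch short; it needs a small step to remain stable, and an adaptive solver such as SciPy's `solve_ivp` would be a more robust choice for longer simulations.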
5.3. Geosimulation of Large Populations
We had to model the two populations (Culex and crows) involved in the transmission of the WNV as well as their interactions in the virtual geographic environment (VGE). Indeed, the population of Culex represents an extremely large number of individuals and cannot be represented using individual agents (this kind of population corresponds to SMP-LNI in the MASTIM model: Section 4.2). In fact, we decided to model the mosquito population using what we called an "intelligent density map" which is characterized by population data being attached to reference areas (static occupied areas in the MASTIM model) in the VGE. This intelligent density map is a kind of cellular automaton in which a tessellation of irregular cells (municipalities or census tracts obtained from GIS data) is associated with rules that enable the system to simulate the evolution of the different categories or compartments of mosquitoes (adults, larvae, healthy, infected, etc.) using the compartment mathematical model. When considering the population of crows (this kind of population is a FMP-SNI in the MASTIM model), we used agents to model groups of crows associated with a spatial base location (which represents a deployment area in the MASTIM model) corresponding to roosts that have been observed in the field.
In our model, a roost is considered as the spatial base location of a group of crows. Such groups, which may be large (several thousand individuals), gather in the roost during the summer season. During the day, crows disperse around the roost in search of food, returning at night. Hence, the spatial phenomenon of gathering and dispersion of this sub-population of crows can be represented in a synthetic way as an expansion and a contraction of the area occupied by the sub-population, varying with roost size (we can thus model the variable density of crows in this dynamically changing area). In WNV-MAGS, each roost agent is implemented as a particle system which simulates the way crows spread around a roost during the day. A particle represents one or several crows, depending on the number of individuals attached to the roost. Each particle has its own characteristics (velocity, movement direction) that enable it to travel some distance from the roost location during the number of simulation steps representing a day. Hence, the set of particles associated with a given roost covers a circular area with a maximal radius set by the operating range parameter. The interactions of the mosquito and crow populations are also handled by the geosimulation, which enables the system to automatically determine the places and times where groups of crows (belonging to roosts) cross areas in which the Culex sub-populations are located. The system can therefore estimate the number of newly infected individuals, based on the likelihood that a number of individual crows are bitten by Culex and infected with WNV. To this end, certain equations of the compartment model are applied at each simulation step, for each particle crossing a cell of the intelligent density map where mosquitoes are located.
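The daily expansion and contraction of a roost can be sketched with a very small particle system. The following Python code is only an illustration of the idea, not the WNV-MAGS implementation (which is written in C++ on the MAGS platform): the names `Particle`, `make_roost_particles` and `position`, the straight-line outward and return movement, and all numeric values are assumptions. Infection of particles crossing mosquito cells is deliberately left out.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Particle:
    direction: float  # movement direction, in radians
    speed: float      # distance travelled per simulation step, in km
    crows: int        # number of crows represented by this particle


def make_roost_particles(n_crows: int, n_particles: int,
                         max_speed: float, rng: random.Random) -> List[Particle]:
    """Split the crows of one roost among particles with random characteristics."""
    base, extra = divmod(n_crows, n_particles)
    return [
        Particle(direction=rng.uniform(0.0, 2.0 * math.pi),
                 speed=rng.uniform(0.1, 1.0) * max_speed,
                 crows=base + (1 if i < extra else 0))
        for i in range(n_particles)
    ]


def position(roost: Tuple[float, float], p: Particle, step: int,
             steps_per_day: int, operating_range: float) -> Tuple[float, float]:
    """Position of a particle at a given step.

    Assumption of this sketch: the particle flies outward during the first
    half of the day and comes back during the second half, never going
    farther from the roost than the operating range.
    """
    phase = step % steps_per_day
    outward_steps = min(phase, steps_per_day - phase)
    distance = min(p.speed * outward_steps, operating_range)
    return (roost[0] + distance * math.cos(p.direction),
            roost[1] + distance * math.sin(p.direction))


if __name__ == "__main__":
    rng = random.Random(42)
    particles = make_roost_particles(3000, 50, max_speed=2.0, rng=rng)
    for step in (0, 6, 12, 18, 24):
        farthest = max(
            math.hypot(x - 10.0, y - 10.0)
            for x, y in (position((10.0, 10.0), p, step, 24, 15.0) for p in particles)
        )
        print(f"step {step:2d}: farthest particle at {farthest:.1f} km")
```

At each step, the cell of the density map containing a particle's position could then be looked up in order to apply the relevant equations of the compartment model.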
Moreover, the user can visualize the extent of the spread of WNV on the map of the study area in different ways. The system can either change the color of the particles representing the infected crows or the color of the polygon representing a municipality or a census tract containing a high density of infected Culex (Figure 8). Besides, the WNV-MAGS System offers a variety of functionalities to the user in order to modify the parameters of the mathematical model, to visualize the progress of the infection in and around the crows' roosts, to extract data from the simulation and to generate graphs showing the evolution of the involved populations.
5.4. Applying Different Scenarios and Calibration of the Simulations
In our system, the multi-agent geosimulation is at the heart of a decision support tool. Hence, our approach is somewhat different from more traditional simulations used for prediction purposes (Benenson & Torrens, 2004). Since WNV is particularly sensitive to environmental changes (El Adlouni et al., 2007), our tool allows a user to explore various climate scenarios (temperature and precipitation variations) in addition to public health intervention scenarios (larvicide treatments). The assessment and comparison of different simulation scenarios can help decision makers to make more informed decisions.
Currently, a user may choose one of five different scenarios which influence the dynamics of the Culex population. The first scenario is the default scenario, which uses average conditions of temperature and precipitation (in this case the Canadian Climate Normals). The second scenario allows the user to choose a date on which abundant rains flush the sumps in some municipalities or census tracts. Sumps offer ideal locations for the maturation of larvae and the emergence of adult mosquitoes. They are also the main targets where public health authorities may request specialized private companies to spray larvicides. Abundant rains may flush sumps, killing a large proportion of larvae. In the same way, the third scenario is used to simulate the application of larvicides in certain areas (municipalities or census tracts). The fourth scenario is a combination of the second and third scenarios: it is possible to choose a date for the flushing of sumps and another date for the application of larvicides. Most larvae are assumed to die after the flushing of a sump, although the dynamics of the larval population then starts all over again, since there are always Culex adults in the vicinity of the sump that will lay new eggs. The last scenario allows multiple applications of larvicides (Figure 9).
We calibrated the models using monitoring data (captures of Culex, collection of dead crows and application of larvicides on the ground) provided by various public health agencies for the southern part of the province of Quebec and the Ottawa metropolitan area, comparing simulation results with field observations. For example, we evaluated the ratio between the real populations of Culex and the samples of Culex captured in traps (a captured mosquito was considered to represent a population of 300 Culex over one km2; Reisen et al., 1991; Reisen et al., 1992), as well as the ratio between crows and the collected dead crows. For the southern part of the province of Quebec, we chose some key municipalities where human infections had occurred. It appeared that there was a significant difference between the data generated by the model and those obtained from the field. Hence, we tuned the initial settings of the simulation (e.g. the initial percentage of infected Culex or infected crows, the number of emerged Culex per sump, the percentage of sumps containing larvae, etc.) as well as some parameters of the mathematical model (e.g. the per capita biting rate of mosquitoes on crows, the WNV transmission probabilities from Culex to crows or from crows to Culex, etc.). This approximate sensitivity analysis was done by changing only some parameters in order to observe their effects on the results of the simulation. We were careful to leave certain parameters unchanged, especially those which have an impact on the biological cycle of the studied species (e.g. birth rate, maturation rate, mortality rate, etc.).
Varying the parameters that can be changed helped us to quantitatively calibrate the model for the processed municipalities. The parameters adjusted for calibration provide reasonable results (Bouden et al., 2008). For the Ottawa metropolitan area, since we lacked data, we chose only one census tract (number: 5050011.04) in order to validate the model using the parameters already adjusted for the southern part of the province of Quebec. This tract contains the largest roost of crows (situated in the General Hospital Campus) and four trap stations of Culex. The modelling results again show a good fit with the observed data (Figure 10).
We used the parameters calibrated for the southern part of the province of Quebec to simulate the propagation of WNV in the Ottawa metropolitan area because the two regions are similar in terms of ecology and climate. Moreover, the simulations carried out at two different spatial scales highlighted similar problems related to the calibration process. Whatever the chosen scale, we were not able to calibrate the entire geographic area of interest. We had to select some key municipalities for the large-scale simulation (southern part of the province of Quebec) and only one census tract for the small-scale simulation (Ottawa metropolitan area). These limits of the calibration process are explained by the lack of data. This led us to propose some recommendations in Section 7.
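The five scenarios can be viewed as different schedules of events applied to the larval population of a sump area. The following Python sketch illustrates this reading under strong assumptions: the larval dynamics are reduced to a constant daily growth factor (in WNV-MAGS they come from the compartment model), and the kill fractions `FLUSH_KILL` and `LARVICIDE_KILL`, the function names and the chosen days are all hypothetical.

```python
from typing import Dict, Iterable, List

FLUSH_KILL = 0.9      # hypothetical fraction of larvae killed by a flush
LARVICIDE_KILL = 0.8  # hypothetical fraction killed by one larvicide application


def build_events(flush_days: Iterable[int] = (),
                 larvicide_days: Iterable[int] = ()) -> Dict[int, float]:
    """Map each event day to the fraction of larvae surviving that day."""
    survival: Dict[int, float] = {}
    for day in flush_days:
        survival[day] = survival.get(day, 1.0) * (1.0 - FLUSH_KILL)
    for day in larvicide_days:
        survival[day] = survival.get(day, 1.0) * (1.0 - LARVICIDE_KILL)
    return survival


def simulate_larvae(initial: float, daily_growth: float, n_days: int,
                    events: Dict[int, float]) -> List[float]:
    """Return the larval population at the end of each simulated day."""
    larvae = initial
    series = []
    for day in range(n_days):
        # Adults in the vicinity keep laying eggs, so the population
        # starts growing again after each flush or treatment.
        larvae *= 1.0 + daily_growth
        larvae *= events.get(day, 1.0)
        series.append(larvae)
    return series


# The five scenarios described in the text, with hypothetical dates.
SCENARIOS = {
    "default": build_events(),
    "flush": build_events(flush_days=[10]),
    "larvicide": build_events(larvicide_days=[15]),
    "flush and larvicide": build_events(flush_days=[10], larvicide_days=[15]),
    "multiple larvicides": build_events(larvicide_days=[10, 20, 30]),
}

if __name__ == "__main__":
    for name, events in SCENARIOS.items():
        final = simulate_larvae(1000.0, 0.05, 40, events)[-1]
        print(f"{name:20s} larvae after 40 days: {final:10.1f}")
```

Comparing the final values across scenarios gives the kind of side-by-side assessment that the tool offers to decision makers, although the figures produced by this sketch have no epidemiological meaning.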
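The comparison between trap captures and simulated populations can be sketched as follows. The factor of 300 Culex per km2 for each captured mosquito comes from the text; the function names, the reading of this factor as a density and the example values are assumptions of this sketch.

```python
CULEX_PER_CAPTURE_PER_KM2 = 300  # one captured mosquito stands for 300 Culex over one km2


def estimated_density(captured: int) -> float:
    """Convert a trap count into an estimated Culex density per km2."""
    return captured * CULEX_PER_CAPTURE_PER_KM2


def relative_error(simulated: float, observed: float) -> float:
    """Signed relative difference between a simulated and an observed value."""
    return (simulated - observed) / observed


if __name__ == "__main__":
    observed = estimated_density(12)  # hypothetical trap count
    simulated = 4500.0                # hypothetical simulated density per km2
    print(f"observed: {observed:.0f}, simulated: {simulated:.0f}, "
          f"relative error: {relative_error(simulated, observed):+.0%}")
```

During calibration, such an error measure would be recomputed after each change of the tunable settings until the simulated values come reasonably close to the field observations.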
6. Reengineering and Adapting the System to Other Zoonoses
Our MAGS approach and geosimulation tool can be used not only to simulate the propagation of the WNV, but can also be adapted to various other vector-borne diseases. We are currently exploring avenues to produce a generic solution which can be applied to other zoonoses such as Lyme disease. To this end, we are reengineering our tool in order to produce more realistic simulations at different levels of granularity.
We have completed the conceptual architecture of our new system (called Zoonosis-MAGS), which allows for multi-level geosimulations of different types of infectious diseases. This architecture is based on the IPSO (Input/Process/Store/Output) modelling method (Moulin, 1985), which can represent all the needed system components and their relationships. While constructing this architecture, we identified all the processes to be developed (represented as green rectangles with regular contour and numbered as Pi in Figure 11) and all the data stores (represented as blue rectangles with oval contour and numbered as Ai in Figure 11) that gather data and feed the system processes. Moreover, the IPSO method is based on a refinement approach which consists in first representing the overall process of the system to be developed, and then detailing each of the sub-processes belonging to this overall process. This hierarchical decomposition allows us to progressively detail the system in order to reach the required precision, with the possibility of a feedback refinement.
In Figure 11, we present the overall process of the architecture of our new tool (represented as a large rectangle with thick lines). Indeed, most of the necessary data are obtained from external databases (represented as cylinders at the bottom of Figure 11) such as the ÉPOQ and the weather databases. The other sources of data are the GIS and the monitoring system (represented as rectangles with a shadow). On the other hand, the sub-processes P7 to P11 deal with data preparation, including the extraction of data from all the required databases. These sub-processes produce the data stores A06 to A09 which feed the internal database of the Zoonosis-MAGS system. This database is used by the sub-processes P2 to P4 in order to generate additional data stores.
For example, the sub-process P2 uses environmental data produced using the GIS (A09), weather data including temperature and precipitation (A08), zoonosis data such as the parameters of the selected mathematical model (A03) and data characterizing the actors involved in the zoonosis propagation and their behaviours (A02). P2 produces the data store A01, which contains the different scenarios that the user wants to apply to the geosimulation. The data stores A01 to A04 are the input of the most important sub-process of our system: the sub-process P1, whose main task is to geosimulate the zoonosis propagation using our new approach. For example, this sub-process contains the geosimulation engine which uses, among other things, our MASTIM model. The sub-process P1 produces results (A05) which are analyzed by the sub-process P5. The analyzed data (A010) may be used by the sub-process P6 to calibrate the different models. This sub-process also uses the data produced by the monitoring system (A011) and the data contained in the Zoonosis-MAGS database (A06 to A09).
We are currently developing our new system. We follow an iterative analysis and development approach which is in line with recent methods of complex system engineering (Kuras, 2007).
7. Conclusion and Recommendations
The solution that we propose (using a multi-level geosimulation approach) can bring significant contributions to the advancement of knowledge, especially for risk management. Our strategy to manage the risk of an infection outbreak triggered by a virus or bacteria is to help health policy makers to better understand a complex phenomenon such as the spread of a zoonosis, and therefore to make informed decisions. These decisions can initiate preventive actions at the right time and places in order to avoid or reduce the negative effect of the risk, which in our case is the propagation of an infectious disease. However, the lack of data to feed the simulation models and the quality of the data available to calibrate these models are among the main limits of the simulation of a complex system such as the spread of an infectious disease. It is sometimes impossible to find such data in the literature or even to get it from experts. As a result, making assumptions is unavoidable in order to address the problem of missing data, and some of these assumptions may reduce the realism of the simulation. To solve this problem, it is important that additional field studies be carried out by domain experts (for example entomologists and ornithologists). Moreover, the data needed to calibrate simulation models and obtained from monitoring systems have some bias. For example, the collection or analysis of data regarding infected animals is not carried out on a regular basis and usually covers only a small subset of geographic areas. Sampling is often inconsistent in time and space. Therefore, the very high variability of the data collected by monitoring systems often results in a lack of adequate data to feed the simulations. The problem lies in the fact that designers of monitoring systems do not always consider how the collected data will later be used and interpreted.
To solve this problem, it is important to enhance monitoring systems and the way field analyses are carried out. For simulation purposes, we suggest focusing on certain areas (e.g. some municipalities, some census tracts) instead of trying to monitor everything over a large territory. Therefore, rather than getting huge data sets which are for the most part unusable, we would obtain more complete and consistent data sets from samples that are well distributed in time and space. The data collection should also be carried out repeatedly in the same areas in order to ensure better quality and continuity of the data.