Spatial Dynamic Modelling of Deforestation in the Amazon

Cellular automata make up a class of completely discrete dynamical systems, which have became a core subject in the sciences of complexity due to their conceptual simplicity, easiness of implementation for computer simulation, and their ability to exhibit a wide variety of amazingly complex behavior. The feature of simplicity behind complexity of cellular automata has attracted the researchers' attention from a wide range of divergent fields of study of science, which extend from the exact disciplines of mathematical physics up to the social ones, and beyond. Numerous complex systems containing many discrete elements with local interactions have been and are being conveniently modelled as cellular automata. In this book, the versatility of cellular automata as models for a wide diversity of complex systems is underlined through the study of a number of outstanding problems using these innovative techniques for modelling and simulation.


Fig. 1. Location of study area: São Félix do Xingu municipality and surroundings
Although this region has cattle raising as the leading economic activity, ore exploitation (mainly cassiterite and gold) has had an important role in its economy and regional spatial structure since the 1970s. Initially, ore was transported along the Xingu river and its tributaries, and also through small aircrafts (Santana, 2007). From the first half of the 1990s onwards, mining activities entered into decline, leaving behind them a dense network of roads connecting farms in the region, which considerably reduced the importance of the air and fluvial transport (Amaral et al., 2006;.

Input data for the spatial dynamic model of deforestation
A platform for the spatio-temporal modelling of landscape dynamics -Dinamica EGO -was used to generate the simulations of deforestation in the analysed study area. Dinamica EGO consists in a cellular automata environment that embodies transition algorithms operating commonly through a Moore neighbourhood (3 x 3 window) as well as spatial feedback approaches in a stochastic multi-step simulation framework. Dinamica EGO is a free domain platform and was developed by the Centre for Remote Sensing of the Federal University of Minas Gerais -CSR-UFMG 1 (Soares-Filho et al., 2002, 2009Rodrigues et al., 2007). Real data on deforestation from 1997 to 2000 drove the simulation model, helping to assess the total amount of forest conversion into other land cover classes in the study area for the given period of analysis. A set of explaining variables related to deforestation together with internal parameters of Dinamica EGO were jointly combined to generate a simulation for the year 2000. In the following sections, methodological procedures employed at each stage of the modelling process (data acquisition, variables selection, exploratory analysis, calculation of transition probabilities, parameterisation, and accuracy assessment) will be dealt with in a thorough manner.

Deforestation data
The original deforestation map, containing the land cover classes forest, grasslands, rivers, deforested areas until 1997, and deforested areas from 1997 to 2000, was acquired from the digital PRODES (Brazilian Deforestation Assessment Project) database (INPE, 2006) and is shown in Fig. 2. This map and other input data were pre-processed in the software IDRISI Kilimanjaro (Eastman, 2003). This PRODES deforestation map was reclassified, so as to generate the land cover maps for the initial time of simulation − year 1997 (Fig. 3), and for the final time of simulation − year 2000 (Fig. 4). Considering that PRODES methodology does not take into account deforestation over natural grasslands, this work similarly restricted itself to simulate deforestation only over forested areas. Due to generalisation procedures adopted in the PRODES maps, only the Xingu river and its major tributary, Fresco river, are visible, since minor streams were disregarded in face of the spatial resolution adopted in the maps (120 x 120 m).

Explaining variables
The proper choice of a set of explaining variables is critical for the success of a spatial dynamic model. The forested cells suitability to undergo deforestation precisely depends on the relations between such variables and the response variable (deforestation). In this experiment, the analysed variables were selected based upon similar previous studies (Alves, 2002;Laurance et al., 2002;Aguiar et al., 2007;Soares-Filho et al., 2006;Pereira et al., 2007;Brandão Júnior et al., 2007), which report the prevailing variables in deforestation processes. The factors which influence deforestation are manifold. However, the difficulties in data acquisition limit the input data actually employed in the modelling process. Six biophysical variables were evaluated: i) distance to paved roads; ii) distance to non-paved roads; iii) distance to urban centres; iv) distance to rivers; and v) distance to previously deforested areas in 1997. By means of a visual analysis of the original deforestation map in 1997, it was possible to observe that both the distance to roads and to urban centres have a limited influence on deforestation processes. In 1997, the occupation in the study area was already wellestablished, reducing the availability of land for deforestation close to roads and urban settlements. Although the fluvial transport plays a minor role in the region, as previously exposed, it was observed the mushrooming of deforestation patches in the vicinities of the Xingu and Fresco rivers during the analysed period, which although occurred to a reduced www.intechopen.com extent in farther areas, justified the inclusion of the variable 'distance to rivers' in the model. In CA models, two types of variables can be used: i) the static ones, which are kept constant throughout the model run, and ii) the dynamic ones, that suffer changes throughout the successive time steps, which are then continuously updated at each iteration. Both of them were built upon basis of the PRODES reclassified map in the year 1997 (Fig. 3). The static variable corresponds to the map of distance to rivers (Fig. 5), and the map of distance to previously deforested areas in 1997 (Fig. 6) entered the model as a dynamic variable. In order to categorise the grids of distances, i.e. generate optimal discrete ranges of distances, special automatic routines available in the Dinamica EGO were used, which are based on algorithms of lines generalisation (Agterberg & Bonham-Carter., 1990;Goodacre et al., 1993).

Exploratory analysis
The statistical method 'weights of evidence' was employed for the parameterisation of this modelling experiment. Such method is entirely based on Bayes´s theorem, also known as the conditional probability theorem, which assumes the independence of events. In this sense, the eventual existence of spatial dependence (or spatial association) between pairs of explaining variables must be initially verified. For this end, two statistical indices were used: the Cramer´s Coefficient (V) and the Joint Information Uncertainty (U) (Bonham-Carter, 1994). Both of them are based on the ratio of overlapping areas among the different categories (in this case, ranges of distance) belonging to two maps of explaining variables (or evidences). The Cramer´s Coefficient operates with absolute values of area, while the Joint Information Uncertainty deals with relative (percentage) values, and hence, tends to be more robust than the former index, for it avoids the risk of bias represented by absolute area values. Bonham-Carter (1994) reports that values less than 0.5 either for V or U suggest less association rather than more. In this way, the threshold of 0.5 was adopted to decide whether a variable should remain in the model (V or U < 0.5) or be excluded from it (V or U ≥ 0.5). In this work, the Cramer's Coefficient and the Join Information Uncertainty presented low values (V = 0.21 and U = 0.0485), indicating that both variables could be simultaneously employed in the model.

Global transition rates
The global transition rates refer to the total amount of change in the simulation period, regardless of its spatial distribution, i.e. without taking into account spatial peculiarities at the local level, which are those related to biophysical and infrastructural characteristics of each cell in the study area. In this modelling experiment and in other experiments where the initial and final land cover maps are available, the transition rates were calculated by a cross-tabulation operation between the land cover maps from 1997 and 2000, producing as output a transition matrix with land cover change rates observed during this period.

Local transition probabilities
The local transition probabilities, different from the global transition rates, are calculated for each cell considering the natural and anthropic characteristics of the site. For estimating the land cover transition probabilities in each cell, represented by its coordinates x and y, an equation converting the logit formula into a conventional conditional probability was used. The logit corresponds to the natural logarithm of odds, which consists in the ratio of the probability of occurring a given land cover transition to its complementary probability, i.e. the probability of not occurring the transition. This concept derives from the Bayesian weights of evidence method, from which the land cover transition probability can be obtained through algebraic manipulations of the logit formula, as follows (Bonham-Carter, 1994): where P corresponds to the probability of transition in a cell; i corresponds to a notation of cells positioning in the study area, defined in terms of x,y coordinates; α represents a type of land cover transition, e.g. from a class c to a class k, within a total of η transitions (in this particular experiment, there is only one type: from forest to deforested areas); V i 1 corresponds to the first variable observed in cell i, used to explain transition α; V i mα corresponds to the m-th variable observed in cell i, used to explain transition α; O (T i α ) represents the odds of transition T α in the i-th cell, expressed by the ratio of the probability of occurrence of T i α over its complementary probability, i.e., P (T i α )/ P (T i α ); and W + i,v corresponds to the positive weight of evidence for the i-th cell regarding the v-th variable range, defined as: The W + values represent the attraction between a determined landscape transition (in this case, from forest to deforested areas) and a certain variable range. The higher the W + value is, the greater is the probability of a certain transition to take place. On the other hand, negative W + values indicate lower probability of a determined transition in the presence of the respective variable range. Using the W + values concerning the several distances ranges of the static and dynamic variables employed in the analysis of deforestation, the Dinamica EGO model calculates the cells transition probabilities according to Equation (1). The grid cells are assigned a value of probability and a probability map is then generated. In order to evaluate if the model is well calibrated, i.e. if the employed explaining variables are appropriate and if the categorisation of the numerical grids is optimal, this map must present the area with the highest transition probability values as close as possible to the areas that actually underwent deforestation processes.

Defining the dinamica EGO internal parameters
Dinamica EGO incorporates two empirical land cover allocation algorithms (or transition functions) called: expander and patcher. The expander function accounts for the expansion of previous patches of a certain land cover class. The patcher function, on its turn, is designed to generate new independent patches (seed cells), which are not physically connected with previous patches of the same land cover class (Soares-Filho et al., 2002). In summary, the expander function performs transitions from a state i to a state j only in the adjacent vicinities of cells with state j. And the patcher function performs transitions from a state i to a state j only in the adjacent vicinities of cells with state other than j. Based on a visual analysis of the PRODES deforestation map (Fig. 2) during the analysed period (1997)(1998)(1999)(2000), it was not observed the occurrence of deforestation by means of diffusion processes, i.e. amidst the virgin forest, what is emulated by the patcher function. In this way, only the expander function was employed in the model, which simulates the formation of deforestation patches through the expansion of previously deforested areas, as exposed above.
The average size and variance of size (in hectares) of the new deforestation patches are also required as internal parameters by Dinamica EGO. The definition of these parameters is done heuristically. The ideal average size (μ) was set to 300 ha, and the variance (σ 2 ) to 500 ha. The model contains another heuristically determined parameter, which is the so-called 'patch isometry index (PII)'. This index represents a numerical value ranging from 0 to 2, which is multiplied by the probability values of the eight cells belonging to the Moore neighbourhood, before the application of the transition function. A high isometry index results in compact patches, while low values are reflected in more fragmented landscape patterns. In this modelling experiment, an isometry index of 1.5 was adopted, what represents a balance between compactness and fragmentation. This value generates results in compliance with the deforestation pattern observed in the study area, which presents a mixture of geometrically stable (compact) deforestation patches produced by capitalised farmers that use tractors for the forest clear-cut, as well as fragmented patches generated by small farmers, deprived of sophisticated means for the forest removal.

Validation
For assessing the accuracy of the CA simulation model performance, fuzzy similarity measures applied within a neighbourhood context were used. Several validation methods operating on a pixel vicinity basis have been proposed (Costanza, 1989;Pontius, 2002;Hagen, 2002Hagen-Zanker et al. 2005;Hagen-Zanker, 2006), aimed at depicting the spatial patterns similarity between a simulated and a reference map, so as to relax the strictness of the pixel-by-pixel agreement. The fuzzy similarity method employed in this work is a variation of the fuzzy similarity metrics developed by Hagen (2002), and has been implemented in the DINAMICA model by the CSR team. Hagen´s method is based on the concept of fuzziness of location, in which the representation of a cell is influenced by the cell itself and, to a lesser extent, by the cells in its neighbourhood. Not considering fuzziness of category, the fuzzy neighbourhood vector can represent the fuzziness of location. In the fuzzy similarity validation method, a crisp vector is associated to each cell in the map. This vector has as many positions as map categories (land cover classes), assuming 1 for a category = i, and 0 for categories other than i. Thus, the fuzzy neighbourhood vector (V nbhood ) for each cell is given as: www.intechopen.com μ nbhood i = | μ nbhood i,1 * m 1 , μ crisp i,2 * m 2 , …, μ crisp i,N * m N | Max (4) where μ nbhood i represents the membership for category i within a neighbourhood of N cells (usually N=n 2 ); μ crisp ij is the membership of category i for neighbouring cell j, assuming, as in a crisp vector, 1 for i and 0 for categories other than i (i ⊂ C); m j is the distance-based membership of neighbouring cell j, where m accounts for a distance decay function, for instance, an exponential decay (m = 2 -d/2 ), with d being the unitary distance measured in between two cells centroids). The selection of the most appropriate decay function and the size of the window depend on the vagueness of the data and the spatial error tolerance threshold . As it is intended to assess the model spatial fit at different resolutions, besides the exponential decay, a constant function equal to 1 inside the neighbourhood window and to 0 outside can also be applied. Equation (5) sets the category membership for the central cell, assuming the highest contribution is found within a neighbourhood window n x n. Next, a similarity measure for a pair of maps can be obtained through a cell-by-cell fuzzy set intersection between their fuzzy and crisp vectors: where V A and V B refer to the fuzzy neighbourhood vectors for maps A and B, and μ A,i and μ B,i are their neighbourhood memberships for categories i ⊂ C in maps A and B, as in Equation (6). According to , since the similarity measure S (V A ,V B ) tends to overestimate the spatial fit, the two-way similarity is instead applied: The overall similarity of a pair of maps can be calculated by averaging the two-way similarity values for all map cells. However, when comparing a simulated map to the reference map (real land cover in the final time of simulation), this calculation carries out an inertial similarity between them due to their areas that did not suffer any change. To avoid this problem, the CSR team introduced a modification into the overall two-way similarity method of Dinamica EGO, using two maps of differences, which present value 1 for the cells that underwent change, and 0 for those that did not change. In this way, each type of change is analysed separately using pair-wise comparisons involving maps of differences: (i) between the initial land cover map and a simulated one, and (ii) between the same initial land cover map and the reference one. This modification is able to tackle two matters. First, as it deals with only one type of change at a time, the overall two-way similarity measure can be applied to the entire map, regardless of the different number of cells per category. Second, the inherited similitude between the initial and simulated maps can be eliminated from this comparison by simply ignoring the null cells from the overall count. However, there are two ways of performing this function. One consists of counting only two-way similarity values for non-null cells in the first map of difference, and the other consists in doing the opposite. As a result, three measures of overall similarity are obtained, with the third representing the average of the two ways of counting. As random pattern maps tend to score higher due to chance depending on the manner in which the null cells are counted, it is advisable to pay close attention to the minimum overall similarity value. This method has proven to be the most comprehensive when compared to the aforementioned methods, as it yields fitness measures with the highest contrast for a series of synthetic patterns that depart from a perfect fit to a totally random pattern (Soares-Filho et al., 2009). According to what was previously explained in Section 4.3, the calculation of the local transition probabilities, i.e. the probabilities of land cover change for each cell, is based on the values of the positive weights of evidence (W + ). Tables 2 and 3 present the values of W + for each distance range of the dynamic variable 'distance to previously deforested areas in 1997' and to the static variable 'distance to rivers', respectively. Fig. 7a and 7b graphically present the behaviour of the W + values in relation to the successive distance ranges of these two explaining variables. The curve of W + for the variable 'distance to previously deforested areas in 1997' (Fig. 7a) reveals the concentration of weights with the greatest values in the first distance ranges, what demonstrates the predominance of new deforestation patches in the surroundings of pioneer areas (Alves, 2002;Aguiar et al., 2007). Said in other words, the pattern of deforestation expansion in the study area during the analysed period mostly presents patches of large extensions, following a trend to take place in the vicinities of previously deforested areas, also in face of the reported landed property concentration in this region.  Regarding the variable 'distance to rivers' (Fig. 7b), it is possible to observe, however, that the distance ranges closest to rivers present the lowest W + values. This can be explained by the fact that, although deforestation processes to a reduced extent occur nearby rivers, these new deforestation patches account for a very limited share of the total deforested area in this region, what causes a decrease in the weights values referring to such distance ranges. As ranges are located ever further from rivers, their weights gradually start to assume increasing positive values. These farthest ranges actually correspond to the very bordering areas of well-established occupations, which are exactly those prone to experience deforestation processes. In brief, this variable acted as a fine tuning device for the variable 'distance to previously deforested areas in 1997'. As previously stated in Section 4.3, Dinamica EGO generates maps of local probabilities based on the calculated W + values, assigning to each map cell a value of transition probability. Fig. 8 presents the map of local land cover change probabilities (from forest to deforested areas), generated in this experiment. The areas with the highest probability values pretty much correspond to the areas where deforestation indeed occurred, as it can be observed in the PRODES deforestation map (Fig. 2). The simulated deforestation map produced by the model (Fig. 9), considering both the static and the dynamic variables, presented high fuzzy similarity indices for multiple spatial resolutions, i.e. for multiple sampling window sizes (Table 4), in the case of the constant decay, what denotes the good performance of the model.

Positive Weights of Evidence -Distance to Deforested Areas in 1997
Comparing the simulated land cover map with the real land cover map in 2000 (Fig. 4), it is possible to notice that very regularly-shaped patches generated by real deforestation processes undertaken by capitalized farmers could not be reproduced in the simulation. This can be ascribed to the fact that the current transition functions available in Dinamica EGO still cannot cope with extremely rigid requirements regarding the patches geometry. Nevertheless, the spatial pattern of deforestation patches in the simulated map is very similar to the one found in the real scene, indicating the CA simulation model efficacy.
It is worth mentioning that the aim of modelling is not to reproduce reality as close as possible, but solely to detect main spatial patterns and trends of land cover change. Spatial patterns refer to morphological aspects in the patches formation, i.e. whether they are geometrically stable or irregular, whether they are originated by expansion or diffusion processes, and so on, whilst trends concern the directions (or vectors) of deforestation propagation in space. Since the fuzzy similarity index (FSI) is a flexible method for CA model validation, in the sense that it does not operate on a pixel basis but rather on multiple levels of resolution, the values of this index in the cases where the constant decay is adopted tend to be slightly superior when compared to indices based on strict agreement, which are those derived from a direct pixel-by-pixel comparison between the real and the simulated scene. The agreement increases with the size of the sampling window until it reaches a resolution of around 11 x 11 or 13 x 13 pixels, when the similarity index stabilises (Fig. 10), what shows that the FSI is inefficient to assess the model fitness with rough spatial resolutions. It is worth reminding that the use of agreement indices based on multiple spatial resolutions to assess the performance of CA models can be justified by the fact that it is unfeasible not only to reproduce the configurations of changes in the past, but also and mainly to foresee future www.intechopen.com land cover transitions with a very fine spatial accuracy, given the intrinsic randomness of land cover change processes.

Conclusion
This chapter presented an experiment on spatial dynamic modelling based on cellular automata, designed to simulate deforestation in the region of São Félix do Xingu, southeast of Pará State, north of Brazil, during the period from 1997 to 2000. The employed explaining variables (evidences) do not exhaustively represent the set of deforestation drivers in the area, but they correspond however to strategic inducing factors of such type of land cover change.
The dynamic variable 'distance to previously deforested areas in 1997' was decisive for simulating deforestation processes observed in the vicinities of forest clear-cut areas. In face of the historically reported landed property concentration in this region, the deforestation pattern in the analysed period mostly presents patches of large extensions, which are predominantly found in the surroundings of pioneer areas. The static variable 'distance to rivers', on the contrary, revealed the limited importance of the fluvial transport for deforestation and human occupation in this region. Nevertheless, considering that deforestation still takes place nearby rivers, this variable acted as an alternative for the fine tuning of the variable 'distance to previously deforested areas in 1997'. The obtained result demonstrates the suitability of the employed CA model for simulating deforestation processes in the study area for the time span from 1997 to 2000, what was confirmed by the high values of the fuzzy similarity index. The Dinamica EGO platform proved to be appropriate for achieving the goals of this research experiment, also in view of one of its transition allocation algorithms -the expander function -which is able to reproduce the spatial pattern of deforestation expansion in the vicinities of previously occupied areas, as it is the case in São Félix do Xingu. Further advantages of Dinamica EGO concern its flexible structure that accepts different parameterization methods (Almeida et al, 2008) and varied sets of static and dynamic variables, which, when associated in a thoughtful manner, may meet the modelling requirements of the most diverse study areas, with peculiar land cover (or land use) change patterns. These properties allow the transferability of this and other models developed in this platform to other areas in the Amazon (Soares-Filho et al., 2009;Maeda et al., 2010), owning distinct occupation histories, at different consolidation stages, and involving specific local actors, with very particular characteristics. As directions for future research, the authors intend to explore a wider scope of explaining variables, like indicators of the land ownership legal status, local unpaved roads as well as vectors of ongoing occupation fronts, besides operating with further stochastic parameterization methods. The authors as well intend to deal with longer temporal series, so as to use a more in-depth knowledge on the area occupation history, and hence, provide differentiated future simulation scenarios for São Félix do Xingu in the short-and mediumterm. These forecasts would be based on plausible political, socio-economic, and infrastructural arrangements at the local and regional level. Although models in general have continuously been the target of severe criticism, mainly in view of their reductionism and constraints to fully capture the reality inherent complexity (Briassoulis, 2000), it can be argued that they ought to exist, for they offer an incomparable way of abstracting patterns, order and main dynamic trends of real world processes (Batty, 1976).
Actually, urban models should be conceived, handled, applied and interpreted in a wise and critical way, so that modelers, practitioners, public and private decision-makers as well as citizens as a whole could take the best of what they can offer and sensibly acknowledge their limits (Almeida, 2003;Almeida et al., 2005). Spatial dynamic models, of which CA is one of the best representatives, are still the most promising means for rendering land cover change simulations communicable and transparent to politicians, planners, decision-makers as well as to the lay public in general.

Acknowledgement
This work has been accomplished as part of the agenda of the Research Network for Environmental Modelling in the Amazon -GEOMA (http://www.geoma.lncc.br/), established through cooperation agreements among institutions subordinated to the Brazilian Ministry of Science and Technology (MCT). Cellular automata make up a class of completely discrete dynamical systems, which have became a core subject in the sciences of complexity due to their conceptual simplicity, easiness of implementation for computer simulation, and their ability to exhibit a wide variety of amazingly complex behavior. The feature of simplicity behind complexity of cellular automata has attracted the researchers' attention from a wide range of divergent fields of study of science, which extend from the exact disciplines of mathematical physics up to the social ones, and beyond. Numerous complex systems containing many discrete elements with local interactions have been and are being conveniently modelled as cellular automata. In this book, the versatility of cellular automata as models for a wide diversity of complex systems is underlined through the study of a number of outstanding problems using these innovative techniques for modelling and simulation.