Output distribution statistics of the uncertainty analysis of FCUGM’s scenarios using Monte Carlo simulation (MCS).

## Abstract

Although cellular automata (CA) offer a modelling framework and set of techniques for modelling the dynamic processes of urban growth, determining the optimal value of weights or parameters for elements or factors of urban CA models is challenging. This chapter demonstrates the implementation of a calibration module in a fuzzy cellular urban growth model (FCUGM) for optimizing the weights and parameters of an urban CA model using three types of algorithms: (i) genetic algorithm (GA), (ii) parallel simulated annealing (PSA) and (iii) expert knowledge (EK). The GA, followed by EK, produced more accurate and consistent results than PSA. This suggests that the GA was able, to some extent, to capture the urban growth process and the underlying relationships between input factors in a way similar to human experts. It also suggests that the GA and EK largely agree about the efficiency of scenarios in terms of modelling urban growth. In contrast, the results of the PSA do not correspond to those of the GA or EK. This suggests either that the complexity of the urban process is beyond the algorithm’s capability or that the algorithm became trapped in local optima. With this satisfactory calibration of the FCUGM for the urban growth of Riyadh city in Saudi Arabia using CALIB-FCUGM, the calibrated parameters can be passed into the SIM-FCUGM to simulate the spatial patterns of urban growth of Riyadh.

### Keywords

- cellular automata
- urban growth
- calibration
- genetic algorithm
- parallel simulated annealing
- Riyadh

## 1. Introduction

Linear, static, top-down, descriptive and explanatory models cannot adequately help to explain and reflect the essence of urban phenomena. With deeper understanding of urban phenomena, scientists have begun to recognize that cities are not uniform or a single type of phenomenon but more typically hierarchies of complex systems. As complexity theory and its properties have developed over the last three decades based on studies of non-linear systems, fractals, bifurcations, self-organization and chaos theory, cities have gradually become regarded as spatially complex systems [1–3]. A city can be characterized as a non-linear, open, complex, self-organizing and emergent system, which is far from being in equilibrium [1, 4, 5]. Urban growth dynamics are the direct consequence of the actions of individuals, public and private corporations (local agents) acting simultaneously over urban space and time. Therefore, cities are the spatial result over time of all these influences, which continuously contribute to shaping a city (aggregate global form). Cellular automata (CA) offer a modelling framework and set of techniques for modelling the dynamic processes and outcomes of such self-organizing systems [6]. CA techniques provide a way of simulating a self-organization process over geographical space and time [6, 7] and demonstrate significant potential benefits for urban modelling from the late 1980s due to their simplicity, flexibility and transparency [8–17]. However, Wu [18] argued that calibration of urban CA models is challenging when one seeks to determine the optimal value of weights or parameters for elements or factors of a model. If one can find optimal values, the results from running the model are likely to be greatly improved. With this in mind, the authors designed, implemented and evaluated a prototype for calibrating a stochastic, high-dimensional (up to 95) and non-linear urban CA model.

A fuzzy cellular automata model of urban growth was presented in Ref. [19]. Al-Ahmadi et al. presented an urban planning tool for the city of Riyadh, Saudi Arabia, which is one of the world’s major cities undergoing rapid development. At the core of the system is a fuzzy cellular urban growth model (FCUGM), which is capable of simulating and predicting the complexities of urban growth. This model was shown to be capable of replicating the trends and characteristics of an urban environment during three periods: 1987–1997, 1997–2005 and 1987–2005. In another paper [20], the model was used to study and evaluate several different planning scenarios, both baseline ones and scenarios that relate to actual Saudi government policy. The results demonstrated that the model was capable of predicting plausible patterns of future urban growth. The model also has wider implications for use as a spatial planning support tool for urban planners and decision-makers in Saudi Arabia. A description of the application of fuzzy logic in the calibration of the FCUGM was presented in Ref. [21]. Along with calibration, one of the most significant aspects of any model is to verify, validate and assess its performance. The focus of the work published by Al-Ahmadi et al. [22] was on the techniques used to validate the performance of the FCUGM. They presented seven different validation metrics including visual inspection, accuracy and spatial statistics, metrics for spatial pattern and district structure detection as well as spatial multi-resolution validation.

The aim of this chapter is to describe the implementation of a calibration module in the FCUGM for optimizing the parameters for different modes and scenarios of the FCUGM using three types of algorithms: (i) genetic algorithm (GA), (ii) parallel simulated annealing (PSA) and (iii) expert knowledge (EK). These were applied over three periods [urban growth boundary (UGB)] including UGB I (1987–1997), UGB II (1997–2005) and UGB I + II (1987–2005). The FCUGM is a hybrid CA model for research in urban planning and urban growth. It aims to explore and explain the complex spatial patterns of urban growth and to support the spatial urban planning through its two modules namely CALIB-FCUGM, the calibration model, and SIM-FCUGM, the simulation model, which can be used for prediction. Although the FCUGM is based upon fuzziness, it is designed to use stochastically constrained CA models.

## 2. Study area: geographic situation, physical environment and urbanization process

The Kingdom of Saudi Arabia is situated at the furthermost part of south-western Asia and occupies approximately four-fifths of the Arabian Peninsula, covering a total area of 2.25 million km^{2}, of which about 40% is desert, and had a population of 22,673,538 according to the 2004 census. The city of Riyadh is situated on the Najd Plateau in the central region of the Arabian Peninsula and is surrounded to the east by high land ridges and to the west by the convergence of valleys forming Wadi Hanifah and Mount Tuwaiq. Riyadh is one of the fastest growing cities in the Middle East. The annual rate of population growth in Riyadh has reached an average of 8.1% by natural increase and immigration, and according to recent forecasts, the population is expected to increase to 10 million by 2020. In parallel with this dramatic increase in population, the spatial extent of Riyadh has grown from less than 1 km^{2} in 1920 to over 1150 km^{2} in 2004.

The urbanization process of Riyadh during the period between 1750 and 2004 has passed through four main phases of development, namely the pioneer phase, the pre-establishment phase, the establishment phase and the oil-boom and post-oil-boom phase. Broadly, the increase in wealth, the building of the railway, the inauguration of the airport, the transfer of government agencies from Jeddah to Riyadh and the need to build new ministries and hundreds of houses have had a significant impact on the urban growth of Riyadh. This high rate of growth in population and area has not been met with an adequate expansion of services, management capacity and development intervention. As a result, several types of problems have emerged, for example, the spread of slums and squatter settlements, a shortage of services for large parts of the city and a growth in demand for housing accompanied by land and transportation difficulties. An examination of the three main Master Plans of Riyadh indicates that most of the criticisms of the first and second Master Plans were based on the fact that they did not adequately anticipate the scale of urban growth that took place in Riyadh: much of the development occurred beyond the boundaries designated by the plans. This resulted in unexpected urban sprawl. Another weakness of these two Master Plans was that they were formulated on the basis of a moderate economic growth rate. Consequently, they could not have anticipated the economic effect of the oil boom in the 1970s and its adverse effect on the city’s physical growth in terms of density and scale. This suggested the need for a tool to generate different scenarios of urban growth and test the potential physical and environmental impact of each scenario.

Planning authorities, urban planners and decision-makers in Saudi Arabia have recently begun to use spatial analytical and other planning tools to simulate and evaluate the consequences of urban planning policies prior to implementing them. Such tools can help to explore plans, policies and other factors underpinning and influencing processes of urban growth in the recent past, which can in turn lead to a better understanding of the current factors influencing urban growth and ultimately to more reliable predictions. Using such software applications and tools, one can generate and evaluate the consequences of diverse future scenarios for urban growth by answering ‘what if’ type questions.

In this chapter, the term ‘urban growth’ refers to the physical transformation of vacant, desert or agricultural land to urban land by planning and building infrastructure and industrial, residential, retail, educational and other buildings and social and recreational facilities.

## 3. Uncertainty and global sensitivity analysis of FCUGM

Although many studies [8–17] have investigated CA-based models of urban growth, little attention has been paid to examining the uncertainty and errors in urban CA models. It has been hypothesised that urban CA models are influenced by uncertainties generated from various sources, such as the complex interaction between input factors and parameters, the specification and structure of the model and the quality of input data [23]. The structure of CA models is not error-free: like other computer models, they are affected by errors owing to poor or partial human knowledge, the complexity of the process being investigated and the limitations of technology [23, 24]. The impact of neighbourhood size and type on the outcomes of a GIS-CA urban growth model was analysed by Kocabas and Dragicevic [24]. They applied univariate sensitivity analysis to study the variations in model outcomes by changing one parameter at a time while the other parameters were kept constant. They found that the size and type of neighbourhood have a significant influence on CA model output. Such a technique is a form of local sensitivity analysis. It is, however, time-consuming and cumbersome if more than two parameters are allowed to vary simultaneously. It is also deterministic and static, so it cannot mimic the non-linear, stochastic and dynamic features that typically exist in urban models. Error propagation in urban CA simulation was examined by Yeh and Li [23] using Monte Carlo simulation (MCS). When MCS is applied, the spatial variables are perturbed so that the sensitivity of the urban simulation to these perturbations can be assessed in terms of errors in the simulation outcome.

The FCUGM models the spatial pattern of urban growth using three modes: Mode 1, Mode 2 and Mode 3. The three modes differ in the structure of the fuzzy IF-THEN rule because different structures of transition rules might generate different simulation outcomes. The FCUGM can simulate spatial patterns of urban growth under nine scenarios [21]. An uncertainty and global sensitivity analysis (UGSA) was undertaken on all nine scenarios in the three modes of the FCUGM in order to assess the effects of uncertainties in the input variables (independent factors) on the output variable (dependent factor). The advantage of using global rather than local sensitivity analysis is that the former is dynamic and stochastic and apportions the output uncertainty to the uncertainty in all of the input variables: it evaluates the effect of one input variable while all of the others are varied as well. In contrast, the local perturbative approach is based on partial derivatives: the effect of the variation in one input factor is evaluated while all of the others are kept constant at their central values [25]. In the FCUGM, UGSA can provide an initial estimate of the quality of each mode and scenario in terms of understanding the urban growth of Riyadh. Since each mode and scenario has a different specification and structure, UGSA is applied to help identify the most appropriate one.

The MCS technique was selected to undertake the UGSA because it has been applied successfully in a variety of applications, including financial risk and statistical physics [26]. In addition, MCS is one of the most frequently applied techniques for computer simulations and numerical experiments. In terms of urban CA models, Yeh and Li [27] claimed that MCS tended to be the most appropriate technique for investigating error propagation in urban CA simulation, particularly when mathematical models are difficult to define. Moreover, applying MCS has advantages since urban CA models cannot be modelled explicitly using mathematical equations. Although one of the main drawbacks of MCS is the computation time required to generate a large number of samples, recent advances in computer technology have reduced this problem [23]. MCS is relatively simple and straightforward to apply. It is based on generating numerous evaluations (runs) of the model with randomly selected input values. For each trial or run, the input variables are assigned random values drawn from selected input distributions and the value of each output variable is recorded [25]. The results of MCS are, however, only an approximation of the true value [26].
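The sampling loop described above can be illustrated with a minimal sketch. It is not the FCUGM implementation: `toy_model` is a hypothetical stand-in for one model run that returns an error value, and the function name, bounds and seed are assumptions chosen only for illustration.

```python
import random


def monte_carlo_mse(model, bounds, n_trials=5000, seed=0):
    """Evaluate `model` n_trials times with uniformly sampled inputs.

    `bounds` is a list of (lower, upper) pairs, one per input parameter.
    Returns the list of recorded output values (one per trial).
    """
    rng = random.Random(seed)
    outputs = []
    for _ in range(n_trials):
        # Draw every parameter at once, so all inputs vary in each trial.
        params = [rng.uniform(lower, upper) for lower, upper in bounds]
        outputs.append(model(params))
    return outputs


def toy_model(params):
    """Hypothetical stand-in for a model run: error grows away from a target."""
    target = [0.3, 0.7]
    return sum((p - t) ** 2 for p, t in zip(params, target)) / len(params)


mse_samples = monte_carlo_mse(toy_model, bounds=[(0.0, 1.0), (0.0, 1.0)], n_trials=1000)
print(len(mse_samples), min(mse_samples), max(mse_samples))
```

Fixing the seed makes a run repeatable, which helps when comparing scenarios that should see the same random draws.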

| Scenario | Mean | SD | Skewness | Kurtosis | 90% certainty value |
|---|---|---|---|---|---|
| Mode 1—Scenario 1 | 0.277 | 0.142 | 0.079 | 1.987 | 0.523 |
| Mode 1—Scenario 2 | 0.505 | 0.15 | 0.154 | 2.053 | 0.656 |
| Mode 1—Scenario 3 | 0.319 | 0.159 | −0.187 | 1.899 | 0.701 |
| Mode 1—Scenario 4 | 0.458 | 0.189 | 0.343 | 2.859 | 0.507 |
| Mode 2—Scenario 1 | 0.508 | 0.109 | −1.257 | 2.468 | 0.628 |
| Mode 2—Scenario 2 | 0.324 | 0.164 | −0.062 | 1.891 | 0.518 |
| Mode 2—Scenario 3 | 0.458 | 0.189 | 0.343 | 2.859 | 0.674 |
| Mode 2—Scenario 4 | 0.187 | 0.132 | 0.754 | 2.753 | 0.423 |
| Mode 3—Scenario 1 | 0.223 | 0.149 | 0.572 | 2.538 | 0.315 |

The scenarios were chosen to generate and evaluate different urban growth outcomes based on different planning objectives. In the context of the FCUGM, the independent variables are the parameter values of the input variables, while the dependent variable is the output mean square error (MSE) of the scenario. Thus, the UGSA examines the effect of variations in parameter values on the MSE outcome. There are no rules for selecting the ‘best’ number of iterations for performing UGSA, primarily because it is problem-dependent. Sufficient iterations are essential, however, to determine the relevant response distribution statistically. In practice, 1000 to 10,000 trials are usually adequate [26]. The MCS was run 5000 times for each scenario. The uncertainty in the parameters of each input variable was represented by a uniform distribution with lower and upper bounds corresponding to that variable. Each trial was evaluated by calculating the MSE of the differences between the observed and simulated urban maps. Five distribution statistics were computed to assess the output variable (MSE) resulting from the MCS for each scenario: mean, standard deviation (SD), skewness, kurtosis and 90% certainty value (CV), as shown in Table 1. The skewness measures the extent to which the MSE values cluster to one side or the other of the mean. When most of the occurrences cluster towards the left (low-MSE) side of the distribution, the scenario is likely to provide good solutions. The kurtosis measures the sharpness of the distribution: a kurtosis greater than three indicates a high peak of occurrences, while a value less than three indicates a flat top [28]. The CV is the MSE value below which 90% of the outputs (trials) fall. Thus, the lower the CV, the better the scenario. Figure 1A–I shows the occurrences of MSE generated from each scenario, which gives an empirical estimate of the MSE distribution for random combinations of the input parameters.
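As an illustration of the five statistics, the sketch below computes them from a list of MSE values using population moments. The function name, the moment-based formulas and the nearest-rank rule for the 90% certainty value are assumptions; the chapter does not state which estimators were used.

```python
import math
import statistics


def distribution_stats(samples, certainty=0.90):
    """Mean, SD, skewness, kurtosis and certainty value of a sample.

    Uses population moments; kurtosis is the non-excess form, so a
    normal distribution has kurtosis 3.
    """
    n = len(samples)
    mean = statistics.mean(samples)
    sd = statistics.pstdev(samples)
    m3 = sum((x - mean) ** 3 for x in samples) / n
    m4 = sum((x - mean) ** 4 for x in samples) / n
    ordered = sorted(samples)
    # Nearest-rank percentile: smallest value with at least 90% of trials at or below it.
    rank = math.ceil(certainty * n - 1e-9)
    return {
        "mean": mean,
        "sd": sd,
        "skewness": m3 / sd ** 3,
        "kurtosis": m4 / sd ** 4,
        "cv": ordered[max(rank, 1) - 1],
    }


example = [float(v) for v in range(1, 11)]  # illustrative values, not model output
stats = distribution_stats(example)
print(stats)
```

A positive skewness here means the bulk of the values lies to the left of the mean with a longer right tail, which matches the reading of skewness used in the text.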

As illustrated in Table 1, Mode 1—Scenario 4, Mode 2—Scenario 4 and Mode 3—Scenario 1 generated the best performance in their respective modes, with the lowest certainty values of 0.507, 0.423 and 0.315, respectively. This means that 90% of the occurrences (iterations) have an MSE below 0.507, 0.423 and 0.315 for these three scenarios. In addition, as shown in Figure 1, these three scenarios present a similar pattern in which most of the occurrences are clustered towards the left side, with a low MSE output and thus better performance. This is supported quantitatively by their relatively high positive skewness values of 0.343, 0.754 and 0.572 for Mode 1—Scenario 4, Mode 2—Scenario 4 and Mode 3—Scenario 1, respectively. In contrast, Mode 1—Scenario 2, Mode 1—Scenario 3 and Mode 2—Scenario 1 show the highest certainty values of 0.656, 0.701 and 0.628, respectively. This indicates that 90% of the solutions fall below a relatively high MSE (0.628–0.701). Note that the structure of modes is based on the number of fuzzy variables embedded in each fuzzy rule and the structure of scenarios is founded on the number and type of urban growth factors, specifically the transportation support factor (TSF), urban agglomeration and attractiveness factor (UAAF) and topographical constraints factor (TCF) [21]. It can be inferred that the number of urban growth factors in each scenario has a considerable influence on the performance of the scenario. For example, the three scenarios that showed the best performance, namely Mode 1—Scenario 4, Mode 2—Scenario 4 and Mode 3—Scenario 1, are the only scenarios among the total of nine that included all three urban growth factors TSF, UAAF and TCF.

In contrast, the remaining six scenarios embed only one or two of these factors. This suggests that the urban growth process in Riyadh can be modelled more accurately by integrating the three factors into a single scenario, rather than using just one or two of them. In addition, one can deduce that the higher the number of fuzzy variables embedded in each fuzzy rule in the mode, the better the performance of that mode. Mode 3—Scenario 1, for example, embeds three fuzzy variables in each rule and accounts for the lowest certainty value: 90% of the 5000 evaluations produced an MSE of less than 0.315, which indicates that such a mode structure is better than any other. When the number of fuzzy variables in the fuzzy rule decreases, the certainty value increases, for instance, to 0.423 for Mode 2—Scenario 4 (two fuzzy variables) and 0.507 for Mode 1—Scenario 4 (one fuzzy variable). However, the high accuracy produced by Mode 3 came at the cost of a high computation time. As the number of fuzzy variables in the fuzzy rule increases, the simulation time increases approximately exponentially. For example, the average computation time was 4.5, 8 and 19 hours for scenarios in Mode 1, Mode 2 and Mode 3, respectively.

With respect to the scenarios in Mode 1 (except for the best one, Scenario 4), it can be inferred that urban growth in Riyadh is influenced by transportation support (Scenario 1, with a CV of 0.523) more than by socio-economic services (Scenario 2, with a CV of 0.656) or topographical constraint factors (Scenario 3, with a CV of 0.701). With regard to the scenarios in Mode 2 (with the exception of Scenario 4), it can be inferred that the process of urban expansion in Riyadh is affected more by the combination of transportation support and socio-economic services (Scenario 2, with a CV of 0.518) than by the combination of transportation support and topographical constraint factors (Scenario 1, with a CV of 0.628) or of socio-economic services and topographical constraint factors (Scenario 3, with a CV of 0.674).

## 4. Calibration of the FCUGM

The calibration process of the FCUGM is undertaken by a module called the CALIB-FCUGM. This consists of several interlinked sub-models that are processed sequentially either once or several times during a calibration period. The CALIB-FCUGM aims to provide the SIM-FCUGM, the module by which the simulation is executed, with the optimal parameter values or weights of spatial variables to enable realistic generation of urban patterns. The CALIB-FCUGM optimizes parameters by three different algorithms, namely GA, PSA and EK.

### 4.1. Basic process flow of CALIB-FCUGM

The stages of the CALIB-FCUGM are illustrated in Figure 2. The main procedures of the CALIB-FCUGM fall into four stages: (i) Input Variables Weighter, Fuzzy Distance Decay Quantifier, Fuzzy Input Variables Integrator and Fuzzy Input Variables Normalizer (yellow boxes); (ii) Fuzzy model (green boxes); (iii) CA model (blue boxes); and (iv) Optimization Algorithms (grey box). Most boxes in Figure 2 represent a sub-model of the CALIB-FCUGM that takes outputs from the preceding sub-model and feeds inputs to the subsequent one. A dashed box indicates that the sub-model includes parameters that need to be optimized. The calibration process works sequentially. It begins by reading input variables into the Input Variables Weighter, which assigns a weight to each input variable reflecting its importance relative to the other variables. Next, the weighted input variables are passed into the Fuzzy Distance Decay Quantifier to compute the effect of the distance decay of each variable by optimizing the distance decay parameters. These weighted fuzzy variables are then fed into the Fuzzy Input Variables Integrator, which integrates them into three fuzzy driving forces [19, 21]. These in turn are normalized to between 1 and 100. The next stage involves passing these three fuzzy input variables into the fuzzy model and creating a ‘calibrated development suitability map’. After a stochastic disturbance factor is added to the development suitability map, the map is called the ‘calibrated development possibility map’. This calibrated development possibility map is then entered into a conditional statement that decides whether a certain location can be considered ‘urban’ or ‘non-urban’ based on both its development possibility and the calibrated transition threshold. The conditional statement outputs the final ‘urban calibrated map (UCM)’, which is a binary map (1 for urban and 0 for non-urban). Finally, the ‘urban calibrated map’ is read by the Evaluator, which assesses its accuracy by comparing it with the ‘urban observed map (UOM)’ (also a binary map) and computing the error between the two maps as the best net objective value (BNOV). This whole procedure is repeated several times according to the characteristics of each of the three algorithms (GA, PSA and EK). Note that the CALIB-FCUGM module works automatically once a user has entered the input variables into the module. The outcome of the CALIB-FCUGM module is an optimal set of parameters and weights, which is read into the SIM-FCUGM module to simulate urban development. As the FCUGM is a loosely coupled model, the output of the CALIB-FCUGM is passed to the SIM-FCUGM by manual entry.
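The last two steps (stochastic disturbance, then the conditional statement) can be sketched as follows. The chapter does not give the form of the disturbance factor, so the additive uniform noise used here is purely an assumption, as are the function name, the 1 to 100 suitability scale and the example values.

```python
import random


def calibrated_urban_map(suitability, threshold, disturbance=0.0, seed=0):
    """Turn a development suitability map into a binary urban map.

    Each cell's possibility is its suitability plus a random disturbance
    (assumed additive and uniform); a cell becomes urban (1) when its
    possibility reaches the transition threshold, otherwise non-urban (0).
    """
    rng = random.Random(seed)
    urban = []
    for row in suitability:
        out_row = []
        for score in row:
            possibility = score + rng.uniform(-disturbance, disturbance)
            out_row.append(1 if possibility >= threshold else 0)
        urban.append(out_row)
    return urban


# Hypothetical suitability scores on a 1 to 100 scale.
suitability = [[10.0, 55.0, 90.0],
               [40.0, 70.0, 20.0]]
ucm = calibrated_urban_map(suitability, threshold=50.0, disturbance=0.0)
print(ucm)
```

With a non-zero `disturbance`, only cells whose suitability lies close to the threshold can change state between runs, which is the intended effect of a small stochastic term.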

### 4.2. Feasible solution of the CALIB-FCUGM

As stated earlier, the main aim of the CALIB-FCUGM is to find the optimal set of weights and parameters for each scenario of the FCUGM. Each candidate solution provided by the CALIB-FCUGM is a set of weights or parameters, each of which varies within its associated range (predefined upper and lower bounds). Table 2 shows the total number of weights and parameters that are calibrated for each scenario. As shown, the number of weights and parameters differs between scenarios because of the difference in the number of fuzzy variables employed in each scenario. This affects the number of input variables, weights, distance decay parameters and other parameters, because all of these are used to build the fuzzy variables.

| Modes and scenarios | Number of weights and parameters |
|---|---|
| Mode 1—Scenario 1 | 57 |
| Mode 1—Scenario 2 | 59 |
| Mode 1—Scenario 3 | 57 |
| Mode 1—Scenario 4 | 65 |
| Mode 2—Scenario 1 | 63 |
| Mode 2—Scenario 2 | 69 |
| Mode 2—Scenario 3 | 69 |
| Mode 2—Scenario 4 | 93 |
| Mode 3—Scenario 1 | 99 |

### 4.3. Objective function of CALIB-FCUGM

The performance of the GA, PSA or EK algorithms is evaluated based on the quality of the final solution acquired by the algorithm. The value of the objective function (cost function), which is also referred to as a fitness function in GA and an energy function in PSA, is the major criterion for assessing the quality of the final solution and hence the performance of the algorithm. The effectiveness of any iterative algorithm such as GA or PSA depends heavily on having an efficient objective function. The purpose of the objective function is to determine, for any given configuration of the search space, a value that represents the relative accuracy of that configuration or solution. In the CALIB-FCUGM context, the quality of a solution is expressed as an error, and the objective function aims to minimize the error between the UOM and the UCM.

There are several techniques for measuring errors that can be used in the FCUGM problem, such as total absolute error (TAE), mean absolute error (MAE), MSE, root mean square error (RMSE), normalized root mean squared error (NRMSE), relative operating characteristic (ROC), confusion matrix (CM) and Kappa Index of Agreement (KIA). The differences between observed and simulated images have been measured in different ways by various authors. A CM was used by Wu and Webster [29] to evaluate the accuracy of the simulated image against the observed one. The MSE and the MAE were used by Li and Yeh [30, 31] for measuring errors between simulated and observed images in a study involving modelling urban developments. The MSE was also used by Kim [32] for measuring the accuracy between the observed and probability images as a way of validating results from the calibration process. The NRMSE was used by Heppenstall [33] as a fitness function to validate the calibration results of a GA and to measure the error between the observed and predicted values of a spatial multi-agent model for petrol prices. In addition, Pontius and Schneider [34] applied and explained how to use the ROC technique to examine how well a probability map portrays the likely locations of a category of new development. The Leica ERDAS image processing application uses RMSE for measuring the error of image rectification and KIA for validating image classification results.

As a result, in the FCUGM, the authors selected two types of measures, one to verify the calibration results and the other to test the simulation results. Although most of the techniques are appropriate for verifying the performance of simulation processes, few of them are suitable for doing this for calibration. This is because the calibration process in the FCUGM requires the candidate solution to be assessed in each iteration, while in the simulation process, the results are verified once at the end. Consequently, the MSE and RMSE were selected to validate the results of the CALIB-FCUGM for several reasons. First, they are the most well-known and widely used techniques of error measurement [35]. Second, they are efficient for validating the performance in a cell-by-cell manner, which is the case in calibrating the FCUGM. They are calculated in this research as given in Eqs. 1 and 2:

$$\mathrm{MSE} = \frac{1}{n}\sum_{ij}\left(O_{ij} - C_{ij}\right)^{2} \tag{1}$$

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{ij}\left(O_{ij} - C_{ij}\right)^{2}} \tag{2}$$

where the MSE or RMSE serves as the objective function, computed over all locations *ij*; *O*_{ij} is the urban observed state at location *ij*; *C*_{ij} is the urban calibrated state at location *ij*; and *n* is the number of locations or cells.
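For binary maps, the cell-by-cell MSE is the mean of the squared differences between the observed and calibrated states, and the RMSE is its square root. A minimal sketch follows; the function names and the two small example maps are hypothetical.

```python
import math


def mse(observed, calibrated):
    """Mean square error between two equally sized binary maps."""
    pairs = [(o, c)
             for row_o, row_c in zip(observed, calibrated)
             for o, c in zip(row_o, row_c)]
    return sum((o - c) ** 2 for o, c in pairs) / len(pairs)


def rmse(observed, calibrated):
    """Root mean square error between two equally sized binary maps."""
    return math.sqrt(mse(observed, calibrated))


# Hypothetical 2 x 3 maps (1 = urban, 0 = non-urban); two of six cells differ.
uom = [[1, 1, 0],
       [0, 1, 0]]
ucm = [[1, 0, 0],
       [0, 1, 1]]
print(mse(uom, ucm), rmse(uom, ucm))
```

Because both maps are binary, the MSE here equals the fraction of mismatched cells, so it can be read directly as a misclassification rate.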

Although several research studies have applied only a straightforward objective or fitness function such as the MSE or RMSE as a measure of error, little attention has been paid to measuring the effect of constraints. It has been claimed that, because GA and PSA are stochastic algorithms, they have to be constrained to explore only the part of the search space with the desired values. The authors argue, however, that it is better to compute the overall net objective value (NOV) as well, because such a measure combines a weighting system for the objective functions with constraints implemented through penalty functions, which add to the overall objective value. The net objective value, therefore, is penalized as the set of design variables moves further out of bounds or fails to meet a constant constraint value. The NOV can be computed as shown in Eq. 3:

where *OF*_{i} is the objective function (MSE or RMSE) for a solution *i* and *PF*_{i} is the penalty function for a solution *i*.

It can be difficult, however, to compare NOV values from different experiments if the range and mean of the NOV are different in each case. Thus, to avoid this problem, the standardized net objective value (SNOV) will be used as shown in Eq. 4:

The penalty functions used in the CALIB-FCUGM cover two types of constraints: (i) equality (some calibrated parameter values have to be equal to a constraint value) and (ii) inequality (some calibrated parameter values have to be less than or greater than a constraint value). An example of an equality constraint is that the calibrated weights should sum to 100; if the total is more or less than 100, the net objective value is penalized by adding the difference to it, which makes the solution poorer.
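The penalty idea can be sketched as below. This is an illustration under assumptions, not the chapter's formula: the function names, the unit default weights and the simple weighted sum of objective and penalties are all hypothetical choices consistent with the description of penalties that add to the objective value.

```python
def equality_penalty(weights, required_total=100.0):
    """Penalty for weights that do not sum to the required total."""
    return abs(sum(weights) - required_total)


def inequality_penalty(value, lower, upper):
    """Penalty for a parameter outside its [lower, upper] bounds."""
    if value < lower:
        return lower - value
    if value > upper:
        return value - upper
    return 0.0


def net_objective_value(objective, penalties, objective_weight=1.0, penalty_weight=1.0):
    """Weighted objective plus weighted penalties (assumed additive form)."""
    return objective_weight * objective + penalty_weight * sum(penalties)


# Hypothetical candidate: weights sum to 105, so the equality penalty is 5.
weights = [40.0, 35.0, 30.0]
penalties = [equality_penalty(weights), inequality_penalty(0.5, 0.0, 1.0)]
nov = net_objective_value(objective=0.2, penalties=penalties)
print(nov)
```

A feasible candidate has all penalties equal to zero, in which case the net objective value reduces to the weighted error alone.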

### 4.4. Experimental design of calibration process

In order to calibrate the FCUGM and acquire the best set of parameters for generating a realistic simulation, several experiments were conducted. The experiments have eight aspects: (i) sample data set; (ii) calibration algorithms; (iii) mode; (iv) scenarios; (v) urban growth periods; (vi) training process; (vii) cross-validation process and (viii) calibration time. Figure 3 illustrates the process of the calibration experiments. First, the best sample size for calibration is specified. Then, this data set is divided equally into two parts, one called the ‘training data set’ and the other the ‘cross-validation data set’. The purpose of the former is to train the scenario of interest, while the latter is used to verify the calibrated parameters generated from the training data. The authors propose to calibrate the FCUGM urban growth process under three modes, which together comprise nine scenarios as previously explained.

Each scenario is calibrated over three periods: UGB I, UGB II and UGB I + II. The process starts by passing the training data sets into the CALIB-FCUGM, where the model is calibrated by three algorithms: GA, PSA and EK. Each scenario is calibrated five times for each period. Then, the parameters of the best solution are passed into the VALID-FCUGM, which holds the cross-validation data set, to verify the calibration results. The VALID-FCUGM is a static model, which validates the parameters off-line. This process is conducted for each scenario over the three periods. Afterwards, the performance of the scenarios in terms of training and validation is evaluated and the best scenario in each mode is selected. Next, the mean of the optimal parameters for the best scenarios from applying the three algorithms is reported and passed into the SIM-FCUGM for simulation purposes.

### 4.5. Calibration data set

In terms of the calibration data set, it might not be appropriate to use the whole study area as a training data set because the volume of data is very large and could require very high levels of computational resources, which would eventually affect the efficiency of the model. Moreover, spatial data are often not independent: the value of one observation is likely to be influenced by the value of another observation, so using the whole data set leads to the common problem of spatial autocorrelation (or spatial dependence), because values of variables at one location are more likely to be significantly associated with values at nearby locations. High spatial dependency between variables is likely to affect the accuracy of the analysis and might lead to misinterpretation of the results. Random sampling is, however, a conventional way to overcome this problem [28, 31, 36].

The authors could not find any rules in the scientific literature about the ‘best’ type and size of random samples for calibrating urban models. Li and Yeh [30] calibrated an urban CA model with artificial neural networks, training the model using a proportional stratified random sampling method with a total of 3000 cells. The samples were selected randomly and proportionally from different land use types, 50% (1500 points) being used as a training data set while the rest was used as a test data set to verify the training results. In another study, Li and Yeh [31] calibrated the same model but with binary urban states (urban and non-urban) by applying the same spatial sampling method but with a total of 1000 samples, 50% for training and the remainder for validating the training results. This suggests that the sample size decreases as the number of urban states decreases. There are many types of spatial sampling methods, such as random, systematic, proportional random stratified, disproportional random stratified and cluster sampling. It has been argued, however, that stratified sampling is better than simple random sampling, because the latter might supply redundant observations when sample locations are near one another [36] and may exclude some smaller urban categories [31]. In any event, systematic sampling is not appropriate for the FCUGM problem because the urban and non-urban locations are randomly located and not systematically distributed. A random sampling method was used because it depends on the variable that is being investigated rather than the size of the variable area. In the case of the FCUGM, the urban state is the state by which the urban growth process and pattern are represented and measured. Thus, particular attention needs to be focused on the locations of the urban state rather than non-urban ones. In this sense, urban state locations need more detailed monitoring, or over-sampling, while adequate coverage of the non-urban portion of the sampled area is maintained. As a result, the proposed random sampling places more samples at urban state locations, with 60% of the total samples, and 40% at non-urban locations. With respect to the size of the sample, Rogerson [36] claimed that it should be based on the accuracy that one seeks for estimation. Generally, the larger the sample, the more accurate the estimation of means and proportions. According to Rogerson [36], accurate estimates can generally be obtained by choosing the sample size according to Eq. 5.

where *n* is the sample size; *Z* is the critical value of the standard normal distribution for the chosen confidence level, that is, ±1.96 for a 95% confidence interval; and *W* is the width of the confidence interval.

Using Eq. 5, the total sample size for calibration in the FCUGM, with a 95% confidence interval and width within ±0.02, is ≈9600 samples. Fifty per cent of the total sample data set is randomly selected and used for training the calibration model, while the rest is used to verify the results of training, that is, 4800 cells were used for calibrating the model and 4800 cells were used for verifying the results of training.
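Eq. 5 itself is not reproduced in the text above. One form that is consistent with the stated values (Z = 1.96, W = 0.02, n ≈ 9600) is n = (Z / W)², and the sketch below uses it as an assumption. The function names and the synthetic cell lists are hypothetical; the 60%/40% urban/non-urban allocation and the 50/50 training/validation split follow the text.

```python
import random


def sample_size(z=1.96, width=0.02):
    # Assumed form of Eq. 5: n = (Z / W)^2, which gives 9604 (about 9600).
    return round((z / width) ** 2)


def draw_samples(urban_cells, non_urban_cells, n_total, urban_share=0.6, seed=0):
    """Over-sample urban cells (60%), then split the sample 50/50."""
    rng = random.Random(seed)
    n_urban = round(n_total * urban_share)
    sample = (rng.sample(urban_cells, n_urban)
              + rng.sample(non_urban_cells, n_total - n_urban))
    rng.shuffle(sample)  # mix urban and non-urban cells before splitting
    half = len(sample) // 2
    return sample[:half], sample[half:]  # (training, cross-validation)


# Synthetic cell identifiers stand in for the real raster cells.
urban = [("urban", i) for i in range(10000)]
non_urban = [("non-urban", i) for i in range(10000)]

print(sample_size())
training, validation = draw_samples(urban, non_urban, n_total=9600)
print(len(training), len(validation))
```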

## 5. The process of optimising algorithms within the CALIB-FCUGM

The basic theoretical foundations of GA and SA can be found in Refs. [37–40]. This section, however, examines these algorithms in relation to finding an optimal solution in the huge, non-linear and non-differentiable solution space of the FCUGM.

### 5.1. Genetic algorithms

Figure 4 shows how the GA works within the CALIB-FCUGM. Prior to starting the GA simulation, however, several decisions (FCUGM and GA parameters) have to be made, as shown in Figure 5. After suitable GA parameters have been selected, the GA simulation starts by generating an initial random population of a pre-specified number of chromosomes. Each chromosome is one candidate solution out of all potential solutions and is made up of a number of genes. Each gene represents one parameter or weight value that requires calibration and is encoded by a number of bits. Table 3 displays the urban development scenarios in the FCUGM and the number of genes and bits in their chromosomes.

Modes and scenarios | Number of genes in each chromosome | Number of bits in each chromosome |
---|---|---|
Mode 1—Scenario 1 | 57 | 570 |
Mode 1—Scenario 2 | 59 | 590 |
Mode 1—Scenario 3 | 57 | 570 |
Mode 1—Scenario 4 | 65 | 650 |
Mode 2—Scenario 1 | 63 | 630 |
Mode 2—Scenario 2 | 69 | 690 |
Mode 2—Scenario 3 | 69 | 690 |
Mode 2—Scenario 4 | 93 | 930 |
Mode 3—Scenario 1 | 99 | 990 |

Once the initial random population (set of solutions) has been generated, each chromosome (solution) is decoded from bits into values, that is, each parameter or weight is given a number within its bounds. The fitness of each individual solution in the initial population is then evaluated by calculating the error (according to Eqs. 1–4) between the UOM and the UCM and reporting the NOV. Then, the best solution (lowest NOV value, i.e., BNOV) in this initial population is saved.
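The decoding step can be illustrated with a short sketch. Table 3 implies 10 bits per gene (for example, 57 genes and 570 bits). The plain binary encoding and the linear scaling into each parameter's bounds used here are assumptions, as are the function name and the example bounds.

```python
BITS_PER_GENE = 10  # implied by Table 3: bits per chromosome = 10 x genes


def decode_chromosome(bits, bounds):
    """Map a flat list of 0/1 bits to one value per gene within its bounds.

    Assumes plain binary coding and linear scaling (not confirmed by the text).
    """
    if len(bits) != BITS_PER_GENE * len(bounds):
        raise ValueError("chromosome length does not match the number of genes")
    max_int = 2 ** BITS_PER_GENE - 1  # 1023, the largest 10-bit integer
    values = []
    for g, (lower, upper) in enumerate(bounds):
        gene = bits[g * BITS_PER_GENE:(g + 1) * BITS_PER_GENE]
        as_int = int("".join(str(b) for b in gene), 2)
        values.append(lower + (upper - lower) * as_int / max_int)
    return values


# Two hypothetical genes: a weight in [0, 100] and a parameter in [0, 1].
chromosome = [0] * 10 + [1] * 10
print(decode_chromosome(chromosome, [(0, 100), (0, 1)]))
```

With 10 bits, each parameter can take 1024 evenly spaced values between its lower and upper bound.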

To create possible solutions for the next evolution (next population), three types of operators are applied: selection, crossover and mutation. With the selection operator, two solutions are randomly selected in proportion to their fitness values (based on the probabilistic function of fitness). The lower the NOV value of a solution, the more often it is likely to be selected to reproduce in the next generation. Next, the crossover procedure, governed by the crossover rate, combines two solutions from the current evolution to produce two new solutions (offspring or children) for possible insertion in the next evolution. The mutation rules modify a solution by randomly altering one or more of the values of its parameters or weights, depending on the mutation rate. Then, the best solution (lowest NOV value) in this evolution is saved. This iterative process continues until the maximum number of evolutions has been performed (termination rule). CALIB-FCUGM checks whether the desired number of evolutions has been reached; if not, the population of the new evolution is decoded and the same iterative process continues. If the desired number of evolutions has been reached, the CALIB-FCUGM stops, evaluates the best solution in each evolution, selects the best one and reports the results of this solution in a form. The implications of selecting different GA parameters were examined by undertaking empirical experiments on different values of the parameters. Table 4 shows the best control parameters of the GA for the FCUGM problem, which are used for all subsequent experiments in this research.

GA parameters | Best options |
---|---|
Population size | Small (50) |
Selection method | Tournament |
Crossover probability | Medium (0.7) |
Crossover method | Single point |
Mutation probability | High (0.2) |
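A compact sketch of a GA loop using the settings in Table 4 (population of 50, tournament selection, single-point crossover with probability 0.7, mutation probability 0.2) is given below. It is an illustration under assumptions, not the CALIB-FCUGM implementation: the tournament size of 2, the interpretation of the mutation probability as "flip one random bit per child with probability 0.2", the elitist copy of the best chromosome and the toy objective are all hypothetical choices.

```python
import random


def tournament(population, scores, rng, k=2):
    # Pick k random individuals and return the one with the lowest score.
    picks = rng.sample(range(len(population)), k)
    return population[min(picks, key=lambda i: scores[i])]


def single_point_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # cut point strictly inside the chromosome
    return a[:cut] + b[cut:], b[:cut] + a[cut:]


def run_ga(objective, n_bits, generations=100, pop_size=50,
           p_cross=0.7, p_mut=0.2, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(n_bits)]
                  for _ in range(pop_size)]
    history = []  # best objective value of each generation
    for _ in range(generations):
        scores = [objective(ind) for ind in population]
        best = min(range(pop_size), key=lambda i: scores[i])
        history.append(scores[best])
        next_pop = [population[best][:]]  # elitism: keep the best unchanged
        while len(next_pop) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            if rng.random() < p_cross:
                a, b = single_point_crossover(a, b, rng)
            for child in (a[:], b[:]):  # copies, so parents stay untouched
                if rng.random() < p_mut:
                    pos = rng.randrange(n_bits)
                    child[pos] = 1 - child[pos]  # flip one random bit
                next_pop.append(child)
        population = next_pop[:pop_size]
    scores = [objective(ind) for ind in population]
    best = min(range(pop_size), key=lambda i: scores[i])
    history.append(scores[best])
    return population[best], history


# Toy objective (assumption): minimize the number of 1-bits.
best, history = run_ga(lambda bits: sum(bits), n_bits=30, generations=40)
print(history[0], history[-1])
```

Because the best chromosome is copied unchanged into each new population, the recorded best value can never get worse from one generation to the next.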

### 5.2. Parallel simulated annealing

Similar to the GA, several decisions have to be made before the PSA simulation starts, as shown in Figure 6. It is worth noting that the PSA differs from conventional SA in that a set of points (solutions) is run simultaneously at each value of the control parameter rather than a single solution. Figure 7 shows how the PSA works within the CALIB-FCUGM. The PSA simulation starts at a high temperature (control parameter) by generating a number of initial random solutions (points) from the feasible solutions, each denoted as S0. Then, the error between the UOM and the UCM is measured by computing the NOV; the resultant value is denoted as NOV(S0). The lower the value of NOV(S0), the better the solution S0. Minimizing NOV(S0) therefore minimizes the error (MSE and RMSE) while meeting the constraints.

After NOV(S0) has been calculated, a small change is made to the initial solution S0 using a perturbation mechanism in which two weights or two parameters are randomly selected and their values are exchanged. This yields a new solution denoted as S1. Subsequently, a new objective value NOV(S1) is calculated in the same way as NOV(S0). Then, the two objective values NOV(S0) and NOV(S1) are compared. Whether the new solution is accepted or not is based on the following conditions:

- If NOV(S1) < NOV(S0), the objective function has declined (the error has decreased), so the new solution S1 is accepted and replaces the current solution, that is, S0 = S1.
- If NOV(S1) > NOV(S0), the objective function has risen (the error has increased) and the new solution S1 is subjected to the Metropolis criterion: the acceptance probability exp((NOV(S0) – NOV(S1))/Ti), where Ti is the current temperature, is computed and compared with a uniformly distributed random number, R, between 0.0 and 1.0.
  - If R ≤ exp((NOV(S0) – NOV(S1))/Ti), the new solution is accepted and replaces the current solution.
  - If R > exp((NOV(S0) – NOV(S1))/Ti), the new solution is rejected and the current solution is kept.
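The acceptance conditions above can be written as a single function. The function and argument names are hypothetical; the rule itself follows the text. The case NOV(S1) = NOV(S0), which the text does not list, falls into the Metropolis branch here, where the probability is exp(0) = 1.

```python
import math
import random


def accept_move(nov_current, nov_new, temperature, r=None):
    """Decide whether the new solution S1 replaces the current solution S0.

    r is the uniform random number R in [0, 1); it is drawn if not supplied.
    """
    if nov_new < nov_current:
        return True  # the error decreased: always accept
    if r is None:
        r = random.random()
    # Metropolis criterion: worse solutions are accepted with a probability
    # that shrinks as the error increase grows or the temperature falls.
    probability = math.exp((nov_current - nov_new) / temperature)
    return r <= probability


# The same worsening move (0.2 -> 0.3) at a high and at a low temperature.
print(accept_move(0.2, 0.3, temperature=1.0, r=0.5))
print(accept_move(0.2, 0.3, temperature=0.01, r=0.5))
```

At high temperatures almost any worsening move passes, which lets the search escape local minima; as the temperature falls, the same move is increasingly likely to be rejected.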

The preceding process is regarded as one iteration of the SA algorithm. It is repeated until the predefined number of successful moves (SM) at this particular temperature step is met. Once the number of SM is met, a quasi-equilibrium state is taken to have been reached at control parameter step *N*, and the control parameter is reduced according to the predefined cooling function and cooling rate α. The process continues for a new control parameter step *N* + 1 unless the termination rule is met, i.e., the final control parameter Tf = 0.1. At this control parameter value, the algorithm stops and reports the best solution found. The implications of selecting different PSA parameters were examined by undertaking empirical experiments on different values of the parameters. Table 5 shows the best control parameters of the PSA for the FCUGM problem, which are used for all subsequent experiments in this research.

PSA parameters | Best parameter options and their values |
---|---|
Initial temperature | Moderate temperature—MT (60%) |
Cooling function | Exponential cooling function—ECF |
Cooling rate | Slow cooling—MC (0.9) |
Number of successful moves | Medium number of successes—MNS (60) |
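The annealing schedule and the swap perturbation can be sketched as follows. Several points are annealed under the same schedule, one after another, which only imitates the simultaneous runs of the PSA. The cooling rate (0.9), the number of successful moves (60) and the final temperature (0.1) come from the text; the geometric form T(i+1) = 0.9 × T(i) for the exponential cooling function, the numeric initial temperature, the cap on trials per step and the toy objective are assumptions.

```python
import math
import random


def cooling_schedule(t_initial, alpha=0.9, t_final=0.1):
    """Temperatures T0, alpha*T0, alpha^2*T0, ... not falling below t_final."""
    temps, t = [], t_initial
    while t >= t_final:
        temps.append(t)
        t *= alpha
    return temps


def swap_two(solution, rng):
    # Perturbation from the text: exchange the values of two random entries.
    i, j = rng.sample(range(len(solution)), 2)
    new = solution[:]
    new[i], new[j] = new[j], new[i]
    return new


def anneal_points(objective, points, t_initial, successes_per_step=60,
                  max_trials_per_step=1000, seed=0):
    rng = random.Random(seed)
    points = [p[:] for p in points]
    costs = [objective(p) for p in points]
    k_best = min(range(len(points)), key=lambda k: costs[k])
    best, best_cost = points[k_best][:], costs[k_best]
    for t in cooling_schedule(t_initial):
        for k in range(len(points)):
            successes = trials = 0
            # max_trials_per_step is an added safeguard so a step always ends.
            while successes < successes_per_step and trials < max_trials_per_step:
                trials += 1
                candidate = swap_two(points[k], rng)
                cost = objective(candidate)
                delta = cost - costs[k]
                if delta < 0 or rng.random() <= math.exp(-delta / t):
                    points[k], costs[k] = candidate, cost
                    successes += 1
                    if cost < best_cost:
                        best, best_cost = candidate[:], cost
    return best, best_cost


# Toy problem (assumption): reorder the values to match a target ordering.
target = [0, 1, 2, 3, 4, 5]
cost = lambda s: sum(abs(a - b) for a, b in zip(s, target))
starts = [[5, 4, 3, 2, 1, 0], [3, 0, 5, 1, 4, 2], [1, 0, 3, 2, 5, 4]]
best, best_cost = anneal_points(cost, starts, t_initial=1.0)
print(len(cooling_schedule(1.0)), best_cost)
```

With a cooling rate of 0.9 and a hypothetical starting temperature of 1.0, the schedule reaches the final temperature of 0.1 after 22 steps.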

### 5.3. Expert knowledge

In contrast to GA and PSA, by using EK the proper parameters and weights for the urban model are derived intuitively and empirically rather than automatically. In relation to the urban CA models, most studies calibrate parameters using a trial and error approach that combines the experience of the analyst. For example, in Ref. [29], the weights of urban factors and urban agglomeration are calibrated based on the analyst’s views. The effect of distance decay parameters is calibrated empirically by Cheng and Masser [41] and Ward et al. [42]. In the FCUGM, the EK approach is not entirely based on the analyst’s perspective. The parameters are calibrated on the foundation of the spatial structural analysis as well as the urban planner’s experience. Thus, the calibration is not wholly qualitative in relying on a planner’s view because quantitative results from the initial spatial structural analysis are used. Even so, the large number of parameters in some scenarios makes it very difficult for an expert to derive the proper parameter values.

## 6. Results and discussion

In this section, the FCUGM is calibrated using real data, and the meaning of the calibrated values, the consistency of the calibration results and the accuracy of training and validation are discussed. In order to investigate the characteristics and features of the urban growth factors that might generate and affect the urban growth pattern of Riyadh city over the last 18 years, this period was divided into two intervals, namely UGB I and UGB II. The former represents the urban growth between 1987 and 1997, the latter that between 1997 and 2005. This division is not arbitrary; it corresponds approximately to the two intervals stated in the Government resolution on Urban Growth Boundary Policy. By calibrating the FCUGM over these two intervals, the authors are able to assess the results and compare growth trends. The authors argue that combining the two periods (UGB I and UGB II) into one period (UGB I+II), which represents the urban growth between 1987 and 2005 and allows the model to be calibrated over the 18 years at once, might provide an insight into changes in urban growth patterns. The FCUGM was therefore calibrated for three periods: UGB I, UGB II and UGB I+II. The calibration process was carried out on nine different scenarios for each period, which are based on different urban growth factors and different transition rules. Thus, one can examine which scenarios perform best over each period and to what extent they correspond to the best scenarios over other periods.

As mentioned previously, the CALIB-FCUGM produced figures that show the progress of evolution and temperature for the GA and PSA, respectively. Figures 8A–J and 9A–J show 90 least-so-far standardized best net objective value (SBNOV) curves, five for each scenario, together with a mean SBNOV curve for each scenario, obtained by calibrating the FCUGM with the GA and PSA over the period UGB I + II. In terms of the progressive patterns, Figures 8A–J and 9A–J show that the curves are concave, decreasing as the evolution increases in the GA and as the temperature decreases in the PSA, i.e., the SBNOV declines as the evolution and temperature progress. Nevertheless, the degree of decrease and the starting and ending values of SBNOV vary from run to run, from one scenario to another and across the algorithms. Some curves decrease steeply in the early stages of evolution or temperature, while others decrease steadily in the middle or late stages. The variation in the starting points of the GA might be attributed to dissimilar genetic characteristics in the different starting chromosomes. In the PSA, it might be because of the initial random states at different starting points. Broadly, convergence towards the global solution (lowest SBNOV) decelerates as the evolution and temperature progress. The variation in ending points (the ends of the curves’ tails) of the GA and PSA might arise because most runs converge to within a narrow range but generally do not converge to the same value. This suggests that some runs performed better than others, and that some were possibly trapped in local minima.
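A least-so-far curve is the running minimum of the objective values recorded during a run, and the mean curve averages the runs of one scenario step by step. The sketch below shows both; the function names and the example values are hypothetical, and the standardization that turns BNOV into SBNOV is not specified in this section, so it is not reproduced.

```python
from itertools import accumulate


def least_so_far(values):
    # Running minimum: the best (lowest) value seen up to each step.
    return list(accumulate(values, min))


def mean_curve(curves):
    # Step-by-step average over runs of equal length.
    return [sum(step) / len(step) for step in zip(*curves)]


# Made-up objective values for two runs of one scenario.
run_a = least_so_far([0.9, 0.7, 0.8, 0.4, 0.5])
run_b = least_so_far([0.5, 0.6, 0.3, 0.3, 0.2])
print(run_a)
print(mean_curve([run_a, run_b]))
```

By construction such curves never rise, which is why the plotted SBNOV can only decline or stay flat as the evolution or temperature progresses.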

In relation to the progress of the GA’s evolution against the SBNOV, it can be seen that the SBNOV of the GA decreases in a consistent manner. For example, the SBNOV for most scenarios decreases exponentially, to different degrees and with little noise, indicating that errors decrease as evolution progresses. This suggests that the elitism feature of the GA, by which the best chromosome (solution) survives (passes) into the next evolution without any change, is working well. In contrast, the PSA shows considerable variation in the reduction of SBNOV against the PSA’s temperature in different scenarios and modes. One possible reason for this variation is that the computation became stuck in local minima, as shown in most scenarios. This is evident in the case of Mode 1—Scenario 2, Mode 1—Scenario 3 and Mode 1—Scenario 4, where the value of SBNOV decreases sharply at the early high temperatures (first quarter) but afterwards (over the last three quarters) there is little or even no reduction of SBNOV. With respect to the convergence to the best global solution, it can be seen that most of the scenarios in the GA converged to a very low SBNOV, broadly below 0.1, indicating positive performance of the algorithm across most scenarios. The strongest convergence to the global solution is shown by Mode 1—Scenario 4, Mode 2—Scenario 4 and Mode 3—Scenario 1, where most evolution curves converge to a very narrow range towards the curves’ tails. This supports the argument generated as a result of the uncertainty and sensitivity analysis discussed above, that these three scenarios produced the highest certainty values. It also suggests that the structure of these scenarios and the urban growth factors embedded in them are the most appropriate for understanding urban growth processes. The convergence to the best global solution in the PSA was, however, varied, without any apparent pattern of convergence. The variations were evident not only between scenarios but also between runs within a single scenario. For example, Mode 1—Scenario 4 converges to a different solution with a different SBNOV in each run, with the SBNOV ranging between 0.1 and 0.5. Thus, it can be deduced that the PSA yielded poor solutions with inconsistent convergence to the global solution. Only Mode 2—Scenario 3 and Mode 2—Scenario 4 showed better convergence to a low SBNOV for most of their runs.

Figure 10A–F shows a comparison of the mean standardized best net objective value (SBNOV) in terms of all runs, training and validation of the optimum solution found by running CALIB-FCUGM five times for each scenario using GA, PSA and EK. It can be seen from Figure 10A–F that, while there is some variation, there is broad correspondence in the performance of calibration between the algorithms, in terms of overall accuracy and validation. In terms of algorithm, the GA broadly produces highly consistent results with relatively low variation among different runs for each scenario. It can easily be observed that Scenario 4 in Mode 1, Scenario 4 in Mode 2 and Scenario 1 in Mode 3 account for the lowest SBNOV generated from different runs. Scenario 2 in Mode 1 and Scenario 3 in Mode 2 yield the worst solutions, with high SBNOV in most runs.

In contrast, the PSA produced relatively inconsistent results, which led to difficulties in observing the accuracy of each scenario. In addition, GA and EK have a similar pattern of accuracy across scenarios, with little variation in magnitude. For example, they gained similar levels of SBNOV accuracy in Mode 1—Scenario 1, Mode 1—Scenario 2, Mode 1—Scenario 4, Mode 2—Scenario 1, Mode 2—Scenario 2 and Mode 3—Scenario 1 but differ slightly in the remaining scenarios. This suggests that the GA was capable, to some extent, of understanding the urban growth process and the underlying relationship between input factors in a way similar to human experts. It also suggests that the two algorithms broadly agree about the efficiency of scenarios in terms of modelling urban growth. In contrast, the results of the PSA do not correspond to those of the GA or EK. This might suggest that the complexity of the urban process is beyond the algorithm’s capability, as will be seen when we come to assess the accuracy of results.

With respect to the accuracy of scenarios, it can be seen that Mode 3—Scenario 1 produced the highest levels of accuracy across all three algorithms, while Mode 2—Scenario 1 generated the worst solution. The high accuracy of Mode 3 might be attributed to the structure of this scenario, which includes three fuzzy variables in each fuzzy rule, that is, each fuzzy rule includes all three urban growth factors (TSF, UAAF and TCF). In addition, this high accuracy of Mode 3—Scenario 1 agrees with the results of the uncertainty and sensitivity analysis, in which this scenario had the lowest uncertainty compared with the others. The worst solution was produced by Mode 2—Scenario 1. This might be related to two factors: (i) the structure of the scenario and (ii) the type of driving forces employed in this scenario (TSF and TCF), that is, these two forces are not capable in this scenario of explaining the urban process of Riyadh. The low performance of this scenario is also revealed in the uncertainty and sensitivity analysis, indicating a weakness in the structure of this scenario. Figure 11 shows the observed urban maps for 1987, 1997 and 2005, while Figures 12–14 show the simulated urban growth during the three periods UGB I (1987–1997), UGB II (1997–2005) and UGB I+II (1987–2005), respectively, generated from the best scenarios: M1—S4, M2—S4 and M3—S1.

In relation to the validation of the calibration results, it can be observed that the GA and EK show validation results that are very close to one another and correspond closely to the training results, whereas the PSA presents a poorer match. For example, the GA and EK have identical training and validation results in all scenarios except Mode 2—Scenario 1 and Mode 2—Scenario 3 in GA and EK, respectively. In the PSA, the validation results match the training results in only four scenarios (Mode 1—Scenario 1, Mode 1—Scenario 2, Mode 1—Scenario 4 and Mode 2—Scenario 1), while in the remaining five scenarios they contradict one another. This implies that the GA and EK are better than the PSA, indicating that they work well not only for the data with which they were trained but also for other data sets. Thus, in terms of generalization, it might be deduced that the CALIB-FCUGM, using the GA or EK, can be used to calibrate different data sets from different times and locations.

## 7. Conclusion

In this chapter, theory underlying the CALIB-FCUGM has been applied to calibrate the FCUGM for Riyadh in Saudi Arabia. This chapter can broadly be divided into three main parts: uncertainty and global sensitivity analysis; calibration of the FCUGM; and results and discussion of calibrating the FCUGM. This chapter began by undertaking uncertainty and global sensitivity analysis on the scenarios in the FCUGM, which showed that the different structures of scenarios have different levels of uncertainty. It was found that Mode 3—Scenario 1, Mode 2—Scenario 4 and Mode 1—Scenario 4 generated the best performance, with the lowest uncertainty values, where 90% of the occurrences (iterations) of the Monte Carlo simulation for those scenarios gained the lowest error in terms of the objective function of the CALIB-FCUGM. After that, the technical stages of the calibration of the FCUGM were examined. These included the feasible solution, objective function, experimental design and calibration data set. This was followed by outlining the detailed processes of the optimization algorithms (GA, PSA and EK) within the CALIB-FCUGM. Next, empirical experiments were conducted to investigate the best control parameters of the GA and PSA for the FCUGM problem. It was found that the best GA and PSA parameters for the FCUGM problem had some similarity but differed with respect to problem in geography and non-geography. Finally, the FCUGM was calibrated under nine scenarios over three periods using three optimization algorithms. It was revealed that scenarios Mode 3—Scenario 1, Mode 2—Scenario 4 and Mode 1—Scenario 4 produced the best performance among the nine scenarios; this result is similar to that found in the uncertainty and global sensitivity analysis. The first reason for this is that the driving forces (TSF, UAAF or TCF) were embedded in those scenarios. This indicated that the spatial patterns of urban growth for Riyadh can be better understood by the three forces all together. 
The second reason can be attributed to the structure of the fuzzy transition rules. For example, Mode 3—Scenario 1 embedded all three driving forces in each fuzzy rule and produced the most accurate results compared with the other scenarios, whose rule structures embedded only one or two driving forces.

It was found that the GA, followed by EK, produced better, more accurate and more consistent results than the PSA. This suggests that the GA was able, to some extent, to understand the urban growth process and the underlying relationship between input factors in a way similar to human experts. It also suggests that the two algorithms (GA and EK) broadly agree about the efficiency of scenarios in terms of modelling urban growth. In contrast, the results of the PSA do not correspond to those of the GA or EK. This suggests that the complexity of the urban process is beyond the algorithm’s capability or that the algorithm was trapped in local optima. Investigation into the CALIB-FCUGM results over different urban growth periods indicated that, where the spatial pattern is more compact, the calibration results are more accurate. The calibration over the period UGB I + II, followed by UGB I, produced better results than the calibration over UGB II. This can be explained by the characteristics of the spatial pattern of urban growth in each period. UGB I+II, followed by UGB I, experienced edge expansion (relatively compact pattern), while UGB II faced in-filling development (dispersed compact pattern).

To sum up, CALIB-FCUGM was to a large extent able to calibrate the FCUGM over different growth periods under different scenarios using different algorithms. Although some algorithms and scenarios showed average performance, others revealed high capability for calibrating the model well. With this satisfactory calibration of the FCUGM for the urban growth of Riyadh by using CALIB-FCUGM, these calibrated parameters will be passed into the SIM-FCUGM to simulate the spatial patterns of urban growth of Riyadh.

## Acknowledgments

The authors acknowledge with gratitude King Abdulaziz City for Science and Technology for the accomplishment of this work.