Characteristics of the contamination sources and extraction wells.
Characterization of groundwater contamination sources is a complex inverse problem. This inverse problem becomes complicated, due to the nonlinear nature of the groundwater flow and transport processes and the associated natural uncertainties. The mathematical challenges arise due to the nonunique characteristics of this problem resulting from the nonunique response of the aquifer system to a set of stresses and the possibility of instead locating only local optimal solutions. The linked simulation‐optimization model is an efficient approach to identifying groundwater contamination source characteristics. Efficiency and accuracy of the search for optimum solutions of a linked simulation‐optimization depend on the utilized optimization algorithm. This limited study focuses on the application and efficiency of simulated annealing (SA) as the optimization algorithm for solving the source characterization problem. The advantages in using adaptive simulated algorithm (ASA) as an alternative are then evaluated. The possibility of identifying a local optimal solution rather than a global optimal solution when using SA implies failure to solve the source characterization inverse problem. The cost of such inaccurate characterization may be enormous when a remediation strategy is based on the model inferences. ASA is shown to provide a reliable and acceptable alternative for solving this challenging aquifer contamination problem.
- groundwater contamination
- adaptive simulated annealing
- source characterization
One of the efficient methodologies for identifying groundwater contamination source characteristics is the linked simulation‐optimization model . Efficiency and accuracy of the optimal solutions for this type of inverse models, which are often complex, nonlinear, and large scale, depend on the efficiency and accuracy of the optimization algorithm. Simulated annealing (SA) and adaptive simulated annealing (ASA) are two efficient evolutionary optimization algorithms which have been recently applied for solving such a large‐scale nonlinear and complex linked simulation‐optimization models for optimal characterization of unknown contaminant sources in groundwater systems. This chapter discusses the application of these two optimization algorithms and their relative performances in identifying the characteristics of contamination sources in groundwater systems. Adaptive simulated annealing is a modified version of simulated annealing where the optimization parameters are tuned automatically . The advantages of using the ASA algorithm are demonstrated for an illustrative groundwater contamination‐related problem, and the relative efficiency and accuracy of these two optimization algorithms, SA and ASA, are compared. The adaptive algorithm, ASA, is shown to be computationally more efficient and more suitable for searching for a globally optimal solution for a complex nonlinear optimization model representing complex flow and contaminant transport process in a contaminated groundwater aquifer.
Effective groundwater pollution management and remediation depend on identifying the unknown pollution source and reconstructing their release history [3, 4]. The optimal and accurate identification of the contaminant sources plays an important role in modeling of subsurface transport processes and in reducing the long‐term contamination remediation cost. The source identification problem deals with the spatial and temporal variations of the location, activity duration, and the injection rate of the pollutant sources and is mostly inferred from the available sparse and sometimes erroneous concentration measurements at the site. Mainly, source identification includes a simulation problem, such as groundwater flow and pollutant transport models, used to estimate past phenomena or predict future scenarios.
A linked optimization simulation‐based methodology is often the viable and efficient approach for source identification in a regional‐scale aquifer. The unknown contamination source identification in a contaminated aquifer is generally a very complex, ill‐posed, and nonunique problem . The nonuniqueness can be caused by sparsity of field measurement data or due to the inefficiency of the optimization algorithm to reach a global optimal solution. Designed monitoring networks [6–9] can reduce the nonuniqueness related to data availability. However, the nonuniqueness related to the search for a single global optimal solution to the inverse problem depends on the efficiency of the optimization algorithm. Other approach for source identification consists of solving the differential equations backwards in time (inverse problem). The random walk particle method [10, 11], the quasi‐reversibility technique , the minimum relative entropy method , the Bayesian theory and geostatistical techniques , and genetic algorithm [15, 16] are some examples of this approach.
A simulation‐optimization methodology couples the forward time contaminant simulation model with optimization techniques. If an optimization problem is solvable and every minimization sequence converges to a unique answer, it is called stable . This methodology avoids the problem of stability associated with formally solving the inverse problem, but the iterative nature of simulation models usually requires increased computational effort. Many techniques were proposed utilizing coupled simulation‐optimization, and a few representative ones are: response matrix [18, 19], embedded optimization , and linked simulation and ptimization [3, 15, 20].
The two limitations of the response matrix approach are as follows: it is based on the premise that the superposition principle is approximately valid in terms of flow and contaminant transport in the aquifer, and the aquifer parameters must be known and the simulation model must be used to generate the response matrix prior to running the source identification model . Mahar and Datta  showed that the embedding methods need large computer storage and computational time, for large aquifers. Gorelick and Evans  concluded that numerical difficulties are likely to arise for large‐scale problems using the embedding technique.
To conduct unknown pollutant source characterization in large‐scale aquifers and real areas, linked simulation‐optimization methodology was proposed. In this methodology, the numerical models for simulation of the flow and transport process are internally linked to the optimization algorithm. Chadalavada and Datta  and Amirabdollahian and Datta  presented an overview of the pollution source identification simulation‐optimization approaches.
The linked simulation‐optimization model is an efficient and effective technique to characterize the contaminant sources by internal linkage between flow and contaminant transport simulation models and the selected optimization technique. This methodology can solve contaminant source problems in fairly large study areas.
Evolutionary optimization algorithms have made it possible to solve complex linked simulation‐optimization models, which are difficult to solve, or difficult to even obtain feasible solutions, when utilizing classical optimization tools. Moreover, there is less limitation in mathematical definition of objective function and constraints compared to former optimization algorithms such as linear programming . Finally, evolutionary algorithms can optimally identify relatively large number of decision variables , and utilization of the evolutionary optimization algorithms simplifies the linking process. Examples of the evolutionary optimization algorithms include: genetic algorithm (GA) , tabu search (TS) , simulated annealing (SA), adaptive simulated annealing (ASA) , and differential evolution algorithm . Yeh  and Datta and Kourakos  presented an overview on various optimization methods coupled with simulation techniques utilized for groundwater quantity management, and quality management, respectively.
In a linked simulation‐optimization approach, the optimization algorithm is used as an efficient search tool and the accuracy and efficiency of the methodology depend on the selected optimization algorithm. In groundwater contaminant source characterization problems, even when there are no errors or uncertainties associated with the inputs and parameter estimates for the physical process simulation model, there may not be a unique solution to the inverse problem due to nonunique physical response of the system. The ill‐posed nature of the inverse problem is a different issue and is predominantly due to sparsity of measurement data, which can be addressed by designing and implementing a suitable concentration monitoring network . The ill‐posed nature of the inverse problem and the plausibility of nonunique solutions can be interrelated as well. More efficient monitoring networks can reduce the plausibility of nonunique solutions, and therefore, optimal monitoring network design is a related issue [9, 22]. Also, only if the global optimum solution is found, it may represent the accurate solution to the source identification problem. Nonuniqueness in the system response may introduce alternate optimal solutions, although each globally optimal [5, 20, 21]. This is due to the fact that different sets of stresses (i.e., contaminant sources) can result in identical responses (resulting spatial and temporal concentrations). Therefore, nonuniqueness is inherent, independent of errors in parameter estimates or measurements. In addition, if the optimal solution is not the global solution, it itself introduces nonuniqueness due to local optimal solutions being wrongly identified as the optimal solution of the inverse problem. In the optimal contaminant source identification process, this global optimum generally represents the actual contaminant source characteristics. Failure to identify the global optimal and the plausible local optimal solutions introduces nonuniqueness in the solution space. Therefore, efficiency of the optimization algorithm to reach a global optimum solution, or near optimal solution, is crucial to accurate source identification. Efficiency of algorithms like ASA can be extremely useful especially when compared to SA which needs tuning of optimization parameters, hence rendering the search for a global optimal somewhat subjective or more uncertain.
In this chapter, two evolutionary optimization algorithms are described: SA and ASA. Simulated annealing (SA) approaches the optimization model like a bouncing ball, which bounces over mountains from valley to valley. The SA controlling parameter is temperature (T) which mimics the effect of fast moving particle in a hot object like hot molten metal; as the T decreases and reaches relatively colder states, the height of the ball bounce decreases and it settles gradually in the deepest valley. To reach the optimal solution, there are many parameters which need to be tuned, such as probability density function, acceptance probability density function, and re‐annealing temperature schedule. ASA is a variant of SA in which the automated re‐annealing temperature schedule and random step selection make the algorithm less sensitive to the user‐defined parameters. One of the issues in selecting a suitable and efficient optimization algorithm for solution of an optimization model is the likelihood of reaching a global optimum solution. It has been shown that SA is relatively more efficient in reaching a better optimal solution compared to GA . The added advantage of using ASA is the elimination of the requirement to choose all the relevant optimization parameters appropriately, a process very much dependent on the structure and nature of the optimization model to be solved. ASA also eliminated the need for several trial executions of the model, to adjust the parameters . Therefore, the possibility of reaching a global optimal solution faster is also enhanced by utilizing ASA. Very fast simulated re‐annealing (VFSR) developed in 1987 is the first version of ASA . Ingber and Rosen  showed that VFSR is about one order of magnitude faster than GA in convergence speed and is more likely to find the global optimum.
This study investigates the applicability of ASA to unknown groundwater contaminant source release history reconstruction problem and compares its performance to SA‐based solutions. The performance evaluation of linked simulation‐optimization approach is based on a realistic scenario where contaminant concentration measurements are available a few years after the sources have ceased to exist. Apart from the convergence speeds, the two algorithms are compared for their performance in recovering accurate source release histories in terms of source location, magnitude, and duration of activity.
The pollutant source characteristics which are required to be addressed in an identification procedure are: (1) source release history (time); (2) source locations; and (3) source flux magnitudes. The linked simulation‐optimization model consists of an optimization algorithm which specifies the candidate source characteristics. A simulation model which is linked to the optimization algorithm uses the candidate characteristics to simulate the contaminant concentration at monitoring locations at various time intervals. The optimization algorithm is used to minimize the objective function representing the differences between measured concentrations and simulated ones. This process evolves until the algorithm reaches the optimal solution or a specified stopping criterion. This methodology for identification of unknown groundwater contamination sources has two major components: numerical groundwater flow and transport simulation models and linked simulation‐optimization model.
3.1. Groundwater flow and transport simulation models
Groundwater flow simulation model used in this study is MODFLOW‐2000 . MODFLOW is a computer program that numerically solves the three‐dimensional ground water flow equation for a porous medium by using a finite‐difference method. The partial differential equation for transient groundwater flow utilized in MODFLOW is given by the following equation :
where Kxx , Kyy and Kzz are the hydraulic conductivities (L/T) along the x, y, and z coordinate axes which are assumed to be parallel to the principal axes of hydraulic conductivity, respectively. H, SS, and t are the potentiometric head (L), the specific storage of the porous material (L-1), and time, respectively. W is the volumetric flux per unit volume representing sources and/or sinks of water, where W < 0 for flow moving out of the groundwater system, and W > 0 for flow moving in (T-1). When combined with boundary and initial conditions, Eq. (1) describes transient three‐dimensional groundwater flow in a heterogeneous and anisotropic medium, assuming that the principal axes of hydraulic conductivity are aligned with the coordinate directions.
The contaminant transport simulation model which is used in this study is chosen as MT3DMS . The partial differential equation describing three‐dimensional transport of contaminants in groundwater can be written as follows :
where C is the solute concentration in groundwater (ML-3); t is the time (T); j and k are the Cartesian coordinates along axes; uj is the groundwater velocity (LT-1); Dj,k is the dispersion coefficient tensor (L2T-1); qi is the flux of contaminant concentration for source i (MT-1); and are chemical reaction terms (ML-3T-1).
The MT3DMS transport model uses a mixed Eulerian‐Lagrangian approach to the solution of the three‐dimensional advective‐dispersive‐reactive equation . The groundwater velocity values (uj), estimated by the flow model, are used by the transport model to estimate concentration values. The estimated concentrations (C) are transferred to the optimization model (in the linked simulation‐optimization approach) to evaluate the objective function value.
3.2. Linked simulation‐optimization method
For completeness, a brief description of the formulation for the linked simulation‐optimization source identification framework is presented. More details can be found in Mahar and Datta . Based on the available background information about the site, the set of potential contaminant source locations is assumed to be known. The optimization model estimates the optimal contaminant fluxes associated with each potential source location at each stress period. The objective function minimizes the weighted sum of normalized differences between temporal and spatial observed and simulated concentrations subject to upper and lower bounds on source fluxes. The optimal source identification model is defined by the objective function 3, subject to the constraints 4 and 5.
where nk, nob, and N are the total number of concentration observation time periods, available monitoring locations, and candidate source locations, respectively. and are the concentration estimated by the simulation model and observed concentration at observation location iob and at the end of time period k, respectively. D, HC, and θ are the dispersion coefficient, hydraulic conductivity, and porosity, respectively. xi, yi, zi, and qi are the Cartesian coordinates of candidate contaminant source i and the contaminant release flux for candidate location i, respectively. is the upper bound for contaminant release fluxes.
The constraint set 4 represents the flow and contaminant transport simulation models, and it couples the simulation model and optimization algorithm. Eq. (5) limits the candidate contaminant flux values, at each potential location, to an upper bound. In Eq. (3), α is a constant, which is sufficiently large to prevent any individual term in Eq. (3) becoming indeterminate due to the observed value of concentration becoming very small. Adding this parameter variable also prevents domination of the obtained solution by deviation between measured and simulated concentrations corresponding to low concentration measurement values . Figure 1 shows a schematic representation of the linked simulation‐optimization source identification process using evolutionary optimization algorithms.
4. Performance evaluation
A three‐dimensional transient illustrative groundwater contamination problem is utilized to compare the efficiency and accuracy of SA and ASA optimization algorithm in the context of this research. First, the illustrative groundwater system is defined. Arbitrary monitoring locations are selected, and flow and transport simulation models are used to estimate contaminant concentrations at monitoring locations and times. Specifically for the performance evaluation purpose, these simulated concentration measurement values are to be used as observed concentration in the linked simulation‐optimization source identification model. Using this performance evaluation procedure with synthetic data for an illustrative example facilitates the comparison between the application of SA and ASA optimization algorithms, without the need to consider the reliability of model properties, measurement accuracies, and parameters estimation errors.
4.1. Illustrative groundwater contamination problem
The performance of the proposed methodology is evaluated for a three‐dimensional illustrative groundwater aquifer study area. Figure 2 shows the plan view of the three‐dimensional study area measuring 1500 m × 1000 m × 36 m and consisting of two unconfined layers. The top, bottom, and left side boundaries have a specified head, and the right‐hand side boundary has variable head boundary conditions. The triangular signs show the location of active extraction wells (sinks). The candidate contaminant source locations are shown by square signs. Two of them are actual sources and one is a dummy. The dummy (not actual) source is introduced to evaluate the accuracy of the proposed methodology in correctly identifying actual sources. Twelve concentration monitoring locations are specified. It is assumed that the contaminant source fluxes are the same in every stress period. The study period is divided into five stress periods. Table 1 shows the length of stress periods and the extraction wells and contaminant sources’ properties. The field hydrogeological parameters are given in Table 2 .
|183 days||183 days||183 days||183 days||2196 days|
|Flow rate (L/day)||23||16||1||500|
|Number of cells in x‐direction||–||20|
|Number of cells in y‐direction||–||30|
|Number of cells in z‐direction||m||2|
|Horizontal hydraulic conductivity (1st and 2nd layers)||m/d||25, 18|
|Vertical hydraulic conductivity||m/d||3|
|Horizontal transverse dispersivity||m||2|
|Vertical transverse dispersivity||m||1|
|Initial contaminant concentration||ppm||0|
|Upper and lower bounds for source fluxes||g/s||0–100|
4.2. Contaminant source identification process
For the purpose of identifying contaminant source characteristics, all sources are considered to be active during all five stress periods, and the pollutant flux from each of the sources is assumed to be constant over a specified stress period. In order to evaluate the model performance, one dummy (not actual) source is also introduced as a potential source. Therefore, the source identification decision variables are the contaminant fluxes at three potential source locations for each stress period, and in total, there are 15 decision variables to be identified.
Since the performance is evaluated using illustrative problem, flow and transport simulation models are utilized to find contaminant concentration at monitoring locations using the actual source fluxes. These values are used in the linked simulation‐optimization process as observed concentrations. The initial source fluxes are set to 0 for all sources and stress periods. In real‐life contaminant source identification problems, the observed concentrations collected in the field will be used to find optimal source characteristics.
5. Results and discussion
The applicability of these two algorithms is compared in terms of efficiency and accuracy. The run time and number of generations are utilized to compare the efficiency of the algorithms. Moreover, the estimated source characteristics are compared with the actual properties in order to compare algorithms in terms of accuracy. To examine the capability of both models in terms of accuracy in estimating source fluxes, the normalized absolute error of estimation (NAEE%) is estimated using Eq. (6) :
where N is the number of stress periods and and are the estimated and actual source fluxes for stress period i, respectively.
Table 3 presents the SA and ASA optimization parameters. Every iteration of SA‐ and ASA‐based methods uses one run of the groundwater transport simulation model (MT3DMS). Irrespective of the optimization algorithm, the execution time for one transport simulation run depends on the computation platform. In order to have a comparison between methods, independent of the utilized computation platforms, both methods are compared based on the number of simulation runs which is directly proportional to the computational time.
|Error tolerance for termination||–||0.01|
|Objective function multiplier||–||100|
|Lower bound for source fluxes||g/s||0|
|Lower bound for source fluxes||g/s||100|
|Initial source fluxes||g/s||0|
In real‐life groundwater contaminant source identification problems, the source characteristics are unknown. Therefore, there is no information available to measure the accuracy of linked simulation‐optimization methods. The accuracy and efficiency of SA depend to a large extent on the selected SA optimization parameters. However, due to unknown source characteristics, sensitivity analysis and tuning SA parameters are not possible or very difficult. To compare SA‐ and ASA‐based methods, SA with initial temperature (T) 1.0E8 and temperature reduction factor (TR) 0.5 is selected. Figures 3 and 4 compare the estimated against actual fluxes for sources 1 and 3, respectively. NAEE% of the estimated fluxes using ASA and SA models (T = 1.0E8, RT = 0.5) is 22.5 and 75%, respectively. Both methods identified the dummy source (not an actual source but introduced as a potential source for evaluation purpose). As shown in Figures 3 and 4 , this particular set of SA parameters represents one particular SA solution highlighting the comparative inefficiency of SA.
Figure 5 shows the convergence profiles for both SA‐ (T = 1.0E8, RT = 0.5) and ASA‐based models. Minimum value of the objective function achieved is plotted against number of the transport process numerical simulation models. Figure 5 shows that ASA converges faster to the smaller objective function values (in the minimization problem), compared to the utilized SA model. Although, as Figure 5 shows, the SA‐based model converges to very small objective function values, the corresponding estimated NAEE% (75%) is large. This shows that SA‐based solutions seem to get trapped in a local optimum and did not find or get close to the global optimum. This may be due to the nonunique nature of the local optimal solutions as well, that is, the obtained solution matches the observed and simulated concentrations for a different set of sources not representing actual sources. The objective function is very small, even though the accuracy of source estimates is very poor. Figure 5 shows that the objective function improvement rate decreases after 10,000 simulation runs. Next, sensitivity analysis is conducted to find suitable SA parameters. A set of 10,000 simulation runs is selected as maximum number of simulation runs.
For the performance evaluation purposes, a sensitivity analysis is performed to find suitable SA parameters. This sensitivity analysis, so‐called artificial, is not possible in real‐life contamination problems. In this chapter, illustrative example study area is selected with synthetic aquifer data. Therefore, SA parameters are tuned by comparing estimated and actual release fluxes. This step is not possible in real‐life scenarios where the source release fluxes are unknown. These evaluation results only show that SA can deliver solution results comparable to the ASA solutions if the SA parameters could be optimally tuned based on sequential matching of desired and obtained solutions, an impractical scenario.
Two parameters, initial temperature and temperature reduction factor, in order to find the sensitivity of SA models to the optimization parameters. Since the objective of this chapter is to compare the performance of SA‐ and ASA‐based models, an initial execution of ASA is used to find desirable number of simulation runs. Using this initial model execution, 10,000 simulation runs are selected. Figure 6 shows the resulted NAEE% using different SA parameters with the same number of maximum simulation runs (10,000). The least NAEE% is associated with 1000 as initial temperature (T) and 0.3 as the temperature reduction factor (TR).
In order to compare the accuracy and convergence of the tune SA‐ and ASA‐based models, both models are executed with unconstrained time of run. Error tolerance of estimation is set as 0.01 for both methods. NAEE% of the estimated fluxes using ASA and tuned SA (T = 1.0E3, RT = 0.3) models is 22.5 and 16.5%, respectively.
It can be inferred from the results that both methods are able to correctly identify the dummy source. A zero or near zero estimation of the dummy source implies correct identification. Solution results obtained using tuned SA parameters recovered source 1 release fluxes more accurately compared to the other SA‐based solutions. Solutions obtained utilizing the ASA algorithm recovered source 3 release fluxes more accurately. Apparently, both methods resulted in relatively similar accuracy in recovering source release fluxes in terms of location and magnitudes. The tuned SA method solution results are slightly superior to ASA method considering accuracy, although the errors are similar in magnitude. As shown in Figure 6 , all other sets of SA parameters (the ones tested in sensitivity analysis process) resulted in higher NAEE% compared with ASA method (25%). This shows that the performance of the SA‐based method is highly reliant on the selection of its parameters which limits its applicability in real‐life scenarios.
It can be argued that the poor performance of various SA‐based models reported in Figure 6 may have resulted from limited number of simulation runs. The purpose of this chapter is to compare ASA‐ and SA‐based models based on both accuracy and convergence speed. Therefore, improved accuracy with the cost of relatively larger execution times is beyond the scope of this chapter. However, to give an insight to the readers, the SA model with T = 1.0E10 and RT = 0.9 is tested. The NAEE% associated with the estimated fluxes is 0.65 which was achieved after 103,876 simulation runs. Figure 7 shows the T values and corresponding minimum objective function values.
Therefore, longer execution times can improve the results using non‐tuned SA‐based model. Figure 7 shows that faster convergence happens when T reaches cooler states. This potentially demonstrates that very high T values would not have substantial positive effect on finding the optimal values. More rigorous studies are required to make a definitive conclusion about the SA parameters optimization without any time (computational costs) constraints.
Characterization of unknown sources of groundwater pollution, especially when field concentration measurement data are sparse and arbitrary, remains a challenging problem. The linked simulation‐optimization method to solve the inverse problem of unknown groundwater contamination source characterization problem is utilized. The performance of two evolutionary optimization algorithms, SA and ASA, in terms of accuracy and convergence speed is evaluated. It is applied to an illustrative contaminated aquifer study area. Evaluation results show that suitable SA parameters need to be selected based on the nature of the problem to be optimized. Performance evaluation shows that the ASA‐based method estimates the source release fluxes more accurately and convergences to a smaller objective function value (in a minimization problem) faster than non‐tuned SA‐based method.
In this limited study, an illustrative study area was selected with synthetic hydrogeology and contamination data specified for evaluation of the solution results. Therefore, just for comparison and performance evaluation purpose, SA parameters are tuned by comparing estimated and actual release fluxes. This practice is not possible in real‐life scenarios where the source release fluxes are unknown. Without synthetic data, and simulation results to represent field concentration measurements, such tuning with prior knowledge will be impossible. Therefore, in real‐life scenarios, the SA performance cannot be controlled by tuned SA parameters. This limits the application of SA‐based models. The non‐tuned SA model converged to poor results even with unconstrained computational time. Results demonstrated that non‐tuned SA might not converge to near optimum results, even with a large number of iterations, and it may be trapped in the vicinity of local optimal solutions. This is due to nonunique nature of the groundwater contamination source characterization problems. Since tuning SA parameters in real‐life scenarios is hard or impossible, utilizing SA may lead to wrong source flux estimation. The wrong estimation of source fluxes is not verifiable in real life and may lead to wrong decisions about management and remediation of the contaminated area.
The solution results obtained by an SA‐based model with tuned parameters and solutions obtained by ASA‐based model show that they have relatively similar performance concerning both accuracy and convergence speed. Moreover, the need to tune SA parameters will substantially limit its application in groundwater source identification problems. The tuning trial and error process increases the total computational costs of the linked simulation‐optimization process. Therefore, the application of ASA in linked simulation‐optimization‐based groundwater source identification models results in substantial savings in computational time and potentially results in more accurate results. This inference is critical for designing, effective, and efficient contaminated aquifer remediation strategies which are often very costly, and cost of failed remediation strategies, caused by inaccurate characterization of unknown contamination sources, can be enormous. ASA is shown to provide a reliable and an acceptable alternative for solving this challenging aquifer contamination problem.