
## Abstract

Uncertainties, such as the variability of soil parameters, are often encountered in embankment dams. Probabilistic analyses can rationally account for these uncertainties and provide complementary information (e.g., the failure probability and the mean/variance of a model response) beyond what deterministic analyses offer. This chapter introduces a practical framework, based on surrogate modeling, for efficiently performing such probabilistic analyses. An active learning process is used in the surrogate model construction. The framework includes two assessment stages, which model the soil variability with random variables (RV) and random fields (RF), respectively. In the first stage, a surrogate model is coupled with three probabilistic methods in the RV context in order to provide a variety of useful results with an acceptable computational effort. The soil spatial variability is then considered by introducing RFs in the second stage, which enables a further verification of the structure reliability. The introduced framework is applied to an embankment dam stability problem. The obtained results are validated by a comparison with direct Monte Carlo simulations, which also highlights the efficiency of the employed methods.

### Keywords

- embankment dam
- slope stability
- reliability analysis
- sensitivity analysis
- random field

## 1. Introduction

According to the International Commission on Large Dams (ICOLD) database updated in September 2019 [1], there are around 58,000 large dams (higher than 15 m) in the world and 75% of them can be classified as embankment dams. If all constructed dams are counted, the number is much larger. For example, over 91,460 dams were operated across the United States in 2019 [2], the majority being rock-filled or earth-filled ones. Therefore, the safety assessment of embankment dams is crucial for engineers, considering their large number and the considerable damage that can be induced by their failure. However, embankment dams involve a high degree of uncertainty, especially in their material properties [3], since they are constructed with natural materials (soils, sands, or rocks), which makes their safety evaluation a difficult task. Probabilistic analysis [4] is an effective solution that rationally accounts for the soil variabilities and quantifies their effects on the dam safety condition by using a reliability method or a sensitivity method. Additionally, complementary results [5] can be provided by a probabilistic analysis compared to a traditional deterministic assessment, including the failure probability ($P_f$) and the statistical moments (e.g., mean and variance) of a model response.

In a probabilistic analysis, uncertainties of soil properties can be represented by random variables (RVs) or random fields (RFs) [4]. The former is simpler and easier to couple with a deterministic model [4]. In the RV approach, the soil is assumed to be homogeneous within one simulation, but different values are generated across simulations for one soil property according to a given distribution. Therefore, the RV method cannot explicitly account for the soil spatial variability. On the contrary, the RF approach can model the spatial variation of soils. For a soil property in one simulation, one RF, meaning a collection of different values over a discretized grid, is generated according to the soil parameter statistics and a given autocorrelation structure. However, this approach is more complex and needs extra computational effort (e.g., quantification of the autocorrelation distances and generation of RFs) compared to the RV one. Figure 2 illustrates the basic idea of the two approaches.

In this chapter, a practical framework is proposed to efficiently perform the probabilistic analysis of embankment dams. The RV and RF approaches are both implemented in the framework, corresponding to two assessment stages. The RV approach permits a quick estimate of the target results (e.g., the $P_f$, the statistical moments of the model response, and the sensitivity of each input variable), while the RF approach then allows a further verification that accounts for the soil spatial variability.

## 2. Presentation of the used probabilistic analysis methods

This section aims to briefly present the probabilistic analysis methods used in the proposed framework including two reliability methods (MCS and FORM), a surrogate modeling technique (PCE), a global sensitivity analysis method (Sobol) and a RF generation approach (KLE).

### 2.1 Monte Carlo simulations (MCS)

The MCS offers a robust and simple way to estimate the distribution of a random model response and assess the associated failure probability. It consists in generating a large number of samples of the input variables according to their joint distribution, evaluating the deterministic model for each sample, and counting the proportion of samples that lead to failure:

$$P_f \approx \frac{1}{N_{MCS}} \sum_{i=1}^{N_{MCS}} I\left(G(\mathbf{x}_i) \le 0\right)$$

where $N_{MCS}$ is the number of samples, $\mathbf{x}_i$ is the $i$-th input sample, $G$ is the performance function (negative or zero values indicate failure), and $I(\cdot)$ is the indicator function, equal to 1 in case of failure and 0 otherwise.

It is important to mention that the accuracy of the MCS estimate depends on the sample size: the coefficient of variation of the estimated $P_f$ is approximately $\sqrt{(1-P_f)/(N_{MCS}\,P_f)}$, so a small $P_f$ requires a very large number of model evaluations.
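The MCS estimator can be sketched in a few lines. The following minimal example uses a purely hypothetical, inexpensive FoS function standing in for a real stability model, with illustrative distribution parameters:

```python
import numpy as np

rng = np.random.default_rng(0)

def factor_of_safety(c, phi_deg):
    # Hypothetical, inexpensive stand-in for a deterministic stability model
    return 0.04 * c + 0.025 * phi_deg

N = 100_000
c = rng.lognormal(mean=np.log(8.9), sigma=0.3, size=N)   # cohesion samples
phi_deg = rng.normal(34.8, 3.5, size=N)                  # friction angle samples

fos = factor_of_safety(c, phi_deg)
pf = np.mean(fos <= 1.0)                 # MCS estimator of P_f (failure: FoS <= 1)
cov_pf = np.sqrt((1.0 - pf) / (N * pf))  # CoV of the P_f estimate
```

In a real analysis, `factor_of_safety` would be replaced by a call to the deterministic model, which is precisely why surrogate modeling is needed when each evaluation is expensive.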

### 2.2 First order reliability method (FORM)

The FORM estimates the $P_f$ by means of a reliability index $\beta$. The input variables are first transformed into the standard normal space, in which $\beta$ is defined as the distance between the origin and the design point (the most probable failure point):

$$\beta = \min_{G(\mathbf{u}) \le 0} \, \lVert \mathbf{u} \rVert$$

where $\mathbf{u}$ is the transformed input vector and $G$ is the limit state function separating the safe and failure domains. The $P_f$ is then approximated by:

$$P_f \approx \Phi(-\beta)$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution. This approximation is exact for a linear limit state surface and remains acceptable when the surface is only mildly nonlinear around the design point.
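For a limit state already expressed in the standard normal space, the design point can be searched iteratively, for example with the classical Hasofer-Lind/Rackwitz-Fiessler recursion. Below is a minimal sketch with an assumed linear limit state $G(\mathbf{u}) = 3 - u_1 - 0.5\,u_2$, chosen only so that the exact answer $\beta = 3/\sqrt{1.25}$ is known:

```python
import numpy as np
from math import erf, sqrt

a = np.array([1.0, 0.5])   # assumed (illustrative) linear limit state:
b = 3.0                    # G(u) = b - a @ u, failure when G(u) <= 0

def G(u):
    return b - a @ u

u = np.zeros(2)
for _ in range(20):
    grad = -a                                   # gradient of G (constant here)
    # HL-RF update: project onto the limit state surface along the gradient
    u = (grad @ u - G(u)) / (grad @ grad) * grad

beta = np.linalg.norm(u)                        # reliability index
pf = 0.5 * (1.0 - erf(beta / sqrt(2.0)))        # P_f ~ Phi(-beta)
```

For this linear case the recursion converges in a single step; nonlinear limit states require recomputing the gradient at each iterate.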

### 2.3 Polynomial Chaos expansions (PCE)

The PCE is a powerful and efficient metamodeling tool which consists in building a surrogate of a complex computational model. It approximates the model response $Y$ by a series of multivariate polynomials that are orthonormal with respect to the joint distribution of the input RVs:

$$Y \approx \sum_{\boldsymbol{\alpha} \in \mathcal{A}} a_{\boldsymbol{\alpha}} \Psi_{\boldsymbol{\alpha}}(\mathbf{X})$$

where $\mathbf{X}$ is the input random vector, $\Psi_{\boldsymbol{\alpha}}$ are the multivariate orthonormal polynomials indexed by $\boldsymbol{\alpha}$, $a_{\boldsymbol{\alpha}}$ are the unknown coefficients, and $\mathcal{A}$ is the truncated index set. The coefficients can be computed from a limited number of model evaluations (the design of experiments, DoE), for example by least-squares regression.

In order to further reduce the number of unknown coefficients, and thus the number of required model evaluations, a sparse PCE (SPCE) can be built by retaining only the most significant polynomial terms [9].
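The expansion above can be illustrated in one dimension: for a standard normal input, probabilists' Hermite polynomials are the orthonormal basis, and the coefficients can be fitted by least-squares regression on a small DoE. The model below is a hypothetical stand-in for an expensive computation:

```python
import numpy as np
from numpy.polynomial.hermite_e import hermevander

rng = np.random.default_rng(1)

def model(x):
    # Hypothetical expensive model of a standard normal input
    return np.exp(0.3 * x) + 0.1 * x ** 2

X = rng.standard_normal(200)       # experimental design (DoE)
Y = model(X)                       # model responses

p = 5                              # truncation order of the expansion
Psi = hermevander(X, p)            # (200, p+1) matrix of Hermite polynomials
coeffs, *_ = np.linalg.lstsq(Psi, Y, rcond=None)  # least-squares coefficients

x_new = np.linspace(-2.0, 2.0, 5)
y_hat = hermevander(x_new, p) @ coeffs            # cheap surrogate prediction
```

Once fitted, the surrogate can be evaluated millions of times at negligible cost, which is what makes the SPCE-aided MCS of the following sections tractable.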

### 2.4 Sobol-based global sensitivity analysis (GSA)

The GSA aims to evaluate the sensitivity of a Quantity of Interest (QoI) with respect to each RV over its entire variation range. Among the many methods for performing a GSA, the Sobol indices have received much attention since they give accurate results for most models [8]. The Sobol-based GSA relies on the variance decomposition of the model output $Y$. The first-order Sobol index is given as:

$$S_i = \frac{\mathrm{Var}\left[\mathbb{E}(Y \mid X_i)\right]}{\mathrm{Var}(Y)}$$

where $\mathbb{E}(Y \mid X_i)$ is the conditional expectation of $Y$ when $X_i$ is fixed; $S_i$ thus measures the contribution of $X_i$ alone to the variance of $Y$. The total-effect index additionally accounts for the interactions of $X_i$ with the other variables:

$$S_{Ti} = 1 - \frac{\mathrm{Var}\left[\mathbb{E}(Y \mid \mathbf{X}_{\sim i})\right]}{\mathrm{Var}(Y)}$$

where $\mathbf{X}_{\sim i}$ denotes all the input variables except $X_i$.

It is noted that the Sobol indices are only valid for independent variables. In order to properly account for the input correlation effect, the Kucherenko indices [5], which are also based on the variance decomposition, can be employed. For the estimation of the Sobol or Kucherenko indices (first order and total effect), the traditional way is to use the idea of MCS; however, it requires a high number of model evaluations, which is another motivation for using a surrogate model.
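The MCS-based (pick-freeze) estimation mentioned above can be sketched as follows for a toy additive model whose exact first-order indices are $S_1 = 0.2$ and $S_2 = 0.8$ (an illustrative example, not the dam model):

```python
import numpy as np

rng = np.random.default_rng(2)

def model(x):
    # Toy additive model: Var(Y) = 1 + 4 = 5, so S1 = 1/5 and S2 = 4/5
    return x[:, 0] + 2.0 * x[:, 1]

N, d = 200_000, 2
A = rng.standard_normal((N, d))     # two independent sample matrices
B = rng.standard_normal((N, d))
yA, yB = model(A), model(B)
var_y = yA.var()

S = np.empty(d)
for i in range(d):
    ABi = A.copy()
    ABi[:, i] = B[:, i]             # "freeze" column i from the second matrix
    # Jansen's pick-freeze estimator of the first-order index
    S[i] = 1.0 - 0.5 * np.mean((yB - model(ABi)) ** 2) / var_y
```

The cost is $(d+2) \times N$ model runs, which illustrates why such estimators are usually applied to a surrogate rather than to the original model.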

### 2.5 Karhunen-Loève expansions (KLE)

A random field (RF) can describe the spatial correlation of a material property between different locations and thus represent nonhomogeneous characteristics. The KLE, as a series expansion method, is widely used in geotechnical reliability analysis since it leads to the minimal number of RVs involved in an RF discretization [7]. In the KLE context, a stationary Gaussian RF $H(\mathbf{z}, \theta)$ with mean $\mu$ and standard deviation $\sigma$ can be approximated by a truncated series:

$$H(\mathbf{z}, \theta) \approx \mu + \sigma \sum_{i=1}^{M} \sqrt{\lambda_i} \, \phi_i(\mathbf{z}) \, \xi_i(\theta)$$

where $\mathbf{z}$ denotes the spatial coordinates, $\theta$ indicates the random nature of the field, $M$ is the truncation term number, $\xi_i(\theta)$ are independent standard normal RVs, and $\lambda_i$ and $\phi_i$ are respectively the eigenvalues and eigenfunctions of the autocorrelation function $\rho$, obtained by solving the Fredholm integral equation:

$$\int_{\Omega} \rho(\mathbf{z}, \mathbf{z}') \, \phi_i(\mathbf{z}') \, d\mathbf{z}' = \lambda_i \, \phi_i(\mathbf{z})$$

where $\Omega$ is the field domain. A non-Gaussian (e.g., lognormal) RF can then be obtained by an appropriate transformation of the Gaussian one.
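In practice, the Fredholm eigenproblem is often solved numerically on a discretized grid, where it reduces to the eigendecomposition of the autocorrelation matrix. A minimal 1-D sketch, with an assumed exponential autocorrelation function and illustrative parameters:

```python
import numpy as np

rng = np.random.default_rng(3)

# 1-D grid and (assumed) exponential autocorrelation with distance theta
z = np.linspace(0.0, 20.0, 200)
theta = 8.0
rho = np.exp(-np.abs(z[:, None] - z[None, :]) / theta)

# Discrete analogue of the Fredholm eigenproblem
eigval, eigvec = np.linalg.eigh(rho)
idx = np.argsort(eigval)[::-1]             # sort eigenpairs by decreasing value
eigval, eigvec = eigval[idx], eigvec[:, idx]

# Keep M terms covering, say, 95% of the variance
M = int(np.searchsorted(np.cumsum(eigval) / eigval.sum(), 0.95) + 1)

# One realization of the truncated KLE (illustrative mean and std)
mu, sigma = 8.9, 2.7
xi = rng.standard_normal(M)                # independent standard normal RVs
field = mu + sigma * (eigvec[:, :M] * np.sqrt(eigval[:M])) @ xi
```

Each new draw of `xi` yields a new realization of the field; a larger autocorrelation distance concentrates the variance in fewer eigenmodes and thus reduces $M$.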

## 3. The introduced framework

This section presents the introduced framework for the probabilistic analysis of embankment dams. A flowchart of the framework is given in Figure 3.

At the beginning, three elements should be prepared. Firstly, the distribution type and the related parameters (e.g., mean and variance) of the concerned material properties have to be determined. This allows describing their uncertainties by means of RVs. The selected material properties should be relevant to the QoI of the problem. If it is difficult to properly select the relevant properties, all the possible properties can be considered for the RV modeling; the Global Sensitivity Analysis (GSA) performed in the first stage then helps to understand the significance of each property, and with the GSA results one can select which properties will be modeled by RFs. The second task is to develop a deterministic computational model using numerical or analytical methods (e.g., the finite element method or the limit analysis method). The objective of this model is to estimate the QoI for a given set of input parameters. Thirdly, the autocorrelation structure of the concerned properties should be determined. This structure, defined by an autocorrelation function and the autocorrelation distances, describes the spatial correlation between different locations of a property and is a key element in the generation of RFs. After these three preparation steps, the analyses of the two stages can be performed. It should be noted that the focus of this chapter is to show the benefits of a probabilistic analysis and demonstrate the proposed framework. Concerning the rational determination of the distribution parameters and the autocorrelation structure from available measurements, readers can refer to [10, 11].

The objective of the first stage is to provide a variety of probabilistic results with an acceptable computational burden. These results can help to analyze the current problem in a preliminary design phase and to guide the subsequent site investigation program and the next design/assessment phase. In this stage, the RV approach is used to represent the input uncertainties. It allows quickly obtaining a first view of the target results, given that this approach can be easily coupled with any deterministic model and any probabilistic analysis method. Three analyses are performed in this stage by using respectively three techniques: two reliability methods (MCS and FORM) and one sensitivity method (the Sobol-based GSA). The MCS is considered as a reference method to evaluate other reliability methods due to its robustness; it is thus conducted here in order to obtain an accurate estimate of the $P_f$. Besides, an active learning process [12], given in Figure 4, is used to construct the SPCE model. This process starts with an initial DoE and gradually enriches it by adding new samples; a new SPCE model is created each time the DoE is updated. The process is stopped when predefined criteria are satisfied. This algorithm is more efficient than a metamodel training based on a single fixed DoE and can give accurate estimates of the target results with fewer calls to the deterministic model.

The second stage aims to consider the spatial variability of the concerned properties, which is ignored in the previous stage. It can thus provide a more precise $P_f$ estimate for the studied problem.

The following remarks detail some specific features of Figures 3 and 4.

Remarks on Figure 3:

1. The procedure of Figure 4 is followed to create an SPCE model.
2. For independent RVs, the Sobol index is used; for correlated RVs, the Kucherenko index is used.
3. The RFs are generated by the KLE; conditional RFs can be used if the measurements' locations are known.
4. The SIR is used to reduce the input dimension; an SPCE is created in the reduced space.

Remarks on Figure 4:

a. The Latin Hypercube Sampling (LHS) is used for generating the initial DoE samples.
b. There is no standard value to check if a problem is high-dimensional or not. The SPCE model can easily handle 10–20 RVs; it is thus better to reduce the dimension if the number of input RVs clearly exceeds this range.
c. An important parameter in the SIR is the slice number Nsir: 10 ≤ Nsir ≤ 20 for cases with several hundred RVs [9], and 20 ≤ Nsir ≤ 30 when the number of input RVs is several thousand.
d. The algorithm presented in [9] is used to create an SPCE; the SPCE optimal order is determined by testing within a range.
e. Stopping condition 1 measures if the accuracy indicator of the SPCE reaches a prescribed threshold.
f. Stopping condition 2 evaluates the convergence of the estimated $P_f$ over successive iterations.
h. An MCS population is generated using the LHS as a candidate pool.
i. The DoE is updated by adding the new samples and their model responses.

## 4. Application to an embankment dam example

This section shows an application of the proposed framework to an embankment dam stability problem. The dam initially proposed and studied in [5, 13] is selected for this application.

### 4.1 Presentation of the studied dam and deterministic model

The studied dam is shown in Figure 5. It has a crest width of 10 m and a horizontal filter drain installed at the toe of the downstream slope. The soil is assumed to follow a linear elastic, perfectly plastic behavior characterized by the Mohr-Coulomb shear failure criterion. In this work, the dam stability is analyzed considering a constant water level of 11.88 m and a saturated flow. Additionally, a horizontal pseudo-static acceleration of 2.16 m/s^{2} toward the downstream side is applied to the dam body. This value represents a relatively high seismic loading and is determined by referring to the recommendations given in [14] for a dam of category A with a soil of type B.

Concerning the input uncertainty modeling, three soil properties (the density, the cohesion, and the friction angle) are considered as uncertain; their statistics are summarized in Table 1, including a negative cross-correlation between the cohesion and the friction angle.

| Soil property | Distribution | Mean | CoV | Cross-correlation coefficient | Horizontal autocorrelation distance (m) | Vertical autocorrelation distance (m) |
|---|---|---|---|---|---|---|
| Density (kN/m^{3}) | Lognormal | 19.8 | 5% | | 40 | 8 |
| Cohesion (kPa) | Lognormal | 8.9 | 30% | −0.3 (with friction angle) | 40 | 8 |
| Friction angle (°) | Lognormal | 34.8 | 10% | | 40 | 8 |

The deterministic model used in this work for estimating the dam FoS is developed following the idea of [13]. It combines three techniques: the Morgenstern-Price Method (MPM), a Genetic Algorithm (GA), and a non-circular slip surface generation method. The MPM computes the FoS of a given failure surface; the GA locates the most critical slip surface (i.e., the one with the minimum FoS) by solving an optimization problem; and the use of non-circular slip surfaces leads to more rational failure mechanisms for non-homogeneous soils. The principle of the model is to first generate a number of trial slip surfaces as an initial population, and then to determine the minimum FoS by simulating a natural evolution process over generations, including reproduction, crossover, mutation, and survivor selection. The distribution of the pore water pressures inside the dam is given by a numerical model [5]. In this work, the developed deterministic model is termed LEM-GA. Using a simplified deterministic model (e.g., LEM-GA) is beneficial for a reliability analysis since it reduces the total computational time. This strategy can thus be adopted in a preliminary design/assessment phase for efficiently obtaining first results. A sophisticated model (e.g., a finite element model) is then required in a later phase if complex conditions must be modeled (e.g., rapid drawdown and unsaturated flows) or if multiple model responses (e.g., settlement and flow rate) are needed.
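The GA loop described above can be sketched as follows. Here the FoS evaluation is replaced by a hypothetical smooth function of the slip-surface parameters (the real model would call the MPM for each trial surface), and the operators (tournament selection, arithmetic crossover, Gaussian mutation, elitist survival) are illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(4)

def fos(surface):
    # Hypothetical stand-in for the MPM: returns the FoS of a trial
    # non-circular slip surface described by its parameter vector.
    return 1.2 + np.sum((surface - 0.3) ** 2)

n_par, pop_size, n_gen = 6, 40, 60
pop = rng.uniform(0.0, 1.0, (pop_size, n_par))      # initial trial surfaces

for _ in range(n_gen):
    fit = np.array([fos(ind) for ind in pop])
    # tournament selection (lower FoS = fitter, since we minimize)
    i, j = rng.integers(pop_size, size=(2, pop_size))
    parents = np.where((fit[i] < fit[j])[:, None], pop[i], pop[j])
    # arithmetic crossover between pairs of parents
    w = rng.uniform(size=(pop_size, 1))
    children = w * parents + (1.0 - w) * parents[::-1]
    # Gaussian mutation
    children += 0.05 * rng.standard_normal(children.shape)
    # elitist survivor selection over parents and children
    both = np.vstack([pop, children])
    keep = np.argsort([fos(ind) for ind in both])[:pop_size]
    pop = both[keep]

min_fos = fos(pop[0])     # minimum FoS found by the GA
```

Elitism guarantees that the best surface found so far is never lost, which is what makes the search reliably approach the critical slip surface.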

### 4.2 First stage: RV approach

This section shows the work conducted at the first stage of the proposed framework and presents the obtained results. The RV approach is used in order to obtain a quick estimate of the dam reliability and of the contribution of each input variable. The joint input PDF is defined by the means, CoVs, and cross-correlation given in Table 1.

Firstly, an SPCE surrogate model is constructed as an approximation of the LEM-GA model, following the procedure of Figure 4 with the required user-defined parameters (e.g., the initial DoE size and the stopping criteria thresholds).

Figure 6 presents the results provided by the SPCE-aided MCS with 10^{5} samples. With the obtained FoS values, the PDF and CDF of the FoS can be plotted. The PDF shows that, under the current calculation configuration, the dam FoS mainly varies between 1 and 1.6; the mean and standard deviation of the FoS are estimated directly from these samples.

In summary, this stage provides a first estimate of the dam $P_f$, the FoS statistics, and the sensitivity of each input variable, at an acceptable computational cost.

### 4.3 Second stage: RF approach

The second stage of the proposed framework considers the soil spatial variability by means of RFs in order to obtain a more precise $P_f$ estimate.

The first step in this stage is to determine the truncation term number $M$ of the KLE. It should be chosen such that the discretization error of the RF (e.g., in terms of the variance of the truncated series) remains acceptable, while keeping the number of involved RVs as small as possible.

The second step is to create an SPCE model to replace the LEM-GA model coupled with RFs. The active learning process of Figure 4 is followed for the SPCE training. Differently from the first stage, a Sliced Inverse Regression (SIR) is performed prior to the SPCE construction at each iteration with the current DoE. This is because the considered reliability analysis is a high-dimensional problem with 251 input RVs: directly training an SPCE on the original input space would require a large DoE and could lead to a less accurate metamodel. By performing an SIR with a slice number of 20, the input dimension is reduced from 251 to 19. It is then possible to create an SPCE model with respect to the 19 new RVs using an acceptable DoE size (e.g., several hundred samples). At the end, the obtained SPCE is a second-order model with a satisfactory accuracy indicator.
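The SIR step can be sketched on synthetic data: below, the response depends on a single linear combination of a 50-dimensional input, and SIR recovers that direction from the slice means of the inputs (illustrative dimensions, not the 251-RV dam problem):

```python
import numpy as np

rng = np.random.default_rng(5)

# Synthetic high-dimensional data: y depends on one direction only
n, d = 2000, 50
X = rng.standard_normal((n, d))
beta = np.zeros(d)
beta[0], beta[1] = 1.0, 0.5
y = np.tanh(X @ beta) + 0.01 * rng.standard_normal(n)

# --- Sliced Inverse Regression ---
n_slices = 20
Xc = X - X.mean(axis=0)              # center (X has unit covariance here,
                                     # so no extra whitening is needed)
order = np.argsort(y)                # slice the samples by sorted response
slices = np.array_split(order, n_slices)
means = np.array([Xc[s].mean(axis=0) for s in slices])
weights = np.array([len(s) for s in slices]) / n
M = (means * weights[:, None]).T @ means   # weighted covariance of slice means
eigval, eigvec = np.linalg.eigh(M)
direction = eigvec[:, -1]            # leading reduced direction
reduced = Xc @ direction             # 1-D reduced input for the SPCE
```

The surrogate is then built on `reduced` (or on the few leading directions) instead of the original 50 coordinates, which is the same dimension-reduction idea used for the 251 KLE variables.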

The last step is to perform an MCS with the determined SPCE model. The obtained results are presented in Figure 9. The dam FoS mainly varies between 1.1 and 1.5, with a mean of 1.276 and a standard deviation of 0.102. The dam $P_f$ is of the order of 10^{−4}. Compared to the analysis of the first stage, the current analysis leads to a clearly reduced $P_f$ estimate, which can be explained by the variance reduction of the FoS when the spatial variability is explicitly modeled.

### 4.4 Parametric analysis

In some cases, it is necessary to perform a series of parametric analyses in order to evaluate the effects of parameters that are difficult to quantify precisely due to a lack of measurements. The physical ranges recommended in the literature for the concerned parameters can be used to define the testing values. In the proposed framework, the computational burden of such parametric analyses is acceptable since the use of the SPCE model significantly reduces the time consumed by one probabilistic analysis. In this work, the effects of two parameters on the dam reliability are investigated: the cross-correlation between the soil shear strength parameters and the autocorrelation distances of the RFs.

Figure 10 presents the obtained results of the parametric analysis (cases 1A, 1B, and 1C) on the cross-correlation effect.

Figure 11 shows the results of the investigation on the autocorrelation distance effect.

## 5. Discussions

### 5.1 Validation of the surrogate-based results

The proposed framework relies on metamodeling to perform a probabilistic analysis. Its key element is therefore to create an accurate SPCE model which can faithfully replace the original computational model. In the next paragraph, two recommendations are given for good metamodeling practice.

Firstly, it is recommended to use a space-filling sampling technique (e.g., LHS) to generate the samples of the initial DoE and of the MCS candidate pool from the given PDF. This allows generating a set of samples which reasonably covers the input space; the LHS also converges faster than a purely random sampling technique in an MCS. Secondly, an active learning process, such as the one of Figure 4, is highly suggested for the SPCE construction. The process is stopped only if stable estimates of the target results are obtained, which indicates that adding further samples would not significantly change the metamodel.

Concerning the validation of the constructed surrogate model, three solutions are possible. The first one is to use the available results in the DoE to compute an accuracy indicator for the metamodel, such as a cross-validation-based error estimate. The second one is to evaluate the metamodel on a set of additional validation samples computed with the original model. The third one is to compare the surrogate-based probabilistic results with those given by a direct MCS performed on the original deterministic model.
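For an accuracy indicator of the first kind, the leave-one-out (LOO) error of a least-squares surrogate can be obtained in closed form from the hat matrix, without refitting the model $n$ times. A sketch with a hypothetical quadratic surrogate (the principle is the same for an SPCE):

```python
import numpy as np

rng = np.random.default_rng(6)

# DoE and responses of a hypothetical model
X = rng.uniform(-1.0, 1.0, (30, 2))
y = 1.0 + X[:, 0] + 0.5 * X[:, 1] ** 2 + 0.01 * rng.standard_normal(30)

def basis(X):
    # full quadratic basis in two variables
    return np.column_stack([np.ones(len(X)), X[:, 0], X[:, 1],
                            X[:, 0] ** 2, X[:, 1] ** 2, X[:, 0] * X[:, 1]])

A = basis(X)
coef, *_ = np.linalg.lstsq(A, y, rcond=None)
H = A @ np.linalg.solve(A.T @ A, A.T)      # hat matrix of the regression
res = y - A @ coef                         # ordinary residuals
loo = res / (1.0 - np.diag(H))             # closed-form LOO residuals
Q2 = 1.0 - np.mean(loo ** 2) / y.var()     # LOO-based accuracy indicator
```

A `Q2` close to 1 indicates that the surrogate predicts well at points it was not fitted on, which is a much stronger check than the ordinary residuals alone.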

In this section, the third validation solution is adopted since some parametric analyses are carried out and the employed deterministic model (LEM-GA) is not too time-consuming. Two cases (1B and 2A) are selected for the validation and are analyzed by a direct MCS; the direct-MCS results serve as reference values for evaluating the surrogate-based ones.

Table 4 gives a detailed comparison between the two methods in terms of the $P_f$ and the FoS statistics. A good agreement between the surrogate-based results and the direct-MCS ones is observed, while the SPCE-aided analyses require far fewer calls to the deterministic model, which confirms both the accuracy and the efficiency of the proposed framework.

### 5.2 Practical applications

This section provides a discussion on some issues of a probabilistic analysis. The objective is to help engineers to better implement the proposed framework into practical problems.

#### 5.2.1 Probabilistic analysis tool

Probabilistic analysis has received much attention in the literature during the last decade. However, it is still not widely applied in practical engineering problems. One major reason hindering its application in practice is the complexity of performing a probabilistic analysis, including understanding/programming a reliability method, generating RFs, and coupling them with a deterministic model. This problem has been addressed in recent years by the establishment of several probabilistic analysis tools. A variety of reliability/sensitivity methods are available in these tools and can be linked to a computational model developed in third-party software. Examples of such tools include UQLab in Matlab and OpenTURNS in Python. A review of structural reliability analysis tools can be found in [15]. Using a well-verified tool to perform the probabilistic analysis of practical engineering problems also avoids personal programming mistakes which could lead to inaccurate results.

#### 5.2.2 Reliability method selection

The proposed framework is based on the SPCE surrogate model. The SPCE is adopted since it has been widely and successfully used in many studies of geotechnical reliability analysis [9, 13, 16]. Some techniques have been proposed to couple the SPCE with RF modeling in an efficient way [17], so the SPCE can also handle high-dimensional stochastic problems. However, the proposed framework is not limited to the SPCE; it can be updated with another metamodeling technique (e.g., Kriging or Support Vector Machines) with some necessary modifications. Besides, for estimating a very low $P_f$ (e.g., 10^{−6}), the SPCE-aided MCS could be time-consuming, given that generating and operating a great number of samples (e.g., 10^{8}) requires a large amount of memory. To tackle this problem, the SPCE can be coupled with other reliability methods in order to alleviate the computational burden. The Subset Simulation (SS) [6, 18] is a good choice to replace the MCS in the above-mentioned case, because the SS performance is independent of the input dimension and of the limit state surface (LSS) complexity.

#### 5.2.3 Parameter selection

This chapter focuses on presenting the proposed framework and showing its application to a dam problem, so the soil variability modeling is not explained in detail. However, properly describing the soil uncertainties from a limited number of measurements is also an important element of a geotechnical probabilistic analysis in practice; some studies on this topic can be found in [10, 11]. In this chapter, the effects of two parameters that are difficult to quantify precisely (the cross-correlation between the shear strength parameters and the autocorrelation distances) are evaluated through a parametric analysis.

#### 5.2.4 Extension of the proposed framework

The illustrative example in this chapter is based on the stability problem of a homogeneous embankment dam. The proposed framework can easily be extended to the probabilistic analysis of other problems in dam engineering (rapid drawdown, erosion, and settlements) by using an appropriate deterministic model and properly determining the input uncertainties; the proposed RV and RF stages can then be conducted in the same hierarchical way. For embankment dams with an earth core or multiple soil layers, the uncertainties should be modeled separately for each zone using different RVs or RFs [17]. It is also important to consider the correlation between the variable properties of different zones by analyzing the available measurements. When the data are insufficient, a parametric analysis is recommended in order to assess the effect of the unknown correlation structure. As embankment dams are artificial rock-filled or earth-filled structures constructed under careful control, uncertainties at the zone boundaries can be considered negligible. In natural soils, where stratigraphic boundary uncertainties are expected, the related effects will be noticeable and should be considered.

## 6. Conclusion

This chapter introduces a framework for the probabilistic analysis of embankment dams; the proposed methodology can also be used for other geotechnical structures. The RV and RF approaches are both considered in the framework, corresponding to two probabilistic analysis stages. In the first stage, the RV approach is used with three probabilistic techniques (MCS, FORM, and GSA) in order to efficiently provide multiple results which can be beneficial for evaluating a design and guiding a further site investigation or analysis. The second stage introduces RFs for the purpose of accounting for the soil spatial variability and giving a more precise $P_f$ estimate. The application of the framework to an embankment dam stability problem demonstrates its feasibility and efficiency.