Open access peer-reviewed chapter

Flooding Fragility Model Development Using Bayesian Regression

Written By

Alison Wells and Chad L. Pope

Submitted: 16 July 2021 Reviewed: 19 July 2021 Published: 23 August 2021

DOI: 10.5772/intechopen.99556

From the Edited Volume

The Monte Carlo Methods - Recent Advances, New Perspectives and Applications

Edited by Abdo Abou Jaoudé


Abstract

Traditional component pass/fail design analysis and testing protocol drives excessively conservative operating limits and setpoints as well as unnecessarily large margins of safety. Component performance testing coupled with failure probability model development can support selection of more flexible operating limits and setpoints as well as softening defense-in-depth elements. This chapter discusses the process of Bayesian regression fragility model development using Markov Chain Monte Carlo methods and model checking protocol using three types of Bayesian p-values. The chapter also discusses application of the model development and testing techniques through component flooding performance experiments associated with industrial steel doors being subjected to a rising water scenario. These component tests yield the necessary data for fragility model development while providing insight into development of testing protocol that will yield meaningful data for fragility model development. Finally, the chapter discusses development and selection of a fragility model for industrial steel door performance when subjected to a water-rising scenario.

Keywords

  • fragility model development
  • Bayesian regression
  • Markov Chain Monte Carlo
  • fragility model checking
  • Bayesian p-value

1. Introduction

Traditional component pass/fail design analysis and testing protocol drives excessively conservative operating limits and setpoints as well as unnecessarily large margins of safety. Additionally, pass/fail testing tends to result in data shortcomings which must then be addressed using defense-in-depth elements. In contrast, component performance testing and failure probability model development can support selection of more flexible operating limits and setpoints as well as softening of defense-in-depth elements. The two major obstacles involved in developing a failure probability model, also known as a fragility model, are devising an optimum component performance testing protocol so that meaningful data can be collected, and navigating the process of developing and testing an appropriate fragility model.

This chapter will first discuss the process of Bayesian regression fragility model development, which includes model checking protocol. The foundation of fragility model development is Bayesian in nature where both data and parameters have probability distributions, and we seek a model that establishes a relationship between parameters and observables, ultimately yielding a posterior probability distribution. That is, the Bayesian method requires an aleatory model, a prior distribution for the parameters of the aleatory model, and data associated with the aleatory model. Then, using Bayes' Theorem, the posterior distribution for the model output can be obtained using Markov Chain Monte Carlo (MCMC) methods to address complicated integration. Multiple models are then developed, and a rigorous process is used to check model validity to help identify the most appropriate model. The model checking and comparison process uses multiple techniques including three types of Bayesian p-values.

With a firm foundation for fragility model development, checking, and selection established, the chapter then discusses component flooding performance experiments associated with industrial steel doors subjected to a rising water scenario. These component tests yield the necessary data for fragility model development while providing insight into development of testing protocol that will yield meaningful data for fragility model development. Finally, the chapter discusses the development and selection of a fragility model for industrial steel door performance when subjected to a rising water flood scenario.


2. Bayesian data analysis

Significant experience exists with fragility modeling focused on seismic fragility model determination. In a seismic fragility model, the single vertical ground acceleration variable is used to completely characterize the failure probability of structures or components of interest. However, other observable parameters may be important indicators for the potential of failure. Expanding upon the seismic example, these observables could include the detailed characteristics of the earthquake such as X, Y, and Z components of the ground motion; frequency of the waves; the age of the component; the anchorage of the component; the specifics of the component type; or any combination of the above.

Limitations of these traditional fragility models include a simplistic form (a single "driving" parameter) and excessive conservatism. For complex flooding fragility modeling requiring more observables, these issues are avoided by moving to a more flexible, data-informed approach: Bayesian fragility modeling through phenomena-driven regression modeling. As stated by Box and Tiao, "Bayesian inference alone seems to offer the possibility of sufficient flexibility to allow reaction to scientific complexity free from impediment from purely technical limitation." [1].

From the Bayesian perspective, both data and parameters can have probability distributions, and the task of Bayesian analysis is to build a model for the relationship between parameters (θ) and observables (y), and then calculate the posterior probability. The Bayesian method, therefore, relies on three items: an aleatory model, a prior distribution for the parameter(s) of the aleatory model, and data associated with the aleatory model. An aleatory model pertains to stochastic or non-deterministic events, the outcome of which is described using probability. The posterior distribution for the model output function is developed in accordance with Bayes’ Theorem [2], which is generally written as:

p(θ|y) = p(θ) p(y|θ) / p(y)          (1)

where,

  • p(θ|y): Posterior distribution, which is conditional upon the data (y) that are known and related to the hypothesis (θ);

  • p(θ): Prior distribution, representing knowledge of the hypothesis (θ) that is independent of the data (y);

  • p(y|θ): Likelihood, or aleatory model, representing the process or mechanism that provides the data (y);

  • p(y): Marginal distribution, which serves as a normalization constant.

In summary, the above equation takes our prior knowledge about the parameters, updates it with the likelihood of observing the data for particular parameter values, and gives the posterior probability. It essentially states: posterior ∝ prior × likelihood.

This process combines everything that is known about a particular data set and model response to produce a posterior estimate of the output function’s probability distribution.
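To make the update concrete, the following short Python sketch (illustrative only, not taken from the chapter) evaluates Eq. (1) on a discrete grid for an assumed failure-on-demand data set of 3 failures in 10 demands with a flat Beta(1,1) prior:

import numpy as np
from scipy import stats

# Assumed illustrative data: 3 failures observed in 10 demands
failures, demands = 3, 10

# Grid of candidate values for the failure probability p
p = np.linspace(0.001, 0.999, 999)

prior = stats.beta.pdf(p, 1, 1)                     # p(theta): flat prior
likelihood = stats.binom.pmf(failures, demands, p)  # p(y|theta): aleatory (binomial) model

posterior = prior * likelihood                      # numerator of Eq. (1)
posterior = posterior / posterior.sum()             # normalization (discrete stand-in for p(y))

print("Posterior mean of p:", round(float((p * posterior).sum()), 3))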

Integration of functions plays an important role in Bayesian statistical analysis; however, explicit evaluation of these integrals is only possible for a limited number of special cases. Usually, problems involve complex distributions for which explicit evaluation is not possible. Traditionally, statisticians would be forced to use numerical integration or analytical approximation techniques. However, there are now several powerful software programs for Bayesian inference. One of the most widely used by statistical practitioners is the BUGS (Bayesian inference Using Gibbs Sampling) family of programs, the most popular packages being WinBUGS and OpenBUGS. Several methods have been devised for constructing and sampling complex Bayesian posterior distributions; BUGS software utilizes MCMC methods to determine the posterior [3].

MCMC is a general method based on drawing random samples sequentially to approximate the posterior distribution p(θ|y). The distribution of each sampled parameter value depends only on the value from the previous step, forming a Markov chain [4]. Eventually the Markov chain converges to a unique stationary distribution, the posterior distribution. The key to the MCMC method is that the approximate distributions are improved at each step of the simulation, converging to the posterior distribution once the simulation has run long enough.
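As a rough illustration of the idea (a hand-rolled sketch, not the Gibbs sampling that BUGS performs), the following Python random-walk Metropolis sampler draws from the same assumed beta-binomial posterior; each proposal depends only on the current state, and the chain settles into the posterior distribution after a burn-in period:

import numpy as np

rng = np.random.default_rng(1)
failures, demands = 3, 10        # assumed illustrative data, as above

def log_posterior(p):
    if not 0.0 < p < 1.0:
        return -np.inf           # flat prior; zero mass outside (0, 1)
    return failures * np.log(p) + (demands - failures) * np.log(1.0 - p)

chain, p_current = [], 0.5
for _ in range(20000):
    p_proposed = p_current + rng.normal(scale=0.1)   # random-walk proposal from current state
    # Metropolis acceptance: accept with probability min(1, posterior ratio)
    if np.log(rng.uniform()) < log_posterior(p_proposed) - log_posterior(p_current):
        p_current = p_proposed
    chain.append(p_current)

draws = np.array(chain[5000:])   # discard burn-in before summarizing
print("Posterior mean of p:", round(draws.mean(), 3))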


3. Model checking and comparison

After constructing a probability model and computing posterior distributions for all estimated parameters, the next step of a Bayesian analysis includes checking that the model adequately represents the data and is plausible for the purpose for which the model will be used. There are multiple ways of assessing a model’s performance. The approach selected is posterior predictive checking, a useful direct way of assessing the fit of a model to various aspects of the data. Additionally, residual tests are used for informal model criticism and outlier identification.

Posterior predictive checks are a primary form of Bayesian model checking used to assess the fit of the model to various aspects of the data. The procedure is based upon the following assumption: if a given model fits, then data simulated or replicated under the model should be comparable to the real-world observed data the model was fitted to [4]. In other words, the observed data should be plausible under the posterior predictive distribution. If any systematic differences occur between simulations and the data, it potentially indicates that model assumptions are not being met.

The model is checked for deviations from an assumed parameter form by means of test quantities or discrepancy functions, T(y, θ), that depend on both data (y) and parameters (θ). A check is made whether T(y, θ) is compatible with the simulated distribution of T(y_sim, θ) by calculating a Bayesian p-value [4]. Regarding the choice of discrepancy functions, focus is given to diagnosing global lack of fit rather than discovering outliers, a task left to residual calculations. A summary of the candidate discrepancy functions considered is provided in Table 1. Note that, to avoid numerical errors for binomial models when p = 0 or 1, a small ε = 0.00001 is added in the expressions.

Name | Definition | Binomial Expression
χ² | T(y, θ) = Σᵢ (yᵢ − E(yᵢ|θ))² / Var(yᵢ|θ) | Σᵢ (yᵢ − npᵢ)² / (npᵢ(1 − pᵢ) + ε)
Likelihood ratio | T(y, θ) = 2 Σᵢ yᵢ log(yᵢ / E(yᵢ|θ)) | 2 Σᵢ yᵢ log((yᵢ + ε) / (npᵢ + ε))
Freeman-Tukey | T(y, θ) = Σᵢ (√yᵢ − √E(yᵢ|θ))² | Σᵢ (√yᵢ − √(npᵢ))²

Table 1.

Discrepancy functions used for model checking [5].

Note that ideally model checking should be based on new data, although in practice the same data is generally used for both developing and checking the model. This means that Bayesian p-values based on these checks tend to be conservative [3]. However, this does not imply that posterior predictive checks lack value. Given that the tests are conservative, small (less than 0.05) and large (greater than 0.95) p-values strongly suggest lack of fit, while p-values close to 0.5 indicate a high degree of predictive capability [2]. The concept of the Bayesian p-value is graphically represented in Figure 1.

Figure 1.

Depiction of the Bayesian p-value predictability.
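As a minimal sketch of how such a check can be computed outside of BUGS (illustrative Python with placeholder data and stand-in posterior draws, not the chapter's actual results), the chi-square discrepancy from Table 1 is evaluated for the observed data and for replicated data at each posterior draw, and the Bayesian p-value is the fraction of draws for which the replicated discrepancy exceeds the observed one:

import numpy as np

rng = np.random.default_rng(2)

# Placeholder pass/fail data (n = 1 demand per test) and stand-in posterior draws of p
y_obs = np.array([1, 1, 1, 0, 0, 1, 0, 0, 1, 0])
p_draws = rng.beta(5, 6, size=4000)

eps = 1e-5   # avoids division by zero when p is 0 or 1

def chisq(y, p):
    # Chi-square discrepancy T(y, theta) from Table 1 with n = 1
    return np.sum((y - p) ** 2 / (p * (1.0 - p) + eps))

exceed = 0
for p in p_draws:
    p_vec = np.full(y_obs.shape, p)
    y_rep = rng.binomial(1, p_vec)                # replicated data simulated under the model
    exceed += chisq(y_rep, p_vec) >= chisq(y_obs, p_vec)

print("Bayesian p-value (chi-square):", exceed / len(p_draws))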

Residuals measure the discrepancy between the observed data and an assumed model. Informal tests based on Pearson and deviance residuals can be used to identify obvious assumption violations. Note that these analyses are generally carried out informally in Bayesian applications, since all residuals depend on θ and have posterior distributions [6]; therefore, they are not truly independent as required for unbiased application of goodness-of-fit tests.

A standardized Pearson residual is defined as:

rᵢ = (yᵢ − E(yᵢ|θ)) / √Var(yᵢ|θ)          (2)

where E(yᵢ|θ) is the expected value and Var(yᵢ|θ) is the variance. Since the residual is considered a function of the random yᵢ for a fixed θ, Pearson residuals should generally take on values between −2.0 and 2.0 [6]. Values falling outside this range represent outliers.

Residuals can also be based on a saturated version of the deviance, defined as:

Ds(θ) = −2 log p(y|θ) + 2 log p(y|θ̂s(y))          (3)

where θ̂s(y) are the saturated estimates. For models where the saturated deviance is appropriate, such as Poisson and binomial models, a rule of thumb for a rough assessment of fit is that the mean saturated deviance should approximately equal the sample size n [3].
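The following Python fragment (a sketch with placeholder data and fitted probabilities, mirroring the residual and Ds terms that appear later in the OpenBUGS script of Table 6) computes the standardized Pearson residuals of Eq. (2) and the binomial saturated deviance for n = 1 demands per test:

import numpy as np

eps = 1e-5
# Placeholder pass/fail outcomes and fitted failure probabilities (n = 1 demand per test)
y = np.array([1, 1, 0, 0, 1, 0])
p = np.array([0.9, 0.7, 0.2, 0.4, 0.6, 0.1])

# Standardized Pearson residuals, Eq. (2); values outside (-2, 2) flag potential outliers
pearson = (y - p) / np.sqrt(p * (1.0 - p) + eps)

# Saturated deviance contributions; their mean over posterior draws should be near n for an adequate fit
Ds = 2 * (y * np.log((y + eps) / (p + eps))
          + (1 - y) * np.log((1 - y + eps) / (1 - p + eps)))

print("Pearson residuals:", np.round(pearson, 2))
print("Saturated deviance:", round(float(Ds.sum()), 2), "for sample size", len(y))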

Following model checking, comparisons can be made on the performance of alternative hypothesized models. It is not an uncommon occurrence for more than one probability model to provide an adequate fit to the data. These models may differ in prior specification, link function selection, or which explanatory variables are included in the regression, to name a few. Therefore, an analysis should not only examine models to see how they fail to fit reality but compare how sensitive the resulting posterior distributions are to arbitrary specifications using any number of model comparison or performance metrics.

There are a variety of Bayesian model comparison methods, including methods based on information criteria, which are measures of the relative fit. Deviance Information Criteria (DIC) is a measure of model fit that can be applied to Bayesian models and is applicable when the parameter estimation is done using techniques such as Gibbs sampling. It is particularly useful in Bayesian model selection problems where the posterior distributions of the model have been obtained by MCMC simulation. DIC is a generally straightforward computation, and no additional scripting is needed to calculate it in OpenBUGS, making it the comparison approach selected for this work.

As a rule of thumb, the model with the smallest DIC usually indicates the better-fitting model. Note, however, that only the differences in DIC between models are important, not their absolute values. While it is not easy to define what constitutes an important difference, the following rough guide can be used for DIC comparison [3]:

  • Differences greater than 10 can be used to rule out the model with the higher DIC.

  • Differences between 5 and 10 are substantial.

  • For differences less than 5, there is uncertainty about the choice of model. Other methods may need to be considered, especially if the models make different inferences.

Note that these considerations include negative values for the DIC, which occur in cases where the deviance is negative. It must also be noted that since DIC is a measure of relative fit, a model with the smallest DIC can still be a poor fit for the data [2].
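For reference, DIC can be assembled from MCMC output as the posterior mean deviance plus the effective number of parameters, pD. The sketch below (illustrative Python with stand-in posterior draws, not a reimplementation of the OpenBUGS calculation) shows the arithmetic:

import numpy as np

rng = np.random.default_rng(3)

y = np.array([1, 1, 0, 0, 1, 0])                       # placeholder pass/fail data
p_draws = rng.beta(4, 4, size=(4000, y.size))          # stand-in posterior draws of per-test p

def deviance(p, y):
    eps = 1e-12
    return -2.0 * np.sum(y * np.log(p + eps) + (1 - y) * np.log(1.0 - p + eps))

D_bar = np.mean([deviance(p, y) for p in p_draws])     # posterior mean deviance
D_hat = deviance(p_draws.mean(axis=0), y)              # deviance at the posterior mean of p
p_D = D_bar - D_hat                                    # effective number of parameters
DIC = D_bar + p_D

print(f"pD = {p_D:.2f}, DIC = {DIC:.2f}")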


4. Experiments

The objectives of component flooding experiments are to test individual component performance in flooding scenarios and acquire the necessary data to develop component fragility mathematical models. To conduct rising water experiments, the Portal Evaluation Tank (PET) was designed and built to facilitate testing.

The PET is a steel semi-cylindrical tank with a height and diameter of 8 ft. Its design includes a 62.4 ft² opening for installation of components to be tested, a front water tray with a 90-degree v-notch weir, and the ability to hold up to 2,000 gal of water. The PET is connected through 12 in. PVC pipes to a 60 HP pump, located inside an 8,000 gal water reservoir, to support variable inlet flow rates up to ~4,500 gpm. Additionally, once filled, the PET can rely on the pump and the pressure and air relief valves to provide hydrostatic head simulating depths up to 20 ft. The PET, along with piping, is shown in Figure 2.

Figure 2.

PET tank and piping.

Accompanying instrumentation and measurements included electromagnetic flowmeters for upstream and downstream flow rates and two pressure transducers for averaged water depths and temperature. The PET can also measure small leakage rates that do not exceed the v-notch weir barrier using an ultrasonic depth sensor. The top of the PET is also equipped with pressure and air relief valves and a digital pressure gauge to measure pressures for simulated hydrostatic head once the PET is filled.

The components tested were industrial steel doors oriented to swing outwards, away from the tank interior. A strengthened wall was built to support the doorframe, ensuring stability; the aim of these experiments was to test the door to failure, not the supporting wall structure. The experimental approach subjected each steel door to a rising water scenario until catastrophic failure of the door occurred or the leakage rate equalized with the filling rate. A compiled summary of the steel door results, including non-failure tests, is given in Table 2.

Test | Depth (in.) | Flow rate (gal/min) | Temperature (°F) | Notes
1S | 46.1 | 1148 | 67.4 |
2S | 39.0 | 1130 | 63.3 |
3S | 37.1 | 1120 | 63.1 |
4S | 37.8 | 979 | 63.0 |
5S | 37.5 | 1133 | 63.0 |
6S | 37.6 | 604 | 63.0 |
7S | 37.7 | 593 | 63.0 |
8S | 37.1 | 598 | 63.1 |
12S | 44.5 | 975 | 64.0 |
- | 25.7 | 248 | 61.6 | Non-Failure
- | 17.0 | 117 | 59.0 | Non-Failure
- | 27.4 | 285 | 59.3 | Non-Failure
- | 30.9 | 397 | 59.4 | Non-Failure
- | 32.3 | 484 | 59.6 | Non-Failure
- | 24.3 | 247 | 60.2 | Non-Failure
- | 34.8 | 593 | 60.7 | Non-Failure
- | 37.5 | 696 | 61.0 | Non-Failure
- | 38.0 | 734 | 61.2 | Non-Failure
13S | 41.4 | 1025 | 61.3 |

Table 2.

Steel door performance results [5].


5. Model development

Having conducted the flooding experiments and collected observational data on door failures, models were developed that analyzed the fragility of components using explanatory variables. An explanatory variable is a type of independent variable that is possibly predictive of a component's fragility in a regression analysis. For the probability of door failure during a flooding event, water depth, flow rate, and temperature may be leading indicators of failure, and information about these explanatory variables is incorporated into the Bayesian inference.

The mathematical modeling uses the discrete binomial distribution to represent failure of a door installed in the PET during a rising water flood event. This is a commonly used model for failure on demand with key parameter p, the probability of failure on demand, and number of trials n = 1 (only a single door is challenged during each test). The fragility model in this case considered seven possibilities: each variable alone driving failure, each combination of two variables, and all three variables together. These cases are modeled as:

Logit(p) = intercept + aD          (4a)
Logit(p) = intercept + bF          (4b)
Logit(p) = intercept + cT          (4c)
Logit(p) = intercept + aD + bF          (4d)
Logit(p) = intercept + aD + cT          (4e)
Logit(p) = intercept + bF + cT          (4f)
Logit(p) = intercept + aD + bF + cT          (4g)

where a, b, and c are the coefficients of the covariate parameters represented as D, F, and T for depth, flow rate, and temperature respectively. Since parameter p represents a probability, it must be constrained between 0 and 1 with a link function. The logit function was selected, which is defined as:

Logit(p) = ln(p / (1 − p))          (5)

While the logit function should transform the parameter p onto an appropriate scale, in practice this was not always true for the special case of n = 1. Occasionally, the sampler selects illogical or extreme values from the prior distribution. This can cause errors such as numerical overflow or, within the logistic regression, negative parameter values that cannot be log transformed. The improper value prompts a binomial calculation that OpenBUGS is unable to perform, causing the run to crash. It should also be noted that subtle differences between programs can resolve some of these problems; not all available programs use the same sampling approach, and a similar model setup in R or JAGS could run without additional considerations for the case of n = 1.

A robust solution focuses on the parameter that fails to meet specifications. The binomial probability of failure, p, needs to take on values between 0 and 1 for OpenBUGS to perform the calculation, as referenced earlier. This requirement can be achieved by restricting p using built-in scalar functions, max and min. They are defined and operate as follows:

  • max(e1, e2): e1 if e1 > e2; e2 otherwise,

  • min(e1, e2): e1 if e1 < e2; e2 otherwise.

For the probability of failure to be properly scaled, the following criteria need to hold true:

  • return 1 if probability of failure is greater than 1,

  • return 0 if probability of failure is less than 0,

  • otherwise p.

The statement p.bound[i] <- max(0, min(1, p[i])) implements all three listed criteria. Inserting p.bound into the model script restricts the probability to lie between 0 and 1 and prevents OpenBUGS from crashing [7]. A logistic link function can now be used when n = 1 for all regression models.
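A small Python analogue (illustrative; the function names are hypothetical and not part of the chapter's OpenBUGS script) of the link function and the bounding step is:

import numpy as np

def logit(p):
    # Eq. (5): maps a probability in (0, 1) onto the whole real line
    return np.log(p / (1.0 - p))

def failure_probability(depth, intercept, a):
    # Inverse-logit regression for failure on demand; np.clip plays the role of
    # p.bound <- max(0, min(1, p)) in the OpenBUGS script
    p = 1.0 / (1.0 + np.exp(-(intercept + a * np.asarray(depth, dtype=float))))
    return np.clip(p, 0.0, 1.0)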

The water temperature data was included as an explanatory variable with the expectation that it would be eliminated as part of the Bayesian analysis. To address the possibility of temperature as a failure influence, centering was used on the covariates. Interpreting coefficients in models with interactions can be simplified by subtracting the mean, x̄ = (1/N) Σᵢ xᵢ, from each input variable xᵢ. For example, the temperature T in Eq. (4c) would have T̄ subtracted from it, and the following logistic regression would be fit:

Logit(p) = intercept + c(T − T̄)          (6)

where the data is now centered at zero. The main effects of using explanatory variables are now interpretable based on comparison to the mean of the data. Coefficients that stay relatively the same compared to the un-centered results indicate low predictability, while large predictive differences are leading indicators of component failure.
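As a simple illustration of the centering step (using a handful of the Table 2 temperatures; the variable names are for illustration only):

import numpy as np

# Water temperatures (deg F) from a subset of the Table 2 tests
temperature = np.array([67.4, 63.3, 63.1, 63.0, 61.6, 59.0])

T_bar = temperature.mean()
temperature_centered = temperature - T_bar   # covariate that would enter Eq. (6)

# After centering, the intercept describes a test at the mean temperature, and shifts
# in the coefficients relative to the un-centered fit flag predictive covariates.
print("Mean:", round(T_bar, 2), "Centered values:", np.round(temperature_centered, 2))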

Looking at the steel door data, however, leads to a different discovery. Table 3 gives the results for the standard models, and Table 4 gives the results when centering is applied. Depth's predictive difference is greater than flow rate's, but temperature's is the largest. Additionally, temperature has the smallest DIC of the three models.

Model | Mean | Standard Dev. | 97.5% Interval | DIC
Depth | 1.66 | 0.91 | (0.42, 3.89) | 13.86
Flow Rate | 0.013 | 0.006 | (0.004, 0.028) | 16.0
Temperature | 2.56 | 0.88 | (1.10, 4.51) | 8.294

Table 3.

Coefficient results for standard logit regression model for steel doors [5].

Model | Mean | Standard Dev. | 97.5% Interval | DIC
Depth | 2.05 | 1.26 | (0.46, 5.21) | 14.39
Flow Rate | 0.013 | 0.006 | (0.005, 0.028) | 15.98
Temperature | 7.85 | 4.69 | (2.04, 19.74) | 8.98

Table 4.

Coefficient results for centered logit regression models for steel doors [5].

To understand why temperature appears to be the leading indicator of failure, the steel door data, along with its collection process, must be examined. Of the nineteen test results recorded in Table 2, the first nine tests all resulted in door failures. These nine tests were conducted exclusively during the spring. The remainder of the tests, nine non-failures and one failure, were conducted in a single day during the winter when the reservoir water was cooler. Taken at face value, the results could mean that warmer water temperatures cause steel doors to fail in flooding events, since the two variables were observed together. It is noted, however, that correlation does not necessarily mean causation. The relationship could have alternative explanations, such as a third-cause fallacy, where a spurious correlation is mistaken for causation. A spurious correlation is a relationship in which events or variables are associated, but not causally related, due to the presence of a third factor [8]. Seasonal weather changing the interior temperature of the laboratory is a hidden third factor. Therefore, steel door flooding failure and water temperature may be correlated with each other only because they are correlated with the weather at the time testing was conducted. By conducting all non-failure tests in the cooler winter conditions and the majority of the failures in the warmer spring, an unintentional bias was introduced into the temperature data. This bias, suggesting that temperature impacts failure, becomes apparent when looking at the centering comparison.

There is another means of verifying the introduced bias in temperature: examining the residuals. Pearson residuals should take on values between −2.0 and 2.0; any data point with a value outside this range represents an outlier. If a bias was introduced by when the tests were conducted, the last data point, a failure during winter testing, should appear as an outlier. Figure 3 shows the residual box plot for the temperature regression model. Note that the last data point has an outlier residual value of 3.53 ± 6.037, confirming the bias.

Figure 3.

Box plot of the temperature regression model residuals using steel door data [5].

Since the steel door temperature data is biased, it is dropped from consideration as an explanatory variable for now. In experiments, controlling and extensively testing the relationship between dependent and independent variables can identify spurious correlation. For component flooding experiments, steps could be taken to control the temperature of the reservoir water. If future testing corrects for this bias, temperature data could again be considered as part of the Bayesian analysis for steel doors. Of the remaining depth and flow rate data, centering simplified interpreting coefficients and indicated depth as a significant indicator of failure.

Development of the logistic regression models so far has involved directly interpreting the failure response given some predictor data. It is also possible to interpret it indirectly by incorporating additional random variability. These models assume that, besides the observed variables, there could be an unobserved variable or random effect. Therefore, the probability of the binomial distribution is allowed to adjust by some small amount, λᵢ, for each observation.

A script was written where logistic regression equations contain a random or latent effect. In the case of the depth model, previously given by Eq. (4a), it would now be defined as follows:

Logit(p) = intercept + aD + λᵢ          (7)

with λᵢ ~ N(0, σ²) and unknown variance; a prior distribution is specified for σ. More variability is accounted for by allowing the probability to vary on an observation-by-observation basis.
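A forward-simulation sketch of Eq. (7) in Python (illustrative only; the chapter estimates the latent effects and σ with OpenBUGS, and the function name is hypothetical) makes the added variability explicit:

import numpy as np

rng = np.random.default_rng(4)

def failure_probability_with_latent(depth, intercept, a, sigma):
    # Eq. (7): logit(p_i) = intercept + a*D_i + lambda_i, with lambda_i ~ N(0, sigma^2)
    depth = np.asarray(depth, dtype=float)
    lam = rng.normal(0.0, sigma, size=depth.shape)   # per-observation random effect
    return 1.0 / (1.0 + np.exp(-(intercept + a * depth + lam)))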

The resulting p-values and DIC for the depth, flow rate, and combined regression models are given in Table 5. The large p-values (all greater than 0.95) strongly suggest a lack of fit. The regression models without the added variability are therefore favored over those including unobserved effects.

Model | χ² | Likelihood Ratio | Freeman-Tukey | DIC
Depth | 0.97 | 0.97 | 0.97 | 0.41
Flow Rate | 0.99 | 0.99 | 0.99 | 0.13
Depth, Flow Rate | 0.99 | 0.99 | 0.99 | 0.08

Table 5.

Depth, flow and combined p-values and DIC [5].

The final OpenBUGS script for the depth regression model, prior distributions, and dispersed initial values is shown in Table 6. Also included are the calculations for the three Bayesian p-values and the saturated deviance.

#Bound Binomial Model using Logit Regression: Final
#Steel Door Data

model{

for(i in 1:tests){
failure[i] ~ dbin(p.bound[i], numtested)
p.bound[i] <- max(0, min(1, p[i]))

#Regression Model
logit(p[i]) <- int + depth*WDepth[i]

failure.rep[i] ~ dbin(p.bound[i], numtested)

#Fit Assessment: Pearson Residuals Posterior Predictive Check (Bayesian p-value)
residual[i] <- (failure[i] - (numtested*p.bound[i]))/sqrt(numtested*p.bound[i]*(1-p.bound[i]) + 0.00001)
residual.rep[i] <- (failure.rep[i] - (numtested*p.bound[i]))/sqrt(numtested*p.bound[i]*(1-p.bound[i]) + 0.00001)
sq[i] <- pow(residual[i], 2)
sq.rep[i] <- pow(residual.rep[i], 2)

#Fit Assessment: Likelihood Statistic Posterior Predictive Check (Bayesian p-value)
like.obs[i] <- failure[i]*log((failure[i] + 0.00001)/(numtested*p.bound[i] + 0.00001))
like.rep[i] <- failure.rep[i]*log((failure.rep[i] + 0.00001)/(numtested*p.bound[i] + 0.00001))

#Fit Assessment: Freeman-Tukey Statistic Posterior Predictive Check (Bayesian p-value)
diff.obs[i] <- pow(sqrt(failure[i]) - sqrt(numtested*p.bound[i]), 2)
diff.rep[i] <- pow(sqrt(failure.rep[i]) - sqrt(numtested*p.bound[i]), 2)

prop[i] <- failure[i]/numtested
Ds[i] <- 2*numtested*(prop[i]*log((prop[i] + 0.00001)/(p.bound[i] + 0.00001))
+ (1-prop[i])*log((1-prop[i] + 0.00001)/((1-p.bound[i]) + 0.00001)))

phat[i] <- failure[i]/numtested

}

chisq.obs <- sum(sq[])
chisq.rep <- sum(sq.rep[])
p.chisq <- step(chisq.rep - chisq.obs)

likelihood.obs <- sum(like.obs[])
likelihood.rep <- sum(like.rep[])
p.likelihood <- step(likelihood.rep - likelihood.obs)

freeman.obs <- sum(diff.obs[])
freeman.rep <- sum(diff.rep[])
p.freeman <- step(freeman.rep - freeman.obs)

dev.sat <- sum(Ds[])

#Prior Distributions
int ~ dnorm(0, 0.000001)
depth ~ dnorm(0, 0.000001)
}

data
list(
tests = 19,
numtested = 1,
failure = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1),
WDepth = c(46.1, 39.0, 37.1, 37.8, 37.5, 37.6, 37.7, 37.1, 44.5, 25.7, 17.0, 27.4, 30.9, 32.3, 24.3, 34.8, 37.5, 38.0, 41.4),
WFlow = c(1148, 1130, 1120, 979, 1133, 604, 593, 598, 975, 248, 117, 285, 397, 484, 247, 593, 696, 734, 1025)
)

inits

#Depth
list(int = -28, depth = 4, flow = 0, temp = 0, failure.rep = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0))
list(int = -122, depth = 0, flow = 0, temp = 0, failure.rep = c(0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0))

Table 6.

OpenBUGS script [5].

The mean values calculated for the applicable parameters in the outward-swinging steel door fragility models and the corresponding Bayesian p-values are shown in Table 7. The saturated deviance for all three models, compared with the data sample size, suggests that all three models fit adequately. The DIC is nearly the same for all three models, with the depth model having the smallest value by a non-significant amount. The model with only depth as an explanatory variable has the Bayesian p-value closest to 0.5 for the likelihood ratio (0.38), and its average p-value across the three discrepancy functions is also slightly closer to 0.5 than those of the flow-rate-only model and the combined model. Given these results, the model with only depth is recommended for predictive analyses.

Parameter | Depth | Flow Rate | Depth, Flow Rate
intercept | −75.68 | −8.51 | −72.5
a (depth coeff.) | 2.05 | - | 1.83
b (flow rate coeff.) | - | 0.01 | 0.007
Sat. deviance | 12.88 | 14.29 | 13.31
Chi-squared | 0.19 | 0.26 | 0.14
Likelihood ratio | 0.38 | 0.36 | 0.29
Freeman-Tukey | 0.33 | 0.23 | 0.21

Table 7.

Summary posterior estimates of logistic regression parameters and Bayesian p-values using steel door data [5].

With depth selected as the explanatory variable for the regression model, the parameters in Table 7 are used with the fragility model to calculate the failure probability for a steel door as a function of water depth. The probability p is given by:

p = 1 / (e^(−(−75.68 + 2.05x)) + 1)          (8)

where x is the given water depth. Figure 4 shows the plot of failure probability versus water depth with 95% credible intervals. It should be noted that the mean, shown in red, is close to the bound at low probabilities. This is due to a couple of non-failure tests reaching water depths greater than some observed failure depths, bringing the mean near the credible interval at low fragility probabilities.

Figure 4.

Fragility curve showing probability of failure versus water depth. Blue curves represent the 95% credible intervals [5].
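The point-estimate version of this curve can be evaluated directly from the posterior means; the short Python sketch below uses the Table 7 depth-model parameters in Eq. (8) (the credible intervals in Figure 4 would require the full set of posterior draws rather than the means, and the function name is for illustration only):

import numpy as np

def door_failure_probability(depth_in):
    # Eq. (8) with the posterior mean intercept and depth coefficient from Table 7
    return 1.0 / (1.0 + np.exp(-(-75.68 + 2.05 * np.asarray(depth_in, dtype=float))))

for d in (30, 35, 37, 40, 45):
    print(d, "in. ->", round(float(door_failure_probability(d)), 4))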


6. Conclusion

Component failure probability models provide a pathway for selection of more flexible operating limits and setpoints. Model development requires component performance data and an effective process for probability model selection and checking. Using Bayesian methodology, prior knowledge about model parameters is updated with the likelihood of observing the data for particular parameter values, giving a posterior probability. In short, the process combines everything that is known about a particular data set and model response to produce a posterior estimate of the output function's probability distribution. Integration of these functions is necessary and can be accomplished through MCMC methods.

Bayesian model checking is used to assess the fit of the model to various aspects of the data using the assumption that if a given model fits, then data simulated or replicated under the model should be comparable to the real-world observed data. If any systematic differences occur between simulations and the data, it potentially indicates that model assumptions are not being met. The model is also checked for deviations by means of test quantities or discrepancy functions that depend on both data and parameters by calculating a Bayesian p-value. The DIC can also be used as a measure of model fit that can be applied to Bayesian models and is applicable when the parameter estimation is done using techniques such as Gibbs sampling. It is particularly useful in Bayesian model selection problems where the posterior distributions of the model have been obtained by MCMC simulation.

Application of the data collection, model development, and model checking process was carried out for the performance of steel doors subjected to water rise flooding conditions. The resulting fragility model provides a carefully developed representation of the failure probability as the flood depth changes. The model can then be used in more comprehensive probabilistic flooding analyses rather than simply using an empirically derived pass-fail water depth for steel doors subjected to water rise flooding scenarios. The overall result of using the rigorously developed fragility model is a more robust representation of how components will perform when subjected to challenges such as flooding. With an improved representation of overall performance available, necessary limits and controls can then be selected without undue conservatism.


Acknowledgments

Funding support for the PET construction and experiments and fragility model development was provided to Idaho State University by the US Department of Energy Light Water Reactor Sustainability Program through Contract Number 154652.

References

  1. Box G, Tiao G. Bayesian Inference in Statistical Analysis. John Wiley & Sons, 1992
  2. Kelly D, Smith C. Bayesian Inference for Probabilistic Risk Assessment: A Practitioner's Guidebook. Springer, 2011
  3. Lunn D, Jackson C, Best N, Thomas A, Spiegelhalter D. The BUGS Book: A Practical Introduction to Bayesian Analysis. CRC Press, 2013
  4. Gelman A, Carlin J, Stern H, Dunson D, Vehtari A, Rubin D. Bayesian Data Analysis. Texts in Statistical Science. 3rd ed. CRC Press, 2014
  5. Wells A. Assessing Nuclear Power Plant Component Fragility in Flooding Events Using Bayesian Regression Modeling with Explanatory Variables [Doctoral Dissertation]. Pocatello: Idaho State University, 2020
  6. Conn P, Johnson D, Williams P, Melin S, Hooten M. A guide to Bayesian model checking for ecologists. Ecological Monographs, 88(4):526-542, 2018
  7. Gelman A, Hill J. Data Analysis Using Regression and Multilevel/Hierarchical Models. Cambridge University Press, 2007
  8. Burns W. Spurious Correlations, 1997
