## 1. Introduction

Computational fluid dynamics (CFD) simulations are affected by uncertainties, due in large part to the increasing complexity of physical models and the inherent introduction of random model data. The identification of the main uncertain parameters and the evaluation of their influence on the simulation results are crucial steps in estimating the simulation accuracy. For turbulent flows, not only that the uncertainty in the choice of the turbulent model parameters has to be considered, but also the physical properties of the fluid and the experimentally provided boundary conditions can be largely affected by uncertainties. The choice of single values for these “input” parameters often leads to considerable differences between the simulation and experimental results. The sensitivity of the simulation results to the range of different combinations of input parameters can be evaluated by hundreds of CFD simulations. However, even a simple steady-state simulation of a turbulent flow can take several hours to complete. One way to alleviate this burden is the construction of a surrogate model that mimics the behaviour of the CFD simulation as closely as possible while being computationally much cheaper. Such models are based on a complex mathematical response function that is constructed on a limited number of intelligently selected data (CFD) points. Several methods that deal with the sensitivity or uncertainty of CFD computations can be found in the literature [1, 2, 3, 4].

The uncertain parameters affecting the turbulent flow presented in this study are investigated on a typical CFD case. The benchmark experiment featuring the mixing of two turbulent flows performed at the Paul Scherrer Institute (PSI) [4] has been used in the analysis. To analyse the effect of the sensitivity parameters, a series of CFD simulations is performed with the code NEPTUNE_CFD that solves the Reynolds Averaged Navier Stokes (RANS) equations with the k-eps turbulence model [5]. The optimal statistical estimator (OSE) method [6] is used for the response surface generation.

## 2. Turbulent mixing benchmark

To investigate the main turbulent mixing mechanisms, the experimental data of the water mixing experiment Generic Mixing Experiment (GEMIX) were used [4]. The GEMIX facility, located at the Paul Scherrer Institute (PSI), features the horizontal test section where the two horizontal water streams with different temperatures or densities can be mixed.

The selected benchmark case considers the mixing of two isothermal flows with the same density and mass flow rate. The schematic representation of the test section is shown in **Figure 1**. The inflow section consists of two slightly inclined rectangular channels separated by the splitter plate that merge into the larger square channel with the cross section of 50 mm by 50 mm. The water with a mass flow rate of 1 kg/s flows through the upper and lower channel into the mixing section, where the two water streams mix together to form a turbulent mixing layer. The profiles of velocity and turbulent quantities were measured at two axial locations, 70 mm and 450 mm behind the splitter plate, as shown in **Figure 1**. A detailed description of the experiment and the measurement techniques are given in [7].

### 2.1. Uncertain parameters at the inlet

The complete information about the flow conditions in the experiment requires the measurement of velocity and turbulent quantities at several locations along the flow. In the considered experiment, these data are provided in the mixing section, but the velocity and turbulence profiles have not been measured in the region of the two inflow channels. Trying to impose a uniform flow with the low level of turbulence, a flow straightener followed by several grids of different densities was placed inside each of the inflow channels [4]. Nevertheless, at the known mass flow rate, the velocity profile and turbulence at the inlet can be treated as uncertain parameters with an estimated range of values.

One may assume that the profile inlet velocity can take the values between the uniform and the fully developed turbulent profile shape. This can be written in a parametrised form using the parameter α as

where *U _{d}*(

*y*,

*z*) stands for the fully developed velocity profile and

*U*for the uniform velocity. The fully developed velocity profile was obtained from a separate simulation in a single long rectangular channel of the same cross section and was imposed for both inlet flows. For the purpose of the implementation into the CFD code, the velocity profiles between the uniform and fully developed profile shape were approximated by trigonometric mathematical functions that are exactly zero at the wall. The product of two series of cosine functions (Fourier series) is used:

_{u}where *a _{n}* and

*b*are coefficients obtained with the least-squares method and

_{n}*L*and

_{y}*L*represent the dimensions of the channel. The parameter

_{z}*α*in Eq. (1) is varied between 0 and 1 (0 represents uniform and 1 fully developed axial velocity profile) in steps of 0.2. The second uncertain parameter is the turbulence intensity

*β*at the inlet, which describes the ratio between the turbulent velocity fluctuations and the mean velocity. In the present work, it is assumed that the turbulence intensity

*β*may vary between 0% and the high turbulence level of 25%.

## 3. CFD simulations

The simulations were carried out with the NEPTUNE_CFD 2.0.1 code [8] developed by Electricite de France (EDF) and Commissariat a l’Energie Atomique (CEA, France). NEPTUNE_CFD 2.0.1 is a dedicated code for modelling of transient flows in nuclear reactor systems. Phenomena that involve mixing of turbulent flows are of high interest in nuclear industry, especially from the perspective of hypothetical accident conditions [5].

The momentum and turbulence transport equations are solved using the Reynolds Averaged Navier Stokes (RANS) approach. The standard k-ε model with the logarithmic wall function near the walls was used for the turbulence modelling. The calculations were performed on a computational mesh with 475,000 hexahedral elements with mesh refinement near the channel walls and around the splitter plate. The simulation and mesh details are provided in [9].

The mass flow rate is the same at both inlets (1 kg/s), which corresponds to the average liquid velocity of 0.8 m/s. The shape of the inlet velocity profile is set by Eq. (1) and depends on the parameter *α*. The density of the water is 998 kg/m^{3} at the temperature of 23 °C. Besides the velocity, also the turbulence intensity *β* has to be prescribed at the inlet. No-slip boundary conditions are used on the wall, and a constant pressure is set at the outlet of the domain. All simulations were performed at constant turbulent Schmidt number Sc = 0.7. The constant time step of 0.02 s is used for the calculations. About 100 time steps are sufficient to reach the converged solution.

### 3.1. Matrix of CFD simulation points

The relevant output results are the profiles of the turbulence kinetic energy *k* and the downstream velocity component *U* (in *x* direction, see **Figure 1**) at two locations: 0.07 and 0.45 m downstream from the edge of the splitter plates. To evaluate the effect of random values of uncertain parameters on the results, a series of simulations was performed, independently combining different values of *α* and *β*. The matrix of CFD simulation points is presented in **Figure 2**. It can be seen that the calculation points are distributed much densely and in equidistant steps of 1% at lower values of *β* (between 0 and 4%), whereas their distribution is uneven and rather scarce at higher values. Overall 40 simulations have been carried out. But this is not nearly enough to cover the complete space of results for the range of *α* (0–1) and *β* (0–25%) values. In between, the surrogate (or response surface) model can be used.

## 4. Optimal statistical estimator (OSE) method

Hundreds of time-consuming CFD simulations would be needed to cover the entire space of results for different combinations and values of input parameters. Instead, the surrogate model can be used to replace the numerous code simulations. Surrogate modelling is based on the construction of the mathematical response function from the limited number of selected data points aiming to estimate the system response or, in other words, to define the relationship between specific system inputs and outputs. In this way, the constructed response surface can be used for the sensitivity analysis of the impact of uncertain parameters.

In the work of [6], the response surface was generated by the optimal statistical estimator (OSE) method, which can be easily applied for multidimensional space. Basic equations for the response surface generation with OSE are provided in this section; details are described in [6]. The OSE method uses a linear combination of code-calculated output values (or their corrected values as explained later) and coefficients *C _{n}*, which represent the measure of similarity between a given vector of input data

*G*and the vector

*G*for the

_{n}*n*th calculation. The optimal statistical estimator

*C*are defined as

_{n}where *G* = (*x*_{1}, *x*_{2}*,…*, *x _{M}*) is a given input data vector,

*G*

_{n}= (

*x*

_{n1},

*x*

_{n2}, …,

*x*

_{nM}) and

*H*

_{n}= (

*x*

_{n(M+1)},

*x*

_{n(M+2)},…,

*x*

_{nI}) are the input and output data vectors for the

*n*th calculation, respectively;

*M*is the number of input parameters;

*I*the number of input and output parameters; and

*N*is the number of calculated (or measured) values. Since the Dirac delta function

*δ*appears in the denominator of Eq. (4) in the original derivation for the one-dimensional case, an approximation for

*δ*was used to make the formula applicable [10]. Among various possible approximations, the Gaussian function was the most appropriate, and for the

*M*-dimensional case, it was adapted to

where *σ _{i}* is the width of the Gaussian curve determined by the user. The selection of the adequate width

*σ*needs special attention. Widening of the Gaussian curve by selecting a larger width

_{i}*σ*extends the influence area of the particular input data point to its surroundings, i.e. also to the neighbouring data points. To uniformly cover the volume with samples, the width

_{i}*σ*for dimension

_{i}*i*is defined as follows:

where *S _{i}* is the distance between the minimum and maximum value of the data points

*x*(

_{ni}*n*= 1, 2,…,

*N*) and

*N*is the number of intervals between the data points. The corrective factor

_{i}*f*should be selected by the user based on the desired and previously tested performance of OSE. The contribution of each data point to the estimation of the final result can then be adjusted by

_{c}*f*. If

_{c}*f*is equal to 1.0, the width σ

_{c}_{i}is equal to the distance between two adjacent points, and the influence is thus extended approximately to its neighbour points. On the other hand, if

*f*is close to 0, in the one-dimensional case, this would result in a stepwise function. With increasing

_{c}*f*the OSE function becomes smoother between the data points but at the cost of less accurate predictions at the data points, which can be eliminated by appropriately shifting the data point values as performed in the present analysis. OSE is applied correctly if the OSE function between the calculated points is smooth and the calculated points are correctly predicted. More details about how the contribution of each calculated data point to the final output parameter estimation can be adjusted by the corrective factor

_{c}*f*can be found in [6].

_{c}Using the derived optimal statistical estimator

The function *Y(X)* is modelled by OSE. The vector *Y* is influenced by the input parameters, which are directly transferred to the output, while the optimal statistical estimator determines the complementary components *C _{n}*; therefore the response surface for highly nonlinear functions can be efficiently generated.

For assessing the adequacy and predictive capability of OSE, the root mean square error and coefficient of determination for the *m*th input parameter are used:

for *m* = (*M* + 1), (*M* + 2),…, *I*, where *M* is the number of input uncertain parameters and *I* is the number of the input and output parameters. Here *x _{nm}* is the

*n*th code-calculated value of the

*m*th output parameter,

*x*is

_{est,m}*n*th estimated value with the optimal statistical estimator (see Eq. (3)) and

*x*is the mean of the

_{avg,m}*N*code-calculated values of the

*m*th output parameter. The assessment of the predictive capability of the optimal statistical estimator with the two proposed statistics is perfect when

*RMS*= 0 and

_{m}To produce the output results, the values of the input parameters (*x*_{1}*, x*_{2}*,…, x _{M}*) are randomly sampled (e.g. by Monte Carlo method) each time, and then the corresponding unknown output values are estimated by the optimal statistical estimator using Eqs. (3)–(9). Each time the new coefficients

*C*are calculated, while the values of

_{n}*H*are the data points, determined in the phase of the response surface generation by OSE.

_{n}The values are based on computer code calculation values to achieve the desired predictive capability of OSE for the code-calculated values. If the statistics *RMS _{m}* and

*x*initially exceeds the code-calculated value

_{est,m}*x*, the new value is decreased for the difference between the estimated and code-calculated value and vice versa. This is done in iterative steps until the desired accuracy of the estimated value is obtained in the points (i.e. values of input parameters) used in the

_{nm}*N*code calculations. Due to the highly non-linear functions, only a few iterations are normally needed to achieve a reasonable agreement. In this way the code-calculated values can always be matched.

## 5. Results and discussion

### 5.1. CFD simulation results

The results of the NEPTUNE_CFD sensitivity calculations are presented in **Figures 3**–**6**. In **Figure 3** the axial velocity profiles at two locations 0.07 and 0.45 m downstream the splitter plate are shown. The sensitivity to the inlet turbulence intensity *β* is presented for three different values of inlet velocity profiles (α = 0, 0.4 and 1). As shown, high values of *β* tend to smooth the velocity gradients. The smoothing effect at higher *β* values is more pronounced for the profiles with α = 0.4 and 1. It can be seen that the highest turbulent intensities *β* = 16 and 25 significantly flatten the velocity profile farther away from the splitter plate, at 0.45 m. However it is less likely that such high turbulence intensities would appear in this type of flow.

The simulated profiles of turbulence kinetic energy *k* at two downstream locations, 0.07 and 0.45 m, are shown in **Figure 4**. Different curves represent different values of *β* at three inlet velocity profiles α (0, 0.4 and 1). It can be observed that low values of the inlet turbulence intensity (*β* up to 4%) significantly underpredict the experimental values of *k* for both downstream locations. On the other hand, the results for high *β* (10–25%) clearly overestimate the experimental data at the closest measuring location, 0.07 m behind the splitter plate. Farther away, at 0.45 m, the predictions with the highest *β* values match the experiment in the central, mixing part of the flow channel. The influence of *β* is similar for the three α values, except that the simulated *k* profiles tend to be smoother for the uniform velocity inlet (α = 0). In addition, the asymmetry in the measured data can be observed, resulting in different deviations between the measured and simulated turbulent kinetic energy at the lower and upper side of the channel (left and right side of the figures).

The influence of *β* on the turbulent kinetic energy *k* is additionally discussed in **Figures 5** and **6**, where the dependence on α is shown for several different *β* values. **Figure 5** presents the results just behind the splitter plate (at 0.07 m) and **Figure 6** the results farther in the mixing region (at 0.45 m). At 0.07 m, the best agreement between the simulation and experiment is obtained for *β* = 8% (**Figure 5**), whereas further downstream, at 0.45 m (**Figure 6**), the simulations with *β* = 8% underpredict the measured data. At this axial location, the profiles with *β* values of 16 and 25% match the experimental data. This result indicates that the dissipation of turbulence kinetic energy *k* in the existing k-ε turbulence model is higher than in the experiment.

### 5.2. Response surface generated by OSE method

For the reference OSE calculation, all 40 simulation points (see **Figure 2**) are used to generate the response surface (Case C0). It should be noted that for creating the 3D figures of the NEPTUNE_CFD-calculated data, the missing data in the calculational matrix at given *α* were obtained by interpolation in *α* direction as missing data would be otherwise set to 0 on the graph. Also, for the 3D graph to be proportional in *β* direction, the NEPTUNE_CFD-calculated data were linearly interpolated so that the distance between two points was set equal to 1% in *β* direction for comparison purposes with the response surface constructed by OSE.

The CFD-calculated results of the output parameters are available in the interval (−0.0245, 0.0245 m) with step size 0.0005 m. This means 99 discrete points. The results are given at two *x* axis positions: *x* = 0.07 m and 0.45 m. For each discrete point (at given value of *x* and *y*), there is a surface (function of α and *β*), which has to be generated by OSE. Considering two output parameters at two axial locations, this means that four times 99 response surfaces were automatically generated by OSE for each case and only some are shown.

**Figures 7** and **8** show the comparison between the 3D plots of the NEPTUNE_CFD simulation points and the response surfaces generated by the OSE reference case C0. The dependency of the downstream velocity *U* and turbulence kinetic energy *k* on the parameters α and *β*, at different locations in the mixing region, is presented.

Although the construction points in the calculation matrix are scarcely distributed (see **Figure 2**) at higher *β* values, a good agreement between the NEPTUNE_CFD calculation and the OSE response surface can be observed. At the middle of the channel (*y* = 0 m), the velocity *U* increases with increasing *β*, while the dependence on *α* is smaller (see **Figure 7**(a1) and (b1)). The response surface is smooth and in good agreement with the surface plotted from the calculated data points as the relations are rather linear. On the other hand, at location *y* = 0.08 m, the velocity *U* increases with *α* and decreases with *β* as shown in **Figure 7**(a2). More data in *β* direction would further improve the agreement for this more nonlinear relation. Similar is true for the velocity *U* at *y* = 0.16 m. Finally, from **Figure 7**(b4) it can be seen that at *α* = 0 the code-calculated value has not been matched for the highly nonlinear surface. **Figure 8** showing the turbulence kinetic energy shows that the highest values are obtained at the maximum *α* and *β* value. The code-calculated values were matched by OSE (*R*^{2} always above 0.99). It should be noted that the graph plotted from the calculated data has many interpolated data (in one direction only); therefore visually the agreement does not seem to be so perfect.

As already mentioned, the selected number of points (i.e. 40 points) is relatively scarcely distributed across the calculation matrix. Nevertheless, the main strength of the OSE method is its capability to efficiently generate the response surface even with much fewer number of construction points [11], in particular for the two-parameter problems. To demonstrate this, the following cases with reduced number of calculation points are considered for the response surface generation:

Case 1 (C1): 19 calculations (1, 5, 6, 7, 8, 11, 14, 15, 20, 21, 22, 26, 29, 32, 33, 37, 38, 39, 40)

Case 2 (C2): 11 calculations (1, 7, 8, 14, 20, 21, 26, 32, 33, 39, 40)

Case 3 (C3): 9 calculations (1, 7, 8, 20, 21, 26, 33, 39, 40)

Case 4 (C4): 5 calculations (1, 8, 20, 33, 40)

**Figures 9** and **10** show the influence of the number of calculated points on the response surface generation for the downstream velocity *U* and the turbulence kinetic energy *k*, respectively. In the case with 40 calculational points (C0 case) for the response surface generation by OSE, the value of the corrective factor *f _{c}* is set to

*f*= 1.5 and the number of iterations to 90. The distances

_{c}*S*1 and

*S*2 are 1 and 25, respectively, and the number of the intervals

*N*1 and

*N*2 is 5. Finally, in the cases with less than 40 calculation points,

*f*is changed to

_{c}*f*= 2.0 to widen the influence of the neighbouring points. Generally,

_{c}*R*

^{2}was above 0.97. The only exceptions are the points describing

*U*at

*y*values near the walls (less than −0.21 m and higher than 0.21 m), where

*R*

^{2}was around 0.82 (see

**Figure 7**(b4)). Closely observing

**Figure 3**, the velocity

*U*near the walls drops significantly.

**Figure 9** clearly shows that the agreement with the calculated 3D graph decreases with lowering the number of calculated points used for the response surface generation. Nevertheless, the applied simulation points are perfectly predicted in all cases (*R*^{2} = 1 even for case C4 with the lowest number of points). However, if the number of used points is low, the agreement in the region between these points is not perfect.

The comparison of *k* values in **Figure 10** shows that the response surfaces for the cases C0 and C1 are in good agreement with the calculated 3D graph, while some deviation can be observed at the surfaces with a lower number of points (cases C2, C3 and C4).

In addition to the qualitative comparison of the surfaces in **Figure 9**, in **Table 1** a quantitative comparison is shown between the CFD code simulations and the four OSE predictions generated either from all 40 code calculation values (C0 case; see also **Figure 9(b)**), from 19 code simulation values (C1 case; see also **Figure 9(c)**), from 11 code simulation values (C2 case; see also **Figure 9(d)**) and from 5 code simulation values (C4 case; see also **Figure 9(f)**). It can be seen that in the base case, the difference in the downstream velocity is less than 0.4% (0.19% in average). For the C1, C2 and C4 cases, the maximum difference in the downstream velocity is 1.20% (0.23% in average), 1.44% (0.45% in average) and 2.73% (0.76% in average), respectively. Case C3 is not shown in **Table 1**, but the maximum difference is 1.94% (0.61% in average).

The obtained results suggest that in the case when thousands of complex computer code runs are needed for the Monte Carlo statistical analysis, the surrogate model can be used to replace the numerous CFD code simulations. In the case of a lower accuracy of response surface at a certain discrete value, a few additional calculations can be added in the vicinity of this point to better describe the space until a satisfactory agreement is obtained.

## 6. Conclusions

The mixing of two horizontal turbulent flows was investigated on the GEMIX experimental case. Two sensitivity parameters were considered as uncertainty variables: inlet velocity profile and inlet turbulence intensity. Both variables were treated as the main uncertain parameters with a certain range of randomly distributed values. The NEPTUNE_CFD code was used to perform a series of parametric CFD calculations, combining different values of turbulence intensity and velocity profile at the inlet. The sensitivity analysis of the NEPTUNE_CFD results shows that the implemented turbulence model predicts a higher dissipation of turbulence kinetic energy than observed in the experiment.

A surrogate method called optimal statistical estimator (OSE) was used to generate the full response surface of the output results from the limited number of input CFD predictions. The output results featured downstream velocity and turbulence kinetic energy profiles at few locations in the mixing region. The study demonstrates the efficiency of the response surface generation by the OSE method. The method is especially convenient when few parameters are varied. In the case of a two-dimensional problem, OSE clearly shows its ability to generate a quality response surface even if only a few calculation points are available. This suggests that in the case when thousands of complex computer code runs are needed for the Monte Carlo statistical analysis, the surrogate model can be used to replace the numerous CFD code simulations.