Random variable types and the corresponding orthogonal polynomials.
Uncertainty propagation (UP) methods are of great importance to design optimization under uncertainty. As a well-known and rigorous probabilistic UP approach, the polynomial chaos expansion (PCE) technique has been widely studied and applied. However, comprehensive overviews of the latest advances in PCE methods are lacking, and a large gap remains between academic research and engineering application of PCE because of its high computational cost. In this chapter, the latest advances in PCE theory and methods are elaborated. In particular, the newly developed data-driven PCE method, which, unlike common PCE approaches, does not depend on complete information about the input probability distributions, is introduced and improved. The least angle regression technique and the trust region scenario are then extended, respectively, to reduce the computational cost of data-driven PCE and make it suitable for practical engineering design applications. In addition, comprehensive comparisons are made to explore the relative merits of the most commonly used PCE approaches in the literature and to help designers choose suitable PCE techniques for probabilistic design optimization.
- uncertainty propagation
- probabilistic design
- polynomial chaos expansion
- trust region
Uncertainties are ubiquitous in engineering problems, which can roughly be categorized as aleatory and epistemic uncertainty [1, 2]. The former represents natural or physical randomness that cannot be controlled or reduced by designers or experimentalists, while the latter refers to reducible uncertainty resulting from a lack of data or knowledge. In systems design, all sources of uncertainties need to be propagated to assess the uncertainty of system quantities of interest, i.e., uncertainty propagation (UP). As is well known, UP is of great importance to design under uncertainty, which greatly determines the efficiency of the design. Since generally sufficient data are available for aleatory uncertainties, probabilistic methods are commonly employed for computing response distribution statistics based on the probability distribution specifications of input [3, 4]. Conversely, for epistemic uncertainties, data are generally sparse, making the use of probability distribution assertions questionable and typically leading to nonprobabilistic approaches, such as the fuzzy, evidence, and interval theories [5–7]. This chapter mainly focuses on propagating the aleatory uncertainties to assess the uncertainty of system quantities of interest using probabilistic methods, which is shown in Figure 1.
A wide variety of probabilistic UP approaches for the analysis of aleatory uncertainties have been developed, among which the polynomial chaos expansion (PCE) technique is a rigorous approach due to its strong mathematical basis and ability to produce functional representations of stochastic quantities. With PCE, the function with random inputs can be represented as a stochastic metamodel, based on which lower-order statistical moments as well as reliability of the function output can be derived efficiently to facilitate the implementation of design optimization under uncertainty scenarios such as robust design and reliability-based design. The original PCE method is an intrusive approach in the sense that it requires extensive modifications to the existing deterministic codes of the analysis model, so it is generally limited to research in which the specialist has full control of all model equations as well as detailed knowledge of the software. Alternatively, nonintrusive approaches, which do not modify the original analysis model, have been developed and are gaining increasing attention; they are the focus of this chapter. As a well-known PCE approach, the generalized PCE (gPCE) method based on the Askey scheme [11, 12] has been widely applied to UP for its higher accuracy and better convergence [13, 14] compared to the classic Wiener PCE. In general, however, the random input does not necessarily follow one of the five types of probability distributions (i.e., normal, uniform, exponential, beta, and gamma) in the Askey scheme. In this case, a transformation must be made to map each random input variable to one of the five distributions. This transformation substantially lowers the convergence rate, which makes the nonoptimal application of Askey polynomial chaos computationally inefficient.
Therefore, the Gram-Schmidt PCE (GS-PCE) and multielement PCE (ME-PCE) methods have been developed to accommodate arbitrary distributions by constructing their own orthogonal polynomials rather than referring to the Askey scheme.
All the PCE methods discussed above are constructed on the assumption that the joint multivariate probability density function (PDF) of all random input variables is known exactly. Generally, in introducing PCE in the literature, the random variables are assumed to be independent, so that the joint PDF factorizes into the univariate PDFs of the individual random variables. However, the random input may exist only as raw data with a complicated cumulative histogram, such as a bimodal or multimodal type, for which it is often difficult to obtain an accurate analytical expression of the PDF. Under these scenarios, all the above PCE approaches become ineffective since they all have to assume that the PDFs are completely known. To address this issue, the data-driven PCE (DD-PCE) method has been proposed, and its accuracy and convergence with diverse statistical distributions and raw data have been tested and well demonstrated. With this PCE method, the one-dimensional orthogonal polynomial basis is constructed directly from a set of data of the random input variables by matching a certain order of their statistical moments, rather than from the complete distributions as in the existing PCE methods, including gPCE, GS-PCE, and ME-PCE.
At present, great research achievements about PCE have been made in the literature, which have also been applied to practical engineering problems to save the computational cost in UP. However, there is still a large gap between the academic study and engineering application for the PCE theory due to the following reasons: (1) the complete information of input PDF often is not known in engineering, which cannot be solved by most PCE methods presented in the literature; (2) the computational cost of existing PCE approaches is still very high, which cannot be afforded in practical problems, especially when applied to design optimization; and (3) there is a lack of comprehensive exploration of the relative merits of all the PCE approaches to help designers to choose more suitable PCE techniques in design under uncertainty.
2. Data-driven polynomial chaos expansion method
Most PCE methods presented in the literature are constructed on the assumption that the PDF of each random input variable is known exactly. However, a random parameter may exist only as raw data, or numerically as a complicated cumulative histogram of bimodal or multimodal type, for which it is often difficult to obtain an accurate analytical expression of the PDF. To address this issue, the data-driven PCE method (DD-PCE for short in this chapter) has been proposed. DD-PCE follows a general procedure similar to that of the well-known gPCE method. For gPCE, the one-dimensional orthogonal polynomial basis simply comes from the Askey scheme in Table 1 and is a function of standard random variables. For DD-PCE, by contrast, the one-dimensional orthogonal polynomial basis is constructed directly from the data of the random inputs by matching a certain order of their statistical moments, and it is a function of the original random variables.
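The moment-based construction described above can be illustrated with a minimal sketch. It assumes monic polynomials whose non-leading coefficients are obtained from a linear system built from raw sample moments, which is one common way to realize a data-driven basis; the function name `data_driven_basis` and the bimodal test data are hypothetical and are not taken from the chapter.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def data_driven_basis(samples, order):
    """Return monic polynomials (ascending coefficients) of degree 0..order
    that are orthogonal with respect to the empirical measure of `samples`."""
    x = np.asarray(samples, dtype=float)
    # Raw statistical moments mu_0 .. mu_{2*order-1} estimated from the data.
    mu = np.array([np.mean(x**k) for k in range(2 * order)])
    basis = [np.array([1.0])]
    for k in range(1, order + 1):
        # Orthogonality to 1, x, ..., x^(k-1) gives a k-by-k linear system
        # for the non-leading coefficients (leading coefficient fixed to 1).
        hankel = np.array([[mu[i + j] for j in range(k)] for i in range(k)])
        rhs = -mu[k:2 * k]
        coeffs = np.linalg.solve(hankel, rhs)
        basis.append(np.append(coeffs, 1.0))
    return basis


# Hypothetical bimodal raw data standing in for measured inputs.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-1.0, 0.4, 5000), rng.normal(1.5, 0.6, 5000)])
basis = data_driven_basis(data, order=3)

# Empirical Gram matrix: off-diagonal entries should vanish up to round-off.
gram = np.array([[np.mean(P.polyval(data, p) * P.polyval(data, q))
                  for q in basis] for p in basis])
```

Only moments up to order 2k−1 are needed for a degree-k polynomial, which is why no PDF has to be fitted to the data.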
2.1. Procedure of data-driven PCE method
corresponding to the
where is the unknown polynomial coefficient to be solved.
Since the construction of on each dimension is the same, the subscript
In the same way as above, one has
There are totally
It is observed that is actually the
where is the
Clearly, to obtain a
2.2. Extension of Galerkin projection to DD-PCE
In the existing work on DD-PCE, only the regression method is employed to calculate the PCE coefficients. In the authors' experience, the regression matrix may become ill-conditioned for higher-dimensional problems, since the number of sample points required for regression, which is often set to twice the number of PCE coefficients
With the projection method, the Galerkin projection is conducted on each side of Eq. (1):
where 〈•〉 represents the operation of inner product as below
Based on the orthogonality property of orthogonal polynomials, the PCE coefficient can be calculated as
Similar to gPCE, the key point is the computation of the numerator in Eq. (11), which can be expressed as
Gaussian quadrature techniques, such as full factorial numerical integration (FFNI) and sparse grid numerical integration, have been widely used to calculate the numerator in the existing gPCE approaches. There, the one-dimensional Gaussian quadrature nodes and weights are derived directly by applying scaling factors to the nodes and weights of the existing Gaussian quadrature formulae, and the tensor product is then employed to obtain the multidimensional nodes. For some common types of probability distributions, for example, normal, uniform, and exponential distributions, the PDFs have formulations similar to the weighting functions of the Gauss-Hermite, Gauss-Legendre, and Gauss-Laguerre quadrature formulae. Therefore,
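|Distribution||PDF||Polynomial||Weight function||Support|
|Normal||exp(−x²/2)/√(2π)||Hermite||exp(−x²/2)||(−∞, +∞)|
|Uniform||1/2||Legendre||1||[−1, 1]|
|Exponential||exp(−x)||Laguerre||exp(−x)||[0, +∞)|
|Gamma||x^α exp(−x)/Γ(α+1)||Generalized Laguerre||x^α exp(−x)||[0, +∞)|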
However, the distributions of random inputs may not follow the Askey scheme, may be nontrivial, or may even exist only as raw data with a cumulative histogram of complicated shape. In such cases, this way of deriving the nodes and weights is not applicable. In this work, a simple method based on the moment-matching equations below is proposed to obtain the one-dimensional quadrature nodes and weights.
However, Eq. (13) are multivariate nonlinear equations, which are difficult to solve when the number of equations is large (
In the same way, the nodes and weights in other dimensions are obtained conveniently. Then, the numerator can be calculated by the full factorial numerical integration (FFNI) method for lower-dimensional problems (
where and , respectively, represent the one-dimensional nodes and weights of the
For higher-dimensional problems (
For the FFNI-based method, if
In this chapter, we focus on extending the Galerkin projection to the DD-PCE method to address higher-dimensional UP problems and then exploring the relative merits of these PCE approaches. For the case with only small data sets, both DD-PCE and the existing distribution-based method (
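As a rough illustration of moment-matched quadrature combined with FFNI, the sketch below makes an assumption that is not confirmed by the text above: the nodes are taken as the roots of the degree-n orthogonal polynomial implied by the moments, after which the moment-matching equations become linear in the weights. The names `moment_quadrature` and `ffni_expectation` are hypothetical.

```python
import itertools

import numpy as np


def moment_quadrature(mu, n):
    """n-point rule matching the raw moments mu[0..2n-1].

    Assumed approach: nodes are the roots of the degree-n monic orthogonal
    polynomial implied by the moments; weights then follow from a linear system.
    """
    mu = np.asarray(mu, dtype=float)
    hankel = np.array([[mu[i + j] for j in range(n)] for i in range(n)])
    coeffs = np.linalg.solve(hankel, -mu[n:2 * n])  # ascending, leading 1 omitted
    descending = np.append(coeffs, 1.0)[::-1]       # np.roots expects descending order
    nodes = np.sort(np.real(np.roots(descending)))
    # With the nodes fixed, sum_i w_i * x_i^k = mu_k (k < n) is linear in w.
    vander = np.vander(nodes, n, increasing=True).T
    weights = np.linalg.solve(vander, mu[:n])
    return nodes, weights


def ffni_expectation(g, rules):
    """Tensor-product (full factorial) estimate of E[g(X_1, ..., X_d)]."""
    total = 0.0
    for combo in itertools.product(*[list(zip(x, w)) for x, w in rules]):
        point = [xw[0] for xw in combo]
        weight = np.prod([xw[1] for xw in combo])
        total += weight * g(*point)
    return total


# Exact raw moments of a standard normal variable, orders 0..5.
normal_moments = [1.0, 0.0, 1.0, 0.0, 3.0, 0.0]
nodes, weights = moment_quadrature(normal_moments, 3)
value = ffni_expectation(lambda x1, x2: x1**2 * x2**2 + x1,
                         [(nodes, weights)] * 2)
```

For these normal moments the rule coincides with the three-point probabilists' Gauss-Hermite rule, so the tensor-product estimate of the test integrand is exact. The number of FFNI evaluations grows as n to the power d, which is why this route suits only lower-dimensional problems.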
2.3. Comparative study of various PCE methods
In this section, the enhanced DD-PCE method, the recognized gPCE method, and the GS-PCE method that can address arbitrary random distributions are applied to uncertainty propagation to calculate the first four statistic moments (mean
The PCE order is set as
In Case 1, all the random input distributions are known and belong to the Askey scheme. The test results are shown in Tables 4–7, where the bold numbers with underline are the relatively best results and
In Case 2, all the random input distributions are known but do not belong to the Askey scheme. In this case, the Rosenblatt transformation is employed for the gPCE method first, whereas DD-PCE and GS-PCE can be used directly. The results are shown in Tables 8–11. It is observed that overall DD-PCE and GS-PCE perform better than gPCE, yielding results that are close to those of MCS. The reason is that the transformation in gPCE induces error. Specifically, in Tables 9 and 10, the gPCE method causes relatively large errors due to the transformation. In addition, the shaded numbers are clearly larger than those of DD-PCE and GS-PCE, and
In Case 3, the PDFs of some variables are bounded (BD) as below,
and the rest of the variables follow typical distributions. In this case, the Rosenblatt transformation is also employed for the gPCE method first.
From the results in Tables 12–15, it is found that gPCE generally induces large errors, especially for the shaded numbers in the tables. Since the first two variables follow the distribution bounded in an interval, the error induced by the transformation is large and all values of
In Case 4, the distributions of the random input variables are unknown and only some data exist. Based on the data, an analytical PDF can in principle be fitted through an empirical distribution system, such as the Johnson or Pearson system. However, if the distribution of the data is very complicated, for example with a bimodal or multimodal cumulative histogram, it is often very difficult to obtain the analytical PDF accurately. It is well known that the Pearson system, which is based on the first four statistical moments of the random variable, produces large errors for bimodal (BM) or multimodal PDFs. Evidently, the existing PCE approaches, including gPCE and GS-PCE, may produce large errors in this case since they all depend on the exact PDFs of the random inputs. However, DD-PCE can still work since it is a data-driven approach. To explore the effectiveness and advantage of DD-PCE over the other two approaches, it is assumed that the input data for some random input variables have the complicated bimodal histogram shown in Figure 3, while the data for the rest come from typical distributions. For the convenience and effectiveness of the test, all the input data are generated from PDFs, of which the PDF of the BM distribution is shown in Eq. (17). It should be pointed out that in practice the PDFs are actually unknown and only some data exist.
We tested small (500) and large (10^7) numbers of input data to investigate the impact of the amount of data on the accuracy of UP. The results are shown in Tables 16–19, from which it is noticed that the results of DD-PCE are generally very close to those of MCS when the number of sample points of the random input variables is large (10^7). When only 500 sample points are used, the errors are much larger. This means that the accuracy of DD-PCE improves as the number of sample points increases. The reason is simple: with more sample points, the statistical moments of the random input variables are estimated more accurately, which in turn increases the accuracy of UP. This observation agrees well with what has been reported in the work of Oladyshkin and Nowak. Similar to Case 3, the estimated
|Methods||MCS||DD-PCE (10^7)||DD-PCE (500)|
To study the convergence property of the enhanced DD-PCE method, the errors (
Overall, the three approaches produce comparably good results when the random inputs follow the Askey scheme. However, gPCE is the most mature and the most convenient to implement since there is no need to construct the orthogonal polynomials. When the PDFs of random inputs are known but do not follow the Askey scheme, large errors are induced by the transformation for gPCE, and the other two PCE methods are comparable in accuracy and implementation complexity. It should also be pointed out that for DD-PCE, when constructing the one-dimensional polynomials, the statistical moments (often of order 0–10) must be calculated first. If a large gap exists between the high-order and low-order moments, the matrix may become singular when solving the linear equations (Eq. (7)). Therefore, in this case, GS-PCE is preferable, especially when the function is highly nonlinear. When the PDF is unknown and cannot be obtained accurately, such as when random inputs exist as raw data with a complicated cumulative histogram, only the DD-PCE method can still perform well since it is data-driven rather than probability-distribution-driven, while large errors would be produced if GS-PCE and gPCE were employed. However, more effort should be made to solve the numerical problems in constructing the one-dimensional orthogonal polynomials, to make the DD-PCE method more robust and applicable.
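The numerical difficulty with widely separated moment magnitudes can be made concrete with a small sketch. It assumes that the linear system in question is built from a Hankel matrix of raw moments, and it uses exact moments of a unit-variance normal variable purely as a convenient test case; the helper names are hypothetical.

```python
import numpy as np


def normal_raw_moments(mean, count):
    """Raw moments E[X^k], k = 0..count-1, of N(mean, 1) via the recurrence
    E[X^k] = mean * E[X^(k-1)] + (k - 1) * E[X^(k-2)]."""
    mu = [1.0, float(mean)]
    for k in range(2, count):
        mu.append(mean * mu[k - 1] + (k - 1) * mu[k - 2])
    return np.array(mu[:count])


def moment_matrix_condition(mu, size):
    """Condition number of the size-by-size Hankel matrix of raw moments."""
    hankel = np.array([[mu[i + j] for j in range(size)] for i in range(size)])
    return np.linalg.cond(hankel)


size = 5
cond_centered = moment_matrix_condition(normal_raw_moments(0.0, 2 * size), size)
cond_shifted = moment_matrix_condition(normal_raw_moments(10.0, 2 * size), size)
print(f"centered: {cond_centered:.3e}, shifted by 10: {cond_shifted:.3e}")
```

Shifting the mean away from zero inflates the high-order raw moments relative to the low-order ones and worsens the conditioning markedly, which suggests that centering and scaling the data before computing moments is a sensible precaution.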
3. A sparse data-driven PCE method
The size of the truncated polynomial terms in the full PCE model is increased with the increase of the dimension of random inputs
Although the computational cost and accuracy are dependent on the PCE order, how to determine a suitable order that compromises between accuracy and efficiency is not within the scope of this chapter. In common situations, PCE of order
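To give a feel for how quickly the full model grows, the following sketch counts the terms of a full PCE under the usual total-degree truncation, for which the count is (d + p)!/(d! p!) with d random inputs and order p. The function name is hypothetical, and the total-degree truncation is an assumption about the scheme used here.

```python
from math import comb


def full_pce_terms(dimension, order):
    """Number of terms in a total-degree PCE of the given order."""
    return comb(dimension + order, order)


for d, p in [(2, 2), (5, 3), (10, 5)]:
    print(f"d = {d:2d}, p = {p}: {full_pce_terms(d, p)} terms")
```

For example, d = 10 and p = 5 already give 3003 terms, and a regression-based full PCE needs at least that many model evaluations.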
3.1. Procedure of data-driven PCE method
A step-by-step description of the proposed method is given in detail as below with a side-by-side flowchart in Figure 8.
Then one has all the standardized data as
Once the PCE coefficients are calculated, the predicted value by the candidate sparse PCE model at the sample point
To evaluate the accuracy more effectively, the relative error is employed based on
where denotes the empirical variance of the response sample points
If the accuracy
In this work, if the PDF of random input is known, a large number of sample points are generated as the database according to the PDF beforehand; if the PDF of random input is unknown, the raw data are considered as the database. Each sample point in the database has its own index. The initial sample points are selected from the database through randomly and uniformly generating their indices. Then these sample points will be removed from the database and the rest will be indexed again. Similarly, by randomly and uniformly generating the indices, the sequential sample points will be selected from the reduced database. By using this sampling strategy, the sample points are distributed uniformly as far as possible, which is helpful to improve the accuracy of the PCE coefficient calculation.
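The sampling strategy above can be sketched as drawing indices uniformly without replacement and shrinking the database after each stage. The function name `draw_from_database`, the database size, and the stage sizes are illustrative assumptions.

```python
import numpy as np


def draw_from_database(database, n_points, rng):
    """Select n_points rows uniformly at random and remove them.

    Returns (selected_points, reduced_database); the reduced database is
    implicitly re-indexed from zero.
    """
    idx = rng.choice(len(database), size=n_points, replace=False)
    keep = np.ones(len(database), dtype=bool)
    keep[idx] = False
    return database[idx], database[keep]


rng = np.random.default_rng(1)
database = rng.normal(size=(1000, 3))  # stand-in for generated samples or raw data

initial, remaining = draw_from_database(database, 20, rng)  # initial sample points
extra, remaining = draw_from_database(remaining, 10, rng)   # sequential sample points
```

Because selected rows are removed before the next stage, no sample point can be chosen twice.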
3.2. Comparative study
In this section, the proposed sparse DD-PCE method (shortened as sDD-PCE hereafter) is applied to three mathematical examples to calculate the mean and variance of the output responses. The full DD-PCE (shortened as fDD-PCE hereafter) method that adopts a full PCE structure and one-stage sampling with the size of one times the number of PCE coefficients is also applied to UP, of which the results are compared to those of sDD-PCE to demonstrate its effectiveness and advantage.
The test examples of varying dimensions including their input information are shown in Table 20, in which the symbols and respectively, denote normal, uniform, and exponential distribution. To fully explore the applicability of sDD-PCE, three different cases of the random input information that almost cover all the situations in practice (Case 1: raw data; Case 2: common distribution; Case 3: nontrivial distribution) are considered. The nontrivial bimodal distribution (denoted as ) used in Section 2.3 (Eq. (16)) is considered.
Another type of nontrivial distribution considered here is invented by conducting square operation on the sample points from some common distributions (see Case 3 in Function 2). The target accuracy
The results are listed in Tables 21–23, in which
From the results some noteworthy observations are made. First, generally with high PCE order (
In Case 2, the PDFs of all the random inputs are known and assumed to follow common distributions. This is a general case that can be solved by the traditional probabilistic distribution-based PCE methods. The results are shown in Tables 24–26. Generally with high PCE order (
In Case 3, the PDFs of all the random inputs are known; however, some of them follow nontrivial distributions. In this case, the traditional gPCE method cannot work well since large errors would be induced in transforming such nontrivial distributions to certain ones in the Askey scheme. The results are shown in Tables 27–29, which exhibit great agreements to what has been observed in Case 1 and Case 2. The proposed sDD-PCE method can significantly reduce the number of sample points while with high accuracy. The higher the dimension, the more advantageous the adaptive sparse structure of sDD-PCE can be. In this case, only 11 polynomial terms are selected from 3003 total terms for
To verify the conjecture that, for low-dimensional problems with low-order PCE models, fDD-PCE may produce more accurate results than sDD-PCE since it retains more information, another test is conducted for Function 1 with lower order
|Case 1||Case 2||Case 3|
The developed sDD-PCE can reduce the number of polynomial terms in the PCE model, thus reducing the computational cost. Generally, the larger the random input dimension, the more obvious the advantage of the developed sDD-PCE over fDD-PCE in efficiency. The sDD-PCE method is much more efficient than fDD-PCE in solving high-dimensional problems, especially those requiring a high order PCE model.
4. Sparse DD-PCE-based robust optimization using trust region
In Section 3, to reduce the computational cost of DD-PCE, a sparse DD-PCE method was developed by removing insignificant polynomial terms from the full PCE model, thus decreasing the number of samples required for regression in computing the PCE coefficients. However, when the sparse DD-PCE is applied to robust optimization, it is conventionally a triple-loop process (see Figure 9): the inner loop identifies the insignificant polynomial terms of the PCE model (the dashed box); the middle loop is UP; and the outer loop is the search for the optimum. This process is clearly still very time-consuming for problems with expensive simulation models.
As has been mentioned in Section 3, during each optimization iteration, although the sample points required for regression during UP of sDD-PCE are greatly reduced, certain additional number of sample points are required to identify the insignificant polynomial terms by the inner loop. If at some iteration design points, almost the same sparse polynomial terms are retained, the inner loop can clearly be avoided, thus saving the computational cost. To address this issue, the trust region technique widely used in nonlinear optimization is extended in this section. During optimizing, a trust region is dynamically defined. If the updated design point lies in the current trust region, it is considered that the insignificant terms of its PCE model remain unchanged compared to those of the last design point, i.e., the inner loop is eliminated at the updated design point. Meanwhile, to further save the computational cost, the sample points lying in the overlapping area of two adjacent sampling regions are reused for the PCE coefficient regression for the updated design point. The proposed robust optimization procedure employing sparse DD-PCE in conjunction with the trust region scenario is applied to several examples of robust optimization, of which the results are compared to those obtained by the robust optimization without the trust region method, to demonstrate its effectiveness and advantage.
4.1. The trust region scenario
The trust region method is a traditional approach that has been widely used in nonlinear numerical optimization. The basic idea of the trust region method is that, within the trust region of the current iteration design point, a second-order Taylor expansion is used to approximate the original objective function. If the accuracy of the current second-order Taylor expansion is satisfactory, the size of the trust region is increased to speed up convergence; if not, it is reduced to improve the accuracy of the approximation. To reduce the computational cost of design optimization, the idea of the trust region technique has been extended and applied to reliability-based wing design optimization, multifidelity wing aero-structural optimization, and multifidelity surrogate-based wing optimization, and it is widely regarded as an efficient strategy in design optimization. For example, when the trust region technique is applied to metamodel-based design optimization, the sample points are sequentially generated in the trust region during optimization, and the radius of the trust region is dynamically adjusted based on the accuracy of the metamodel in the local region.
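The grow-or-shrink logic of a classical trust region step can be sketched as follows. The ratio test and the thresholds 0.25 and 0.75 follow common textbook practice and are assumptions here; they are not the specific rule used later in this chapter, and the function name is hypothetical.

```python
def update_trust_radius(radius, actual_reduction, predicted_reduction,
                        shrink=0.5, expand=2.0, low=0.25, high=0.75,
                        max_radius=10.0):
    """Adjust the trust region radius from the model's predictive quality.

    rho compares the true decrease of the objective with the decrease
    predicted by the local (e.g., second-order Taylor) model.
    """
    if predicted_reduction <= 0.0:
        # The model predicts no improvement: do not trust it.
        return radius * shrink
    rho = actual_reduction / predicted_reduction
    if rho < low:
        return radius * shrink                    # poor agreement: shrink
    if rho > high:
        return min(radius * expand, max_radius)   # good agreement: expand
    return radius                                 # acceptable: keep the radius


print(update_trust_radius(1.0, actual_reduction=0.9, predicted_reduction=1.0))
print(update_trust_radius(1.0, actual_reduction=0.1, predicted_reduction=1.0))
```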
4.2. Robust design using sparse data-driven PCE and trust region
The trust region scenario is extended here to reduce the computational cost of sDD-PCE-based robust optimization. The basic idea is that the radius of a trust region is determined by the distance between two successive design points and the variation of the corresponding objective function values. If the updated design point lies in the current trust region, it is considered that the insignificant terms of its PCE model remain unchanged compared to those of the last design point, i.e., the inner loop is eliminated at the updated design point. Meanwhile, the sample points lying in the overlapping area of two adjacent sampling regions are reused for the PCE coefficient regression at the updated design point to further save computational cost. Generally, for a practical engineering optimization problem, only one performance function is computationally expensive. Therefore, only one PCE model needs to be constructed, and the UP for the rest of the functions can be conveniently implemented by MCS. In this study, it is assumed that the PCE model is constructed only for the objective function, and the general steps of the proposed method are as follows.
where and |
The above procedure continues until the convergence criterion is satisfied. Figure 10 shows a case in which sample points from the previous optimization iteration are reused in the next one. As can be seen, two points are located in the overlapping area of two successive sampling regions and are thus reused in the next iteration for regression to identify the significant polynomials and calculate the PCE coefficients. In this way, the computational cost can be further reduced.
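The reuse of sample points in the overlapping area can be sketched as a simple box-membership test. The sketch assumes axis-aligned sampling regions centered at the design point with a common half-width; the function name, coordinates, and response values are made up for illustration.

```python
import numpy as np


def reusable_samples(old_x, old_y, new_center, half_width):
    """Return previous samples (and their responses) that fall inside the
    sampling box of the updated design point."""
    new_center = np.asarray(new_center, dtype=float)
    lower = new_center - half_width
    upper = new_center + half_width
    inside = np.all((old_x >= lower) & (old_x <= upper), axis=1)
    return old_x[inside], old_y[inside]


# Samples drawn around the previous design point (0, 0) with half-width 0.2.
old_x = np.array([[0.15, 0.00], [-0.10, 0.10], [0.18, -0.05], [0.00, 0.00]])
old_y = np.array([1.2, 0.7, 1.5, 0.3])  # stored responses, no re-evaluation needed

reuse_x, reuse_y = reusable_samples(old_x, old_y, new_center=[0.3, 0.1],
                                    half_width=0.2)
print(len(reuse_x), "points reused")
```

Only the points that are not reused have to be evaluated with the expensive model at the updated design point.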
4.3. Comparative studies
The first example is the Ackley Function:
The robust design optimization of this example is:
All the design variables are considered to follow uniform distribution with variation of ±0.2 around their mean values and
The results are shown in Table 31, from which it is found that compared to the robust optimization without the trust region scenario (denoted by without), the obtained performance results (
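For readers who wish to reproduce the flavor of this example, the sketch below evaluates the mean and standard deviation of the Ackley function under uniform ±0.2 variation by plain MCS. It assumes the standard textbook form of the Ackley function and a two-dimensional design, either of which may differ from the variant used in this chapter; the function names and sample size are hypothetical.

```python
import numpy as np


def ackley(x):
    """Standard Ackley function, evaluated row-wise (assumed textbook form)."""
    x = np.atleast_2d(x)
    term1 = -20.0 * np.exp(-0.2 * np.sqrt(np.mean(x**2, axis=1)))
    term2 = -np.exp(np.mean(np.cos(2.0 * np.pi * x), axis=1))
    return term1 + term2 + 20.0 + np.e


def robust_stats(mean_design, half_width=0.2, n_samples=20000, seed=0):
    """MCS estimate of the response mean and standard deviation when each
    design variable varies uniformly within +/- half_width of its mean."""
    rng = np.random.default_rng(seed)
    mean_design = np.asarray(mean_design, dtype=float)
    noise = rng.uniform(-half_width, half_width,
                        size=(n_samples, mean_design.size))
    y = ackley(mean_design + noise)
    return y.mean(), y.std(ddof=1)


mu_f, sigma_f = robust_stats([0.0, 0.0])
print(f"mean = {mu_f:.4f}, std = {sigma_f:.4f}")
```

A robust objective is typically formed from these two statistics, and a PCE surrogate would replace the 20000 direct function evaluations used here.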
The second example is the robust design optimization of an automobile torque arm, shown in Figure 11.
In this problem, the four geometrical parameters (
where the objective function
The distribution parameters of the four design variables and design parameters are shown in Table 32.
|Random variables||Distribution||Lower bound||Upper bound|
|Deterministic||2.1 × 10^10 N/mm^2|
The corresponding robust design optimization model is formulated as
As mentioned above, the PCE model is constructed only for the objective function, and the results are shown in Table 33. It is noticed that the robust optimization designs with and without the trust region scenario yield comparable results, while the number of function calls (objective function calls) required by the design with trust region is evidently smaller. The deterministic design cannot even obtain a feasible optimal solution, with both constraints violated (>0), since it does not consider uncertainties during design. These results further demonstrate the effectiveness and advantage of the proposed method.
|DD||[8.13, 55.00, 55.00, 110.00]||2.6616e4||1.2171e3||1||82|
|with||[8.53, 54.10, 58.67, 111.03]||3.1027e4||1.3355e3||1.1315||−0.0123||−1.1833e5||658|
|without||[8.57, 52.68, 57.50, 110.00]||3.0332e4||1.3093e3||1.1077||−4.0000e−4||−1.2913e2||1283|
The use of the trust region in sDD-PCE-based robust optimization can evidently reduce the computational cost. However, the determination of the trust region in this chapter is still very subjective, and a more rigorous method should be explored. In this section as well as in Section 3, the sparse PCE and trust region scenarios are applied only to DD-PCE to save computational cost. However, the methods proposed here are also applicable to other PCE approaches, such as gPCE and GS-PCE.
In this chapter, the latest advances in PCE theory and approach for probabilistic UP are comprehensively presented in detail. However, it does not limit the application of PCE to nonprobabilistic UP to address epistemic uncertainties. Sudret and Schöbi have proposed a two-level metamodeling approach using nonintrusive sparse PCE to surrogate the exact computational model to facilitate the uncertainty quantification analysis, in which the input variables are modeled by probability-boxes (