Abstract
The target of this chapter is the evaluation of gradients in inverse problems where spatial field parameters and geometry parameters are treated separately. Such an approach can be beneficial especially when the geometry needs to be detected accurately using L2-norm-based regularization. Emphasis is laid upon the computation of the gradients directly from the governing equations. Working in a statistical framework, the Karhunen-Loève (K-L) expansion is used for discretization of the spatial random field and inversion is done using the gradient-based Hamiltonian Monte Carlo (HMC) algorithm. The HMC gradients involve sensitivities w.r.t the random spatial field and geometry parameters. Building on a method developed by the authors, a procedure is developed which considers the gradients of the associated integral eigenvalue problem (IEVP) as well as the interaction between the gradients w.r.t random spatial field parameters and the gradients w.r.t the geometry parameters. The same mesh and linear shape functions are used in the finite element method employed to solve the forward problem, the artificial elastic deformation problem and the IEVP. Analysis of the rate of convergence using seven different meshes of increasing density indicates a linear rate of convergence of the gradients of the log posterior.
Keywords
- sensitivity analysis
- geometry detection
- random fields
- Hamiltonian Monte Carlo
- inverse problems
1. Introduction
Accurate computation of gradients, w.r.t parameters of interest, is a key aspect of deterministic algorithms like Gauss-Newton, Levenberg–Marquardt, Occam’s inversion [1] as well as statistical algorithms like Hamiltonian Monte Carlo (HMC) [2]. Common nonintrusive methods like finite differences compute the gradient by taking differences between the response at the current model and at a perturbed model, such methods suffer from certain drawbacks. Two types of errors stand out in particular: numerical error involved with truncation of Taylor’s series and round-off error involved with finite precision arithmetic of computers [3]. For more robust analysis, this chapter focuses on methods where the gradient is computed directly from the analytical/numerical model by enlisting the sensitivity equations.
The type of solutions that can be obtained from inverse problems is guided by the regularization term. When the accurate detection of the geometry of embedded objects (or detection of discontinuities) is of interest,
Gradients have to be computed w.r.t both spatial as well geometry parameters. While sensitivity analysis of spatial parameters is usually straightforward, computation of the gradients w.r.t geometry parameters [6, 7] needs to be done more carefully and consider aspects like mesh distortion. This chapter develops on the method presented in [8] and briefly details the simultaneous spatial field and geometry update within the HMC statistical framework. Similar to [8], the Karhunen-Loève (K-L) expansion is used for discretization of the random field, with the difference that the complete theoretical basis is considered. The complete integral eigenvalue problem or IEVP is solved and the associated gradients are computed. The interaction between the geometry and spatial parameters due to the domain of definition of the IEVP is also detailed. It should be noted that the procedure detailed above is applicable to both the direct differentiation and adjoint methods [9] of sensitivity analysis.
The entire numerical study is done with a focus on aspects related to gradients and not with the aim to solve the inverse problem. Nevertheless, a forward model is still required for the computation of gradients. The chapter begins with a description of the forward model, observation equation and the discretization of the spatial random field in Section 2. This is followed by a brief description in Section 3 of the HMC-based methods developed in [8, 10] for simultaneous spatial field and geometry detection. Section 3.3 introduces the new gradients obtained when the complete IEVP is considered. It is shown how the gradients of the eigenvectors involve the computation of a Moore-Penrose pseudoinverse. Finally, the gradient computation procedure is validated and a convergence study is done in Section 4.
2. Inversion preliminaries
2.1 Governing and observation equations
Consider a linear steady seepage flow problem defined on a domain
Standard Dirichlet:
where
The error in Eq. (3) is modeled as Gaussian
Except for linear Gaussian observation models, the posterior cannot be computed analytically and is usually evaluated using MCMC sampling algorithms.
2.2 Karhunen-Loève (K-L) expansion
The parameter vector
The Karhunen-Loève (K-L) expansion method is a series expansion method for the discretization of random fields which is based on the spectral decomposition of the autocovariance function. It can be shown that a random field can be written as an infinite sum [11]:
where
The autocovariance function is symmetric, bounded, and positive semi-definite and has the spectral decomposition
In practice, the eigenvalues decay exponentially fast for smooth functions and algebraically fast for non-smooth autocovariance kernels and the K-L expansion is usually truncated after
The truncated K-L expansion approximation is optimal in the sense that, for a fixed number of terms
It can be shown that the variance of the truncated K-L expansion
Using the property
The derivation for Eq. (10) assumes the random field to be homogeneous, i.e.,
2.3 Galerkin finite element method to solve the integral eigenvalue problem
The Galerkin Finite Element Method (FEM) is used to solve the IEVP on
where the coefficients
In the Galerkin method, the unknown coefficients are determined by making the residual
This results in a generalized eigenvalue problem
where
Both
3. Simultaneous geometry and spatial field detection
3.1 Hamiltonian Monte Carlo
Consider a parameter space
The second term on the right of Eq. (17) is generally called
The introduction of the momentum variables allows for the generation of trajectories through conservative Hamiltonian dynamics [16], which are given as:
These dynamics are exactly reversible (provided the gradient
Starting from a point
3.2 Parameter update using the mesh moving method
The leapfrog equations determine an update in
Computation of the nodal coordinate vector gradient requires the definition of a differentiable map which is additionally reversible and one-to-one, to satisfy the detailed balance condition of MCMC. One such map proposed in [10], is to update from an arbitrarily fixed reference domain
The entire mesh is moved and the new nodal coordinates can be determined as:
where
The displacements in the elastic deformation step can cause distortions in the mesh, especially in regions where large deformation is expected, i.e., near the piping zone boundary. To maintain a good mesh quality for computation purposes, a mesh moving method [19] is used. The simple idea is to scale the elastic modulus
where
3.3 Gradient computation
Analytical methods for the computation of the gradient are intrusive and involve gradients of the steady seepage flow forward solver. From the definition of
Given standard boundary conditions, this equation can be solved to obtain
Following the standard procedure in FEM, the hydraulic conductivity matrix at the element level
where
where the gradient of the hydraulic conductivity field w.r.t
The gradient of
Formulas for the calculation of the gradients
The gradient of the eigenvalues and eigenvectors of the generalized eigenvalue problem in Eq. (14) w.r.t.
As
Considering the same mesh that is used for the solution of the steady seepage flow problem to be used for the discretization of the random field, the gradients of the matrices
In this study, the squared exponential autocorrelation coefficient function has been used, i.e.,
where
This completes the definition of all the terms required to compute the HMC gradient
4. Numerical implementation and results
4.1 Observation data
A seepage zone containing a piping region of length
4.2 Inversion setup
Considering a log-normal hydraulic conductivity field
A preliminary assessment of the eigenvalues obtained by solving the generalized eigenvalue problem in Eq. (14) reveals that the K-L expansion can be truncated at
Eqs. (37) and (38) can be readily differentiated to obtain
As all updates are designed to take place from an arbitrary reference domain as mentioned in Eq. (23), the study of the convergence properties requires a focus on the reference mesh. The reference domain is arbitrarily chosen as the true domain used to generate the observations. Seven different reference configuration meshes with 128, 306, 497, 1038, 1476, 1860, and 3018 nodes are considered for the study of the convergence properties. Three of these meshes are shown in Figure 3. The artificial elastic properties selected for the mesh moving method are
4.3 Inversion and convergence analysis
There are no analytical solutions to verify the correctness of the gradient computation procedure mentioned in Section 3. Hence, for verification purposes, a short inversion analysis is done using HMC, where
In order to study the convergence properties, suitable error measures are required. The rate of convergence of the truncated KL expansion of the prior random field, for a fixed number of parameters
The analytical mean error variance
Other error measures can be computed in a similar manner just by replacing
Finally, the rate of convergence of the gradient of the potential energy function w.r.t the spatial parameter related to the largest eigenvalue
5. Conclusions
This paper details the numerical procedure for the computation of gradients in probabilistic inverse problems involving the simultaneous estimation of spatial fields and geometry. The method is analytical (in the sense of [6]), intrusive and involves the computation of gradients of the forward problem. Emphasis is laid upon the calculation of gradient of eigenvalues and eigenvectors involved with the truncated K-L expansion method for discretization of the random field. The eigenvalues and eigenvectors are obtained by solving a generalized eigenvalue problem on a defined domain. This implies that as the geometry parameters are updated, the domain is updated and the generalized eigenvalues and eigenvectors change. Computation of the gradient of the eigenvectors w.r.t the geometry parameters involve the computation of a generalized inverse.
The gradients are validated through an inverse analysis using HMC. The potential energy decreases rapidly as the Markov chain related to the geometry parameters approaches the region of high posterior probability, indicating the correctness of the computed gradients. Overall the rate of convergence of various quantities is observed to be linear on a log–log plot. This means computation costs can grow exponentially to achieve better results. While the same mesh that is used to solve the forward problem can also be used for the Galerkin FE method-based discretization of the IEVP, the repeated need to solve the forward problem, the mesh moving method and the generalized eigenvalue problem at every step can be computationally prohibitive. Elimination of any one, two or even all three of these numerical problems can significantly improve the computational feasibility of simultaneous spatial field and geometry detection using gradient-based stochastic samplers like HMC.
References
- 1.
Aster RC, Borchers B, Thurber CH. Parameter Estimation and Inverse Problems. Oxford: Academic Press; 2013. DOI: 10.1016/C2009-0-61134-X - 2.
Neal R. MCMC Using Hamiltonian Dynamics. Handb. Markov Chain Monte Carlo. New York: CRC Press; 2011. DOI: 10.1201/b10905-6 - 3.
Voorhees A, Millwater H, Bagley R. Complex variable methods for shape sensitivity of finite element models. Finite Elements in Analysis and Design. 2011; 47 :1146-1156. DOI: 10.1016/J.FINEL.2011.05.003 - 4.
Lee J, Kitanidis PK. Bayesian inversion with total variation prior for discrete geologic structure identification. Water Resources Research. 2013; 49 :7658-7669. DOI: 10.1002/2012WR013431 - 5.
Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena. 1992; 60 :259-268. DOI: 10.1016/0167-2789(92)90242-F - 6.
Christensen PW, Klarbring A. Two-Dimensional Shape Optimization. An Introd. to Struct. Optim. Netherlands: Springer; 2008. DOI: 10.1007/978-1-4020-8666-3_7 - 7.
Haslinger J, Mäkinen RA. Introduction to Shape Optimization-Theory, Approximation, and Computation. Philadelphia: SIAM; 2003 - 8.
Koch MC, Osugi M, Fujisawa K, Murakami A. Hamiltonian Monte Carlo for simultaneous interface and spatial field detection (HMCSISFD) and its application to a piping zone interface detection problem. International Journal for Numerical and Analytical Methods in Geomechanics. 2021; 45 :2602-2626. DOI: 10.1002/NAG.3279 - 9.
Koch MC, Fujisawa K, Murakami A. Adjoint Hamiltonian Monte Carlo algorithm for the estimation of elastic modulus through the inversion of elastic wave propagation data. International Journal for Numerical Methods in Engineering. 2020; 121 :1037-1067. DOI: 10.1002/nme.6256 - 10.
Koch MC, Fujisawa K, Murakami A. Novel parameter update for a gradient based MCMC method for solid-void interface detection through elastodynamic inversion. Probabilistic Engineering Mechanics. 2020; 62 :103097. DOI: 10.1016/j.probengmech.2020.103097 - 11.
Loève M. Probability Theory II. New York: Springer-Verlag; 1978 - 12.
Ghanem RG, Spanos PD. Stochastic Finite Elements: A Spectral Approach. Springer New York: New York, NY; 1991. DOI: 10.1007/978-1-4612-3094-6 - 13.
Sudret B, Der Kiureghian A. Stochastic Finite Element Methods and Reliability: A State-of-the-Art Report. UCB/SEMM-2000/08. Berkeley: University of California; 2000 - 14.
Betz W, Papaioannou I, Straub D. Numerical methods for the discretization of random fields by means of the Karhunen–Loève expansion. Computer Methods in Applied Mechanics and Engineering. 2014; 271 :109-129. DOI: 10.1016/J.CMA.2013.12.010 - 15.
Marzouk YM, Najm HN. Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems. Journal of Computational Physics. 2009; 228 :1862-1902. DOI: 10.1016/j.jcp.2008.11.024 - 16.
Leimkuhler B, Reich S. Simulating Hamiltonian Dynamics. Cambridge: Cambridge University Press; 2005. DOI: 10.1017/CBO9780511614118 - 17.
Betancourt M. A conceptual introduction to hamiltonian Monte Carlo. ArXiv Preprint. 2017 - 18.
Andrieu C, De Freitas N, Doucet A, Jordan MI. An introduction to MCMC for machine learning. Machine Learning. 2003:5-43. DOI: 10.1023/A:1020281327116 - 19.
Stein K, Tezduyar T, Benney R. Mesh moving techniques for fluid-structure interactions with large displacements. Journal of Applied Mechanics. 2003; 70 :58-63. DOI: 10.1115/1.1530635 - 20.
Koch MC, Osugi M, Fujisawa K, Murakami A. Investigation of reference parameter invariance and application of HMCSISFD for identification of an eroded seepage zone. Proc. 13th Int. Conf. Struct. Saf. Reliab. 2021–2022, Shanghai. 2022 - 21.
Magnus JR. On Differentiating Eigenvalues and Eigenvectors. Econometric Theory. 1985; 1 :179-191. DOI: 10.1017/S0266466600011129