Open access peer-reviewed chapter

Numerical Gradient Computation for Simultaneous Detection of Geometry and Spatial Random Fields in a Statistical Framework

Written By

Michael Conrad Koch, Kazunori Fujisawa and Akira Murakami

Submitted: 12 September 2022 Reviewed: 29 September 2022 Published: 01 November 2022

DOI: 10.5772/intechopen.108363

From the Edited Volume

Inverse Problems - Recent Advances and Applications

Edited by Ivan I. Kyrchei


Abstract

The target of this chapter is the evaluation of gradients in inverse problems where spatial field parameters and geometry parameters are treated separately. Such an approach can be beneficial especially when the geometry needs to be detected accurately using L2-norm-based regularization. Emphasis is laid upon the computation of the gradients directly from the governing equations. Working in a statistical framework, the Karhunen-Loève (K-L) expansion is used for discretization of the spatial random field and inversion is done using the gradient-based Hamiltonian Monte Carlo (HMC) algorithm. The HMC gradients involve sensitivities w.r.t the random spatial field and geometry parameters. Building on a method developed by the authors, a procedure is developed which considers the gradients of the associated integral eigenvalue problem (IEVP) as well as the interaction between the gradients w.r.t random spatial field parameters and the gradients w.r.t the geometry parameters. The same mesh and linear shape functions are used in the finite element method employed to solve the forward problem, the artificial elastic deformation problem and the IEVP. Analysis of the rate of convergence using seven different meshes of increasing density indicates a linear rate of convergence of the gradients of the log posterior.

Keywords

  • sensitivity analysis
  • geometry detection
  • random fields
  • Hamiltonian Monte Carlo
  • inverse problems

1. Introduction

Accurate computation of gradients, w.r.t. parameters of interest, is a key aspect of deterministic algorithms like Gauss-Newton, Levenberg-Marquardt, and Occam's inversion [1], as well as statistical algorithms like Hamiltonian Monte Carlo (HMC) [2]. Common nonintrusive methods like finite differences compute the gradient by taking differences between the response at the current model and at a perturbed model; such methods suffer from certain drawbacks. Two types of errors stand out in particular: numerical error involved with truncation of the Taylor series and round-off error involved with the finite precision arithmetic of computers [3]. For more robust analysis, this chapter focuses on methods where the gradient is computed directly from the analytical/numerical model by enlisting the sensitivity equations.
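The truncation/round-off trade-off can be seen in a few lines of NumPy; the function and step sizes below are illustrative and not taken from the chapter:

```python
import numpy as np

# Forward-difference gradient of f(x) = exp(x) at x = 1 (exact value: e).
# The total error first falls with the step size h (truncation error of
# the Taylor series, O(h)) and then rises again as h -> 0 (round-off
# error of finite-precision arithmetic).
def forward_diff(f, x, h):
    return (f(x + h) - f(x)) / h

x0 = 1.0
exact = np.exp(x0)
errors = {h: abs(forward_diff(np.exp, x0, h) - exact)
          for h in (1e-1, 1e-8, 1e-15)}
# h = 1e-1: truncation-dominated; h = 1e-15: round-off-dominated;
# h = 1e-8 sits near the sweet spot between the two regimes.
```

Neither error source appears in the sensitivity-equation approach pursued below, which differentiates the model itself.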

The type of solutions that can be obtained from inverse problems is guided by the regularization term. When the accurate detection of the geometry of embedded objects (or detection of discontinuities) is of interest, L1-norm-based difference priors have proven to be useful [4, 5]. However, special techniques need to be used to accommodate the nondifferentiability of the L1-norm [1]. Hence, to make use of a large library of adjoint-based inversion solvers (which use gradients w.r.t. the parameters), L2-norm-based priors, which readily allow for differentiation, are more popular in practice. However, L2-norm regularization, or Gaussian priors in a stochastic sense, only admits smooth solutions. Hence, in order to still be able to capture discontinuities, this chapter explicitly parameterizes the shape of the boundary (or the geometry of the domain). This approach thereby considers two sets of parameters, one related to the spatial field and the second to the geometry. It is of course only applicable when the unknown geometry can be parameterized explicitly.

Gradients have to be computed w.r.t. both spatial and geometry parameters. While sensitivity analysis of spatial parameters is usually straightforward, computation of the gradients w.r.t. geometry parameters [6, 7] must be done more carefully, considering aspects like mesh distortion. This chapter builds on the method presented in [8] and briefly details the simultaneous spatial field and geometry update within the HMC statistical framework. Similar to [8], the Karhunen-Loève (K-L) expansion is used for discretization of the random field, with the difference that the complete theoretical basis is considered. The complete integral eigenvalue problem (IEVP) is solved and the associated gradients are computed. The interaction between the geometry and spatial parameters due to the domain of definition of the IEVP is also detailed. It should be noted that the procedure detailed above is applicable to both the direct differentiation and adjoint methods [9] of sensitivity analysis.

The entire numerical study is done with a focus on aspects related to gradients and not with the aim of solving the inverse problem. Nevertheless, a forward model is still required for the computation of gradients. The chapter begins with a description of the forward model, the observation equation, and the discretization of the spatial random field in Section 2. This is followed by a brief description in Section 3 of the HMC-based methods developed in [8, 10] for simultaneous spatial field and geometry detection. Section 3.3 introduces the new gradients obtained when the complete IEVP is considered. It is shown how the gradients of the eigenvectors involve the computation of a Moore-Penrose pseudoinverse. Finally, the gradient computation procedure is validated and a convergence study is done in Section 4.


2. Inversion preliminaries

2.1 Governing and observation equations

Consider a linear steady seepage flow problem defined on a domain $z \in \Omega \subset \mathbb{R}^d$, $d \in \{2, 3\}$, where $\mathbf{k}(z)$ is a symmetric, spatially varying hydraulic conductivity matrix, $h(z)$ is the hydraulic head, and $Q(z)$ is a source term, as shown in Eq. (1) below

$$\nabla \cdot \big(\mathbf{k}(z)\,\nabla h(z)\big) = Q(z). \tag{1}$$

Standard Dirichlet ($h = \bar{h}$ on $\Gamma_D$) and Neumann ($f_n = \mathbf{f} \cdot \mathbf{n} = -\mathbf{k}\nabla h \cdot \mathbf{n}$ on $\Gamma_N$) boundary conditions are applied. Following a spatial discretization of the weak form of the PDE using the finite element method, the governing equation can be written as:

$$\mathbf{K}(\boldsymbol{\theta})\,\mathbf{h} = \mathbf{q}, \tag{2}$$

where $\mathbf{K}$ is the global hydraulic conductivity matrix and $\mathbf{h}$ and $\mathbf{q}$ are the nodal hydraulic head and flux vectors, respectively. The parameter vector $\boldsymbol{\theta} = [\boldsymbol{\theta}_1^T \; \boldsymbol{\theta}_2^T]^T \in \mathbb{R}^K$ in Eq. (2) includes all the unknowns of the inverse problem. In this study, the unknowns are divided into two sets [8]: $\boldsymbol{\theta}_1 \in \mathbb{R}^{K_1}$, related to the spatial discretization of a random field, and $\boldsymbol{\theta}_2 \in \mathbb{R}^{K_2}$, related to the definition of the geometry of the domain $\Omega$. Also, consider an observation model relating the state vector $\mathbf{m} = [\mathbf{h}^T \; \mathbf{q}^T]^T$ to the discrete observations $\mathbf{y}$ through a map $\mathbf{H}$ that is independent of $\boldsymbol{\theta}$, i.e.

$$\mathbf{y} = \mathbf{H}\,\mathbf{m}(\boldsymbol{\theta}) + \mathbf{r}. \tag{3}$$

The error in Eq. (3) is modeled as Gaussian, $\mathbf{r} \sim \mathcal{N}(\mathbf{0}, \mathbf{R})$, with a known covariance matrix $\mathbf{R}$. Parameter estimation is done in a probabilistic sense using Bayesian inference. Starting with a Gaussian prior distribution $p(\boldsymbol{\theta}) = \mathcal{N}(\boldsymbol{\theta}_0, \boldsymbol{\Sigma}_\theta)$ and a likelihood distribution $p(\mathbf{y}\,|\,\boldsymbol{\theta}) = \mathcal{N}(\mathbf{H}\mathbf{m}(\boldsymbol{\theta}), \mathbf{R})$, the posterior distribution is written as:

$$p(\boldsymbol{\theta}\,|\,\mathbf{y}) \propto p(\mathbf{y}\,|\,\boldsymbol{\theta})\,p(\boldsymbol{\theta}). \tag{4}$$

Except for linear Gaussian observation models, the posterior cannot be computed analytically and is usually evaluated using MCMC sampling algorithms.

2.2 Karhunen-Loève (K-L) expansion

The parameter vector $\boldsymbol{\theta}_1$ defined in Section 2.1 is associated with a continuous hydraulic conductivity spatial random field $k(z, \omega)$, where $z$ is defined on the domain $\Omega$ and $\omega$ belongs to the space of random events $\Theta$. Let the expected value of the random field be denoted as $\bar{k}: \Omega \to \mathbb{R}$ and the autocovariance function $C: \Omega \times \Omega \to \mathbb{R}$ be defined as $C(z, z') = \sigma(z)\,\sigma(z')\,\rho(z, z')$. Here $\sigma: \Omega \to \mathbb{R}$ is the standard deviation function and $\rho: \Omega \times \Omega \to [-1, 1]$ is the autocorrelation coefficient function. The study in this chapter is confined to Gaussian random fields, which are defined completely by their mean and autocovariance functions.

The Karhunen-Loève (K-L) expansion method is a series expansion method for the discretization of random fields which is based on the spectral decomposition of the autocovariance function. It can be shown that a random field can be written as an infinite sum [11]:

$$k(z, \omega) = \bar{k}(z) + \sum_{k=1}^{\infty} \sqrt{\lambda_k}\,\theta_{1k}(\omega)\,\phi_k(z), \tag{5}$$

where $\theta_{1k}: \Theta \to \mathbb{R}$ are standard uncorrelated random variables, $\lambda_k$ are the (always non-negative) eigenvalues, and $\phi_k(z)$ are the eigenfunctions of the linear operator associated with the covariance kernel $C$. They can be obtained by solving the homogeneous Fredholm integral eigenvalue problem (IEVP) on the domain $\Omega$:

$$\int_\Omega C(z, z')\,\phi_k(z')\,dz' = \lambda_k\,\phi_k(z). \tag{6}$$

The autocovariance function is symmetric, bounded, and positive semi-definite and has the spectral decomposition $C(z, z') = \sum_{k=1}^{\infty} \lambda_k\,\phi_k(z)\,\phi_k(z')$. The eigenfunctions are orthogonal and, in normalized form, satisfy the condition $\int_\Omega \phi_k(z)\,\phi_l(z)\,dz = \delta_{kl}$, where $\delta_{kl}$ is the Kronecker delta. In the case of Gaussian random fields, the random variables $\theta_{1k}$ are also independent and follow the standard normal distribution.

In practice, the eigenvalues decay exponentially fast for smooth autocovariance kernels and only algebraically for non-smooth ones, and the K-L expansion is usually truncated after $K_1$ terms. If the eigenvalues are arranged in descending order such that $\lambda_1 > \lambda_2 > \cdots > \lambda_{K_1}$, then, together with the associated eigenfunctions, the truncated K-L expansion approximation of the random field can be written as

$$\hat{k}(z, \omega) = \bar{k}(z) + \sum_{k=1}^{K_1} \sqrt{\lambda_k}\,\theta_{1k}(\omega)\,\phi_k(z). \tag{7}$$

The truncated K-L expansion approximation is optimal in the sense that, for a fixed number of terms $K_1$, the mean square error over the domain is minimized [12]. A global error measure related to random field discretization is the mean error variance $\bar{\varepsilon}_\sigma$, defined as [13]:

$$\bar{\varepsilon}_\sigma = \frac{1}{|\Omega|}\int_\Omega \frac{\mathrm{Var}\big[k(z, \omega) - \hat{k}(z, \omega)\big]}{\mathrm{Var}\big[k(z, \omega)\big]}\,dz. \tag{8}$$

It can be shown that the variance of the truncated K-L expansion $\hat{k}(z, \omega)$ is:

$$\mathrm{Var}\big[\hat{k}(z, \omega)\big] = \sum_{k=1}^{K_1} \lambda_k\,\phi_k^2(z). \tag{9}$$

Using the property $\mathrm{E}[\theta_{1k}\,\theta_{1l}] = \delta_{kl}$, the mean error variance can be calculated as [14]:

$$\bar{\varepsilon}_{\sigma,KL} = 1 - \frac{1}{|\Omega|\,\sigma^2}\sum_{k=1}^{K_1} \lambda_k. \tag{10}$$

The derivation of Eq. (10) assumes the random field to be homogeneous, i.e., $\sigma(z) = \sigma$. We only consider the case where the prior random field is homogeneous and Gaussian. This assumption is made for numerical convenience and does not limit the posterior, which can be non-Gaussian and non-homogeneous [15].
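As a concrete illustration of Eqs. (5)-(10), the sketch below samples a truncated K-L expansion of a 1-D homogeneous Gaussian field on [0, 1]. The IEVP is approximated here with a simple Nyström (collocation) rule rather than the Galerkin FEM of Section 2.3, and all numerical values ($n$, $l_c$, $K_1$) are illustrative assumptions:

```python
import numpy as np

# Nystrom approximation of the IEVP (Eq. (6)) for a squared-exponential
# kernel, followed by a truncated K-L realization (Eq. (7)) and the
# mean error variance of the truncation (Eq. (10)).
rng = np.random.default_rng(0)
n, sigma, lc, K1 = 200, 1.0, 0.4, 12

z = (np.arange(n) + 0.5) / n                     # midpoint quadrature nodes on [0, 1]
w = 1.0 / n                                      # uniform quadrature weight
C = sigma**2 * np.exp(-((z[:, None] - z[None, :]) / lc) ** 2)

lam, phi = np.linalg.eigh(C * w)                 # eigenpairs of the weighted kernel
lam, phi = lam[::-1], phi[:, ::-1] / np.sqrt(w)  # descending order, L2-normalized
lam = np.clip(lam, 0.0, None)                    # guard against round-off negatives

theta1 = rng.standard_normal(K1)                 # standard normal K-L coordinates
k_hat = phi[:, :K1] @ (np.sqrt(lam[:K1]) * theta1)   # zero-mean field realization

# Eq. (10) with |Omega| = 1 and sigma = 1: variance left out by truncation
eps_var = 1.0 - lam[:K1].sum() / (1.0 * sigma**2)
```

With a correlation length of this magnitude the eigenvalues decay rapidly, so the mean error variance left out by twelve terms is small, consistent with the optimality property stated above.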

2.3 Galerkin finite element method to solve the integral eigenvalue problem

The Galerkin finite element method (FEM) is used to solve the IEVP on $\Omega$. The eigenfunctions are approximated with the help of the shape functions $N_j: \Omega \to \mathbb{R}$ of the FE mesh and are represented as:

$$\phi_k(z) \approx \sum_{j=1}^{n} d_k^j\,N_j(z), \tag{11}$$

where the coefficients $d_k^j \in \mathbb{R}$ are unknown and $n$ is the number of nodes in the FE mesh. Substitution of Eq. (11) into Eq. (6) yields the residual:

$$r(z) = \sum_{j=1}^{n} d_k^j\left(\int_\Omega C(z, z')\,N_j(z')\,dz' - \lambda_k\,N_j(z)\right). \tag{12}$$

In the Galerkin method, the unknown coefficients are determined by making the residual rz orthogonal to the space spanned by the shape functions i.e.

$$\int_\Omega r(z)\,N_i(z)\,dz = 0, \quad i = 1, \ldots, n. \tag{13}$$

This results in a generalized eigenvalue problem

$$\mathbf{B}\,\mathbf{d}_k = \lambda_k\,\mathbf{M}\,\mathbf{d}_k, \tag{14}$$

where

$$B_{ij} = \int_\Omega N_i(z)\int_\Omega C(z, z')\,N_j(z')\,dz'\,dz \quad \text{and} \quad M_{ij} = \int_\Omega N_i(z)\,N_j(z)\,dz. \tag{15}$$

Both $\mathbf{B}$ and $\mathbf{M}$ are $n \times n$ matrices that involve integrals over the domain $\Omega$; hence the actual geometry of the domain has to be considered in the integration. The maximum number of available eigenpairs is $n$, but in practice the K-L expansion can usually be truncated at $K_1 \ll n$ terms. As such, for computational efficiency, it is sufficient to compute only the first $K_1$ eigenpairs, which can be done with the Lanczos algorithm.
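A minimal sketch of this Galerkin discretization, reduced to 1-D with nodal quadrature so that $\mathbf{M}$ is diagonal; the chapter instead uses 2-D linear FE shape functions and a Lanczos solver (e.g. `scipy.sparse.linalg.eigsh`) for the first $K_1$ eigenpairs, and all values below are illustrative assumptions:

```python
import numpy as np

# Galerkin-type discretization of the IEVP (Eqs. (14)-(15)) in 1-D.
# Nodal (trapezoidal) quadrature lumps M to a diagonal matrix, so the
# generalized problem B d = lam M d reduces to a standard symmetric one.
n, sigma, lc = 101, 1.0, 0.4
z = np.linspace(0.0, 1.0, n)
h = z[1] - z[0]

wq = np.full(n, h)                    # nodal quadrature weights: int N_i dz
wq[[0, -1]] = h / 2.0
C = sigma**2 * np.exp(-((z[:, None] - z[None, :]) / lc) ** 2)

# B = W C W and M = W (W = diag(wq)), so with A = W^{1/2} C W^{1/2}
# the problem becomes A v = lam v, v = W^{1/2} d.
s = np.sqrt(wq)
A = s[:, None] * C * s[None, :]
lam, V = np.linalg.eigh(A)
lam, V = lam[::-1], V[:, ::-1]        # eigenvalues in descending order
d = V / s[:, None]                    # back-transform: d = W^{-1/2} v
```

The sum of all discrete eigenvalues equals $\int_\Omega C(z, z)\,dz = \sigma^2|\Omega|$, mirroring the variance bookkeeping behind Eq. (10).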


3. Simultaneous geometry and spatial field detection

3.1 Hamiltonian Monte Carlo

Consider a parameter space $\boldsymbol{\theta} \in \mathbb{R}^K$ augmented with equidimensional momentum variables $\mathbf{p} \in \mathbb{R}^K$ and a joint probability distribution with density $p(\boldsymbol{\theta}, \mathbf{p})$ defined over this augmented space. If the underlying distribution over the momentum variables is chosen to be a Gaussian, $p(\mathbf{p}) = \mathcal{N}(\mathbf{p}\,|\,\mathbf{0}, \mathbf{M})$, where $\mathbf{M}$ is user-specified and independent of $\boldsymbol{\theta}$, then a joint probability distribution can be defined as $p(\boldsymbol{\theta}, \mathbf{p}) = p(\mathbf{p})\,p(\boldsymbol{\theta}\,|\,\mathbf{y})$. The Hamiltonian $H: \mathbb{R}^K \times \mathbb{R}^K \to \mathbb{R}$ is then defined as:

$$H(\boldsymbol{\theta}, \mathbf{p}) = -\log p(\mathbf{p}) - \log p(\boldsymbol{\theta}\,|\,\mathbf{y}). \tag{16}$$

The second term on the right of Eq. (16) is generally called the potential energy $\varphi: \mathbb{R}^K \to \mathbb{R}$ and is given as $\varphi(\boldsymbol{\theta}) = -\log p(\boldsymbol{\theta}\,|\,\mathbf{y})$.

The introduction of the momentum variables allows for the generation of trajectories through conservative Hamiltonian dynamics [16], which are given as:

$$\begin{bmatrix} \dfrac{d\boldsymbol{\theta}}{dt} \\[4pt] \dfrac{d\mathbf{p}}{dt} \end{bmatrix} = \begin{bmatrix} \mathbf{0} & \mathbf{I} \\ -\mathbf{I} & \mathbf{0} \end{bmatrix} \begin{bmatrix} \dfrac{\partial H}{\partial \boldsymbol{\theta}} \\[4pt] \dfrac{\partial H}{\partial \mathbf{p}} \end{bmatrix} = \begin{bmatrix} \mathbf{M}^{-1}\mathbf{p} \\[4pt] -\dfrac{\partial \varphi}{\partial \boldsymbol{\theta}} \end{bmatrix}. \tag{17}$$

These dynamics are exactly reversible (provided the gradient $\partial\varphi/\partial\boldsymbol{\theta}$ is one-to-one) and preserve volume, as Eq. (17) is just a rotation transformation in $(\boldsymbol{\theta}, \mathbf{p})$ space. Except for simple problems, Eq. (17) cannot be solved analytically and is usually integrated with the leapfrog method, a second-order accurate numerical integrator given as:

$$\mathbf{p}\!\left(t + \tfrac{\epsilon}{2}\right) = \mathbf{p}(t) - \frac{\epsilon}{2}\,\frac{\partial \varphi}{\partial \boldsymbol{\theta}}\bigg|_{\boldsymbol{\theta}(t)}, \tag{18}$$
$$\boldsymbol{\theta}(t + \epsilon) = \boldsymbol{\theta}(t) + \epsilon\,\mathbf{M}^{-1}\mathbf{p}\!\left(t + \tfrac{\epsilon}{2}\right), \quad \text{and} \tag{19}$$
$$\mathbf{p}(t + \epsilon) = \mathbf{p}\!\left(t + \tfrac{\epsilon}{2}\right) - \frac{\epsilon}{2}\,\frac{\partial \varphi}{\partial \boldsymbol{\theta}}\bigg|_{\boldsymbol{\theta}(t+\epsilon)}. \tag{20}$$

Starting from a point $(\boldsymbol{\theta}^j, \mathbf{p}^j)$, these equations are applied repeatedly for $L$ steps, each with a step size $\epsilon$, to determine a transition to a new point $(\boldsymbol{\theta}^{j+1}, \mathbf{p}^{j+1})$, which lies on the same Hamiltonian level set as $(\boldsymbol{\theta}^j, \mathbf{p}^j)$. The deterministic part of Hamiltonian Monte Carlo (HMC) [2] is defined by Eqs. (18)-(20). The stochastic part of HMC comes from resampling $\mathbf{p} \sim \mathcal{N}(\mathbf{0}, \mathbf{M})$. The statistical efficiency of HMC stems from the fact that the gradient-guided transitions can propose new points that are far away from the starting point, thereby enabling efficient sampling of the posterior. This is in contrast to the random-walk transitions of conventional MCMC algorithms, which suffer from the curse of dimensionality [17]. As Eqs. (18)-(20) show, the computation of the gradient $\partial\varphi/\partial\boldsymbol{\theta}$ is critical to the success of HMC. Special attention must also be paid to maintaining the reversibility of the transitions to satisfy the detailed balance condition [18] of MCMC algorithms. This is detailed along with the gradient computation procedure in the following sections.
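The deterministic and stochastic parts combine into the following sketch of an HMC sampler for a toy 2-D Gaussian target; the target $\varphi$, the mass matrix ($\mathbf{M} = \mathbf{I}$), and all tuning constants are illustrative assumptions, whereas the chapter's $\varphi$ comes from the seepage forward model:

```python
import numpy as np

def leapfrog(theta, p, grad_phi, eps, L):
    """L leapfrog steps (Eqs. (18)-(20)) with identity mass matrix M."""
    p = p - 0.5 * eps * grad_phi(theta)                  # initial half step, Eq. (18)
    for i in range(L):
        theta = theta + eps * p                          # position step, Eq. (19)
        p = p - (eps if i < L - 1 else 0.5 * eps) * grad_phi(theta)  # Eq. (20)
    return theta, p

# Toy target: phi(theta) = 0.5 theta^T A theta, i.e. a zero-mean Gaussian
# posterior with covariance A^{-1} = diag(1, 0.25).
A = np.diag([1.0, 4.0])
phi = lambda th: 0.5 * th @ A @ th
grad_phi = lambda th: A @ th

rng = np.random.default_rng(1)
theta = np.array([2.0, 1.0])
samples = []
for _ in range(2000):
    p = rng.standard_normal(2)                           # stochastic part: resample p
    th_prop, p_prop = leapfrog(theta, p, grad_phi, eps=0.25, L=8)
    dH = (0.5 * p_prop @ p_prop + phi(th_prop)) - (0.5 * p @ p + phi(theta))
    if np.log(rng.random()) < -dH:                       # Metropolis correction
        theta = th_prop
    samples.append(theta)
samples = np.array(samples)
```

The Metropolis step accepts with probability $\min(1, e^{-\Delta H})$; for exact integration $\Delta H$ would vanish and every proposal would be accepted, so the correction only compensates for leapfrog discretization error.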

3.2 Parameter update using the mesh moving method

The leapfrog equations determine an update in $(\boldsymbol{\theta}, \mathbf{p})$ space. In particular, the update $\boldsymbol{\theta}(t) \to \boldsymbol{\theta}(t+\epsilon)$ in Eq. (19) defines not only a new realization of the random field but also a new domain, i.e., $\Omega(\boldsymbol{\theta}_2)$. Without loss of generality, consider a domain in 2D discretized such that $\mathbf{Z}(\boldsymbol{\theta}_2) \in \mathbb{R}^{n \times 2}$ is the nodal coordinate matrix of all the nodes of the finite element mesh. Let $\mathbf{Z}_v(\boldsymbol{\theta}_2)$ represent the subset of this matrix that includes only the node coordinates on the piping zone boundary. Koch et al. [10] show that the computation of the gradient $\partial\varphi/\partial\boldsymbol{\theta}$ by analytical methods [6] ultimately involves the computation of the gradient of the nodal coordinates, $\partial\mathbf{Z}/\partial\boldsymbol{\theta}$.

Computation of the nodal coordinate gradient requires the definition of a differentiable map which is additionally reversible and one-to-one, to satisfy the detailed balance condition of MCMC. One such map, proposed in [10], is to update from an arbitrarily fixed reference domain $\Omega^{ref}(\boldsymbol{\theta}_2^{ref})$, defined by an arbitrary parameter $\boldsymbol{\theta}_2^{ref}$. Let $\mathbf{Z}^{ref}(\boldsymbol{\theta}_2^{ref}) \in \mathbb{R}^{n \times 2}$ be the nodal coordinate matrix on the discretized domain $\Omega^{ref}$ and $\mathbf{Z}_v^{ref}(\boldsymbol{\theta}_2^{ref}) \in \mathbb{R}^{n_v \times 2}$ represent the subset of this matrix that includes only the coordinates of the $n_v$ nodes on the piping zone boundary. The nodal coordinates $\mathbf{Z}_v(\boldsymbol{\theta}_2)$ and $\mathbf{Z}_v^{ref}(\boldsymbol{\theta}_2^{ref})$ can be determined explicitly if $\boldsymbol{\theta}_2$ and $\boldsymbol{\theta}_2^{ref}$, respectively, are known. Following the update $\boldsymbol{\theta}_2(t) \to \boldsymbol{\theta}_2(t+\epsilon)$, $\mathbf{Z}_v(\boldsymbol{\theta}_2(t+\epsilon))$ is available, and an artificial elastic deformation problem can be set up from the arbitrary known reference domain $\Omega^{ref}(\boldsymbol{\theta}_2^{ref})$ to the current domain $\Omega(\boldsymbol{\theta}_2(t+\epsilon))$. The prescribed displacements $\mathbf{u}_v^{ref} \in \mathbb{R}^{n_v \times 2}$ for the elastic deformation problem are given as:

$$\mathbf{u}_v^{ref} = \mathbf{Z}_v\big(\boldsymbol{\theta}_2(t+\epsilon)\big) - \mathbf{Z}_v^{ref}\big(\boldsymbol{\theta}_2^{ref}\big). \tag{21}$$

The entire mesh is moved and the new nodal coordinates can be determined as:

$$\mathbf{Z}\big(\boldsymbol{\theta}_2(t+\epsilon)\big) = \mathbf{Z}^{ref}\big(\boldsymbol{\theta}_2^{ref}\big) + \mathbf{u}^{ref}, \tag{22}$$

where $\mathbf{u}^{ref} \in \mathbb{R}^{n \times 2}$ represents the displacement of all the nodes from the reference domain to the current domain.

The displacements in the elastic deformation step can cause distortions in the mesh, especially in regions where large deformation is expected, i.e., near the piping zone boundary. To maintain a good mesh quality for computation purposes, a mesh moving method [19] is used. The simple idea is to scale the elastic modulus $E_e^{ref}$ of each element (in the reference domain) with the determinant of the Jacobian $J_e^{ref}$ in the elastic deformation step:

$$\bar{E}_e^{ref} = E_e^{ref}\left(\frac{1}{J_e^{ref}}\right)^{\chi^{ref}}, \tag{23}$$

where $\chi^{ref}$ is an arbitrary positive scaling parameter. The Poisson's ratio of the reference domain, $\nu^{ref}$, is also chosen arbitrarily. The performance of the algorithm has been shown [20] to be invariant to the choice of these reference parameters. The net effect of such scaling is that small elements become rigid and larger elements become more flexible. Hence, if the reference domain mesh is constructed carefully, such that small elements are placed in regions where large distortion is expected, i.e., near the piping zone, and larger elements are placed in regions of less expected distortion, the method is expected to yield a good mesh quality in the elastic deformation stage. The resulting map is differentiable and helps determine $\partial\mathbf{Z}/\partial\boldsymbol{\theta}$.

3.3 Gradient computation

Analytical methods for the computation of the gradient are intrusive and involve gradients of the steady seepage flow forward solver. From the definition $\varphi(\boldsymbol{\theta}) = -\log p(\boldsymbol{\theta}\,|\,\mathbf{y})$, it is apparent that the computation of $\partial\varphi/\partial\boldsymbol{\theta}$ requires the computation of $\partial\mathbf{m}/\partial\boldsymbol{\theta} = [\partial\mathbf{h}/\partial\boldsymbol{\theta} \;\; \partial\mathbf{q}/\partial\boldsymbol{\theta}]$. These terms can be computed by taking the derivative of Eq. (2):

$$\frac{\partial \mathbf{K}}{\partial \boldsymbol{\theta}}\,\mathbf{h} + \mathbf{K}\,\frac{\partial \mathbf{h}}{\partial \boldsymbol{\theta}} = \frac{\partial \mathbf{q}}{\partial \boldsymbol{\theta}}. \tag{24}$$

Given standard boundary conditions, this equation can be solved to obtain $\partial\mathbf{h}/\partial\boldsymbol{\theta}$ and $\partial\mathbf{q}/\partial\boldsymbol{\theta}$, provided $\partial\mathbf{K}/\partial\boldsymbol{\theta}$ is known.
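A small self-contained check of this direct-differentiation step; the 2x2 matrices below are hypothetical stand-ins, not the assembled seepage matrices:

```python
import numpy as np

# Eq. (24) for one scalar parameter: K (dh/dtheta) = dq/dtheta - (dK/dtheta) h.
K = np.array([[2.0, -1.0], [-1.0, 2.0]])   # hypothetical system matrix
dK = np.array([[1.0, 0.0], [0.0, 0.5]])    # hypothetical dK/dtheta
q = np.array([1.0, 0.0])
dq = np.zeros(2)                           # source independent of theta here

h = np.linalg.solve(K, q)                  # forward solve, Eq. (2)
dh = np.linalg.solve(K, dq - dK @ h)       # sensitivity solve, Eq. (24)

# Cross-check against a central finite difference on K(theta) = K + theta*dK
eps = 1e-6
h_p = np.linalg.solve(K + eps * dK, q)
h_m = np.linalg.solve(K - eps * dK, q)
dh_fd = (h_p - h_m) / (2 * eps)
```

Note that the sensitivity solve reuses the already factorized $\mathbf{K}$, which is what makes the analytical approach cheaper and more accurate than perturbing the model.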

Following the standard procedure in FEM, the hydraulic conductivity matrix at the element level in the current domain is given as:

$$\mathbf{K}_e = \int_{\hat{\Omega}_e} \mathbf{G}^T\,\hat{k}(\boldsymbol{\theta}, z)\,\mathbf{G}\,J_e\,d\boldsymbol{\xi}, \tag{25}$$

where $\hat{k}$ represents the hydraulic conductivity spatial field obtained from the truncated K-L expansion, $\mathbf{G}$ contains the derivatives (w.r.t. $z$) of the shape functions $N_j$ described in Section 2.3, $J_e$ is the determinant of the Jacobian matrix associated with the isoparametric transformation $(\xi_1, \xi_2) \to (z_1, z_2)$, and $\hat{\Omega}_e$ is the region occupied by the parent element of the isoparametric transformation. The gradient of $\mathbf{K}_e$ with respect to the spatial parameters $\boldsymbol{\theta}_1$ can be computed as:

$$\frac{\partial \mathbf{K}_e}{\partial \boldsymbol{\theta}_1} = \int_{\hat{\Omega}_e} \mathbf{G}^T\,\frac{\partial \hat{k}}{\partial \boldsymbol{\theta}_1}\,\mathbf{G}\,J_e\,d\boldsymbol{\xi}, \tag{26}$$

where the gradient of the hydraulic conductivity field w.r.t. $\boldsymbol{\theta}_1$ is easily obtained by differentiating Eq. (7) and is given as:

$$\frac{\partial \hat{k}}{\partial \theta_{1j}} = \sqrt{\lambda_j}\,\phi_j. \tag{27}$$

The gradient of $\mathbf{K}_e$ w.r.t. the geometry parameters $\boldsymbol{\theta}_2$ can be written as [6]:

$$\frac{\partial \mathbf{K}_e}{\partial \boldsymbol{\theta}_2} = \int_{\hat{\Omega}_e} \left(\frac{\partial \mathbf{G}^T}{\partial \boldsymbol{\theta}_2}\,\hat{k}\,\mathbf{G}\,J_e + \mathbf{G}^T\,\hat{k}\,\frac{\partial \mathbf{G}}{\partial \boldsymbol{\theta}_2}\,J_e + \mathbf{G}^T\,\hat{k}\,\mathbf{G}\,\frac{\partial J_e}{\partial \boldsymbol{\theta}_2} + \mathbf{G}^T\,\frac{\partial \hat{k}}{\partial \boldsymbol{\theta}_2}\,\mathbf{G}\,J_e\right) d\boldsymbol{\xi}. \tag{28}$$

Formulas for the calculation of the gradients $\partial\mathbf{G}/\partial\boldsymbol{\theta}_2$ and $\partial J_e/\partial\boldsymbol{\theta}_2$ can readily be found in the literature [6]. It is clear from Eq. (6) that the computation of the eigenvalues and eigenfunctions depends on the definition of the domain $\Omega(\boldsymbol{\theta}_2)$. Hence, the gradient $\partial\hat{k}(\boldsymbol{\theta}, z)/\partial\boldsymbol{\theta}_2$ involves the gradients of the eigenvalues and eigenvectors and is written as:

$$\frac{\partial \hat{k}}{\partial \boldsymbol{\theta}_2} = \sum_{j=1}^{K_1} \left(\frac{1}{2\sqrt{\lambda_j}}\,\frac{\partial \lambda_j}{\partial \boldsymbol{\theta}_2}\,\phi_j + \sqrt{\lambda_j}\,\frac{\partial \phi_j}{\partial \boldsymbol{\theta}_2}\right)\theta_{1j}. \tag{29}$$

The gradients of the eigenvalues and eigenvectors of the generalized eigenvalue problem in Eq. (14) w.r.t. $\boldsymbol{\theta}_2$ are written as [21]:

$$\frac{\partial \lambda_j}{\partial \boldsymbol{\theta}_2} = \mathbf{d}_j^T\left(\frac{\partial \mathbf{B}}{\partial \boldsymbol{\theta}_2} - \lambda_j\,\frac{\partial \mathbf{M}}{\partial \boldsymbol{\theta}_2}\right)\mathbf{d}_j, \tag{30}$$
$$\frac{\partial \mathbf{d}_j}{\partial \boldsymbol{\theta}_2} = \big(\lambda_j\mathbf{M} - \mathbf{B}\big)^{+}\left(\frac{\partial \mathbf{B}}{\partial \boldsymbol{\theta}_2} - \lambda_j\,\frac{\partial \mathbf{M}}{\partial \boldsymbol{\theta}_2}\right)\mathbf{d}_j - \frac{1}{2}\left(\mathbf{d}_j^T\,\frac{\partial \mathbf{M}}{\partial \boldsymbol{\theta}_2}\,\mathbf{d}_j\right)\mathbf{d}_j. \tag{31}$$

As $\mathbf{d}_j$ lies in the null space of $(\lambda_j\mathbf{M} - \mathbf{B})$, the inverse $(\lambda_j\mathbf{M} - \mathbf{B})^{-1}$ cannot be computed and a generalized inverse, the Moore-Penrose pseudoinverse $(\cdot)^{+}$, is employed. In this study, it is calculated by first carrying out an SVD of the matrix $(\lambda_j\mathbf{M} - \mathbf{B})$, then eliminating the smallest singular values below a tolerance level, and finally inverting the remaining factors of the SVD.
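The two formulas and the SVD-based pseudoinverse can be checked against finite differences on small symmetric matrices. Everything below is a hypothetical stand-in for the $\mathbf{B}$, $\mathbf{M}$ of Eq. (15), and the check is run at $\mathbf{M} = \mathbf{I}$ for simplicity:

```python
import numpy as np

def pinv_svd(A, rel_tol=1e-10):
    """Moore-Penrose pseudoinverse via SVD with a singular-value cutoff."""
    U, s, Vt = np.linalg.svd(A)
    s_inv = np.where(s > rel_tol * s.max(), 1.0 / s, 0.0)  # drop null-space modes
    return Vt.T @ (s_inv[:, None] * U.T)

def top_eigpair(B, M):
    """Largest M-normalized eigenpair of B d = lam M d (B, M symmetric)."""
    L = np.linalg.cholesky(M)
    Li = np.linalg.inv(L)
    lam, V = np.linalg.eigh(Li @ B @ Li.T)
    d = Li.T @ V[:, -1]
    return lam[-1], d / np.sqrt(d @ M @ d)

def eigpair_derivs(B, M, dB, dM):
    """Derivatives of the largest eigenpair w.r.t. a scalar parameter."""
    lam, d = top_eigpair(B, M)
    A = dB - lam * dM
    dlam = d @ A @ d                                            # Eq. (30)
    dd = pinv_svd(lam * M - B) @ A @ d - 0.5 * (d @ dM @ d) * d  # Eq. (31)
    return lam, d, dlam, dd

# Hypothetical symmetric matrices and their parameter derivatives
B0 = np.array([[3.0, 1.0, 0.0], [1.0, 2.0, 0.5], [0.0, 0.5, 1.0]])
dB = np.array([[0.5, 0.1, 0.0], [0.1, 0.2, 0.1], [0.0, 0.1, 0.3]])
dM = np.array([[0.2, 0.1, 0.0], [0.1, 0.3, 0.1], [0.0, 0.1, 0.1]])
lam, d, dlam, dd = eigpair_derivs(B0, np.eye(3), dB, dM)
```

The cutoff in `pinv_svd` is what removes the singular direction spanned by $\mathbf{d}_j$ itself, exactly as described in the text.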

Considering that the same mesh used for the solution of the steady seepage flow problem is also used for the discretization of the random field, the gradients of the matrices $\mathbf{B}$ and $\mathbf{M}$ at the element level in isoparametric space are given as:

$$\frac{\partial B_{ij}}{\partial \boldsymbol{\theta}_2} = \int_{\hat{\Omega}_e} N_i \int_{\hat{\Omega}_e} C(z, z')\,N_j\left(\frac{\partial J_e(z)}{\partial \boldsymbol{\theta}_2}\,J_e(z') + J_e(z)\,\frac{\partial J_e(z')}{\partial \boldsymbol{\theta}_2}\right)d\boldsymbol{\xi}'\,d\boldsymbol{\xi} + \int_{\hat{\Omega}_e} N_i \int_{\hat{\Omega}_e} \frac{\partial C(z, z')}{\partial \boldsymbol{\theta}_2}\,N_j\,J_e(z)\,J_e(z')\,d\boldsymbol{\xi}'\,d\boldsymbol{\xi}, \tag{32}$$
$$\frac{\partial M_{ij}}{\partial \boldsymbol{\theta}_2} = \int_{\hat{\Omega}_e} N_i\,N_j\,\frac{\partial J_e(z)}{\partial \boldsymbol{\theta}_2}\,d\boldsymbol{\xi}. \tag{33}$$

In this study, the squared exponential autocorrelation coefficient function has been used, i.e.,

$$\rho(z, z') = \exp\left(-\frac{\lVert z - z' \rVert^2}{l_c^2}\right), \tag{34}$$

where $l_c$ is the correlation length of the random field. Assuming $\sigma(z) = \sigma(z') = \sigma$, the gradient of the autocovariance function $C(z, z')$ w.r.t. $\boldsymbol{\theta}_2$ is given as:

$$\frac{\partial C(z, z')}{\partial \boldsymbol{\theta}_2} = -\frac{2}{l_c^2}\left(\frac{\partial z}{\partial \boldsymbol{\theta}_2} - \frac{\partial z'}{\partial \boldsymbol{\theta}_2}\right)^T\big(z - z'\big)\,C(z, z'). \tag{35}$$
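A quick finite-difference check of this chain-rule expression, with hypothetical node velocities $\partial z/\partial t$, $\partial z'/\partial t$ standing in for the geometry sensitivities $\partial z/\partial\boldsymbol{\theta}_2$:

```python
import numpy as np

sigma, lc = 1.0, 0.4   # illustrative field parameters

def C(z, zp):
    """Squared-exponential autocovariance, Eq. (34) times sigma^2."""
    return sigma**2 * np.exp(-np.sum((z - zp) ** 2) / lc**2)

def dC(z, zp, dz, dzp):
    """Chain rule: dC/dt = -(2/lc^2) (dz/dt - dz'/dt)^T (z - z') C(z, z')."""
    return -(2.0 / lc**2) * (dz - dzp) @ (z - zp) * C(z, zp)

# Hypothetical 2-D node positions and their velocities w.r.t. a parameter t
z0, zp0 = np.array([0.10, 0.02]), np.array([0.13, 0.04])
dz, dzp = np.array([0.5, -0.2]), np.array([0.1, 0.3])

analytic = dC(z0, zp0, dz, dzp)
eps = 1e-7
fd = (C(z0 + eps * dz, zp0 + eps * dzp)
      - C(z0 - eps * dz, zp0 - eps * dzp)) / (2 * eps)
```

In the full method this derivative is evaluated at the quadrature points of each element pair when assembling Eq. (32).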

This completes the definition of all the terms required to compute the HMC gradient $\partial\varphi/\partial\boldsymbol{\theta}$.

4. Numerical implementation and results

4.1 Observation data

A seepage zone containing a piping region of length $l$ and width $w$, shown in Figure 1, is chosen to generate synthetic observations for inversion. A steady seepage flow problem is solved on the domain defined by $l = 0.15$ m and $w = 0.05$ m. The left and right boundaries marked in blue are Dirichlet boundaries, and the top and bottom boundaries are no-flow Neumann boundaries. Linear shape functions are used in the FE solution of the forward problem. Ten sets of observations (hydraulic head and total outward normal flux) are made by increasing the hydraulic head at the left boundary, as shown by point A in Figure 2. The hydraulic head at the right boundary is fixed at 0. The corresponding hydraulic heads recorded at observation points B, C, D, and E are shown in Figure 2. The total outward normal flux from the right boundary is summed and also shown on the right in Figure 2. Observation data is generated assuming a constant hydraulic conductivity field, i.e., $k_{true}(z) = 0.001$ m/s. The standard deviations of the observation noise for the hydraulic head data and the outward normal flux data are taken to be 1 and 5%, respectively.

Figure 1.

Discretized seepage domain containing piping zone, used to obtain observation data, i.e., hydraulic head h data at points B, C, D and E, and total outward normal flux data q from the boundary marked in blue on the right. The hydraulic head is constant on the left boundary. The number of nodes in the discretization is 3943.

Figure 2.

Observation data generated from numerical seepage flow experiment. Data at each time t is obtained by solving a new seepage flow experiment with an input hydraulic head at the left boundary represented by A.

4.2 Inversion setup

Considering a log-normal hydraulic conductivity field $k(z)$, inversion is carried out on the Gaussian field $\tilde{k}(z) = \log\big(k(z) - \breve{k}\big)$, where $\breve{k}$ is a lower bound taken as $10^{-5}$ m/s. The log-normal construction enables a convenient choice of the standard deviation of the random field, which is chosen as $\sigma(z) = 1$. The correlation length of the random field is assumed to be known and is taken to be equal to the length of the largest dimension of the domain, i.e., $l_c = 0.4$ m.

A preliminary assessment of the eigenvalues obtained by solving the generalized eigenvalue problem in Eq. (14) reveals that the K-L expansion can be truncated at $K_1 = 12$ terms; over these terms, the eigenvalues decay by a factor of approximately $3 \times 10^4$. This fixes the number of terms in the spatial parameter vector $\boldsymbol{\theta}_1 = [\theta_1^1, \ldots, \theta_1^{12}]$. As mentioned in Section 3.2, once the geometry parameterization $\boldsymbol{\theta}_2 = [\theta_2^1, \theta_2^2] = [\theta^{13}, \theta^{14}]$ is known (see Figure 3(a)), an explicit function $\mathbf{Z}_v(\boldsymbol{\theta}_2)$ can be constructed as:

Figure 3.

(a) Parameterization of the piping zone boundary. Discretization of the reference domain defined by $(\theta_2^{1,ref}, \theta_2^{2,ref}) = (0.15\ \mathrm{m}, 0.05\ \mathrm{m})$ with (b) 307, (c) 1476, and (d) 3018 nodes. Red dots indicate the positions of observation points.

$$\mathbf{z}_v^{j,1} = \left(L_1 - \theta_2^1 + \frac{(j-1)\,\theta_2^1}{n_1 - 1},\;\; L_2 - \theta_2^2\right), \tag{36}$$
$$\mathbf{z}_v^{j,2} = \left(L_1 - \theta_2^1,\;\; L_2 - \theta_2^2 + \frac{(j-1)\,\theta_2^2}{n_2 - 1}\right), \tag{37}$$

where the superscripts 1 and 2 index the two segments of the piping zone boundary, and $n_1$ and $n_2$ are the numbers of boundary nodes on them.

Eqs. (36) and (37) can be readily differentiated to obtain $\partial\mathbf{Z}_v/\partial\boldsymbol{\theta}$, which can then be used to compute the other gradients $\partial\mathbf{Z}/\partial\boldsymbol{\theta}$, $\partial\varphi/\partial\boldsymbol{\theta}$, etc. This completes the definition of the parameter vector.

As all updates are designed to take place from an arbitrary reference domain, as described in Section 3.2, the study of the convergence properties requires a focus on the reference mesh. The reference domain is arbitrarily chosen as the true domain used to generate the observations. Seven reference configuration meshes with 128, 306, 497, 1038, 1476, 1860, and 3018 nodes are considered for the study of the convergence properties. Three of these meshes are shown in Figure 3. The artificial elastic properties selected for the mesh moving method are $E^{ref} = 25$ MPa, $\nu^{ref} = 0.25$, and $\chi^{ref} = 1$. To reduce mesh distortion during the mesh moving stage, all the reference meshes have a particular construction: smaller elements (which behave rigidly) are placed close to the piping zone boundary, and larger elements (which are more flexible) are placed further away.

4.3 Inversion and convergence analysis

There are no analytical solutions against which to verify the correctness of the gradient computation procedure of Section 3. Hence, for verification purposes, a short inversion analysis is done using HMC, in which 150 samples are drawn from the posterior. The number of leapfrog steps is variable and drawn from a Gaussian $L \sim \mathcal{N}(5, 2)$. The priors for the parameters are chosen as $\boldsymbol{\theta}_1 \sim \mathcal{N}(\mathbf{0}, \mathbf{I}_{12})$ and $\boldsymbol{\theta}_2 \sim \mathcal{N}(\mathbf{0}, 0.1\,\mathbf{I}_2)$. The results presented in Figure 4 correspond to inversion done with the reference mesh shown in Figure 3(b). The potential energy $\varphi$ decreases rapidly in the first 20 steps, after which HMC begins exploring the region of high posterior probability. Prior experience leads the authors to consider the performance of HMC to be appropriate, which is an indirect indicator that the gradient computation procedure is correct.

Figure 4.

Results from HMC showing (a) the decrease in potential energy as HMC progresses towards the high posterior probability region, and samples of (b) $\theta^{13}$ and (c) $\theta^{14}$. The dashed lines represent the true values of the parameters used to generate the observations.

In order to study the convergence properties, suitable error measures are required. The rate of convergence of the truncated K-L expansion of the prior random field, for a fixed number of parameters $K_1 = 12$, is determined through the relative mean error variance [14]:

$$\varepsilon_{Var,rel} = \frac{\left|\bar{\varepsilon}_{\sigma,KL} - \bar{\varepsilon}_{\sigma,analytic}\right|}{\bar{\varepsilon}_{\sigma,analytic}}. \tag{38}$$

The analytical mean error variance $\bar{\varepsilon}_{\sigma,analytic}$ cannot be computed in closed form for squared exponential autocorrelation coefficient functions (Eq. (34)) and is instead calculated numerically using the fine mesh containing 3943 nodes shown in Figure 1. The relative error is computed for the seven meshes mentioned in Section 4.2 and plotted on a log-log plot in Figure 5. A closer look at Eq. (10) indicates that the mean error variance depends only on the cumulative sum of the first $K_1$ eigenvalues. As the eigenvalues are arranged in descending order, the relative error of the largest eigenvalue $\lambda_1$ is also shown in Figure 5. The corresponding relative error measure can be written in a generalized manner as:

Figure 5.

Convergence of relative errors for different variables for a fixed number of terms K1=12 in the K-L expansion.

$$\varepsilon_{\lambda_1} = \frac{\left|\lambda_1 - \lambda_1^{analytic}\right|}{\lambda_1^{analytic}}. \tag{39}$$

Other error measures can be computed in a similar manner by replacing $\lambda_1$ with the relevant variable. Finally, the relative error of the gradient of the largest eigenvalue w.r.t. the geometry parameters, $\partial\lambda_1/\partial\boldsymbol{\theta}_2$, is also shown in Figure 5. To compute the error related to the gradient, one mesh moving step is carried out, in which the mesh is moved from the reference domain $(\theta_2^{1,ref}, \theta_2^{2,ref}) = (0.15\ \mathrm{m}, 0.05\ \mathrm{m})$ to an arbitrary domain defined by $(\theta_2^1, \theta_2^2) = (0.16\ \mathrm{m}, 0.04\ \mathrm{m})$. The same linear shape functions are used for the discretization of the random field, the mesh moving method, and the solution of the forward problem, and the same realization of the parameter vector $\boldsymbol{\theta}_1$ is used to generate the random field in all cases. A least squares fit of the relative error plots with a first-order polynomial reveals linear trends with slopes between −2.5 and −5. Linear rates of convergence for $\varepsilon_{Var,rel}$ using linear FEM are also observed by Betz et al. [14].
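The slope extraction used for Figures 5 and 6 amounts to a least squares fit of a first-order polynomial on log-log data; the sketch below recovers the exponent from synthetic errors decaying as $n^{-2.7}$ (the values are not the chapter's data):

```python
import numpy as np

# Fit a straight line to (log10 n, log10 err); its slope is the observed
# convergence rate. Synthetic data: err = 50 * n^-2.7 over the seven
# mesh sizes used in the chapter.
n_nodes = np.array([128, 306, 497, 1038, 1476, 1860, 3018], dtype=float)
err = 50.0 * n_nodes ** -2.7

slope, intercept = np.polyfit(np.log10(n_nodes), np.log10(err), 1)
# slope recovers the decay exponent (-2.7 for this synthetic data)
```

On real (noisy) error data the fitted slope is only an estimate of the rate, which is why the chapter reports a range of slopes rather than a single value.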

Finally, the rate of convergence of the gradient of the potential energy function w.r.t. the spatial parameter related to the largest eigenvalue, $\theta_1^1$, and the two geometry parameters, $\theta^{13}$ and $\theta^{14}$, is plotted in Figure 6. The convergence plots of the different gradients are almost parallel to each other. Once again, the slope of each plot is determined by a least squares fit with a first-order polynomial. Each plot shows a linear trend with a slope of approximately −2.7.

Figure 6.

Convergence of relative errors of potential energy gradients for a fixed number of terms K1=12 in the K-L expansion.


5. Conclusions

This chapter details the numerical procedure for the computation of gradients in probabilistic inverse problems involving the simultaneous estimation of spatial fields and geometry. The method is analytical (in the sense of [6]), intrusive, and involves the computation of gradients of the forward problem. Emphasis is laid upon the calculation of the gradients of the eigenvalues and eigenvectors involved in the truncated K-L expansion used for discretization of the random field. The eigenvalues and eigenvectors are obtained by solving a generalized eigenvalue problem on a defined domain. This implies that as the geometry parameters are updated, the domain is updated, and the generalized eigenvalues and eigenvectors change. Computation of the gradient of the eigenvectors w.r.t. the geometry parameters involves the computation of a generalized inverse.

The gradients are validated through an inverse analysis using HMC. The potential energy decreases rapidly as the Markov chain related to the geometry parameters approaches the region of high posterior probability, indicating the correctness of the computed gradients. Overall, the rate of convergence of various quantities is observed to be linear on a log-log plot, which means that computational cost grows rapidly when higher accuracy is demanded. While the same mesh that is used to solve the forward problem can also be used for the Galerkin FE discretization of the IEVP, the repeated need to solve the forward problem, the mesh moving problem, and the generalized eigenvalue problem at every step can be computationally prohibitive. Eliminating any one, two, or even all three of these numerical problems can significantly improve the computational feasibility of simultaneous spatial field and geometry detection using gradient-based stochastic samplers like HMC.


Acknowledgments

This work was supported by JSPS KAKENHI Grant Number JP21H02304.


Conflict of interest

The authors declare no conflict of interest.

References

  1. Aster RC, Borchers B, Thurber CH. Parameter Estimation and Inverse Problems. Oxford: Academic Press; 2013. DOI: 10.1016/C2009-0-61134-X
  2. Neal R. MCMC using Hamiltonian dynamics. In: Handbook of Markov Chain Monte Carlo. New York: CRC Press; 2011. DOI: 10.1201/b10905-6
  3. Voorhees A, Millwater H, Bagley R. Complex variable methods for shape sensitivity of finite element models. Finite Elements in Analysis and Design. 2011;47:1146-1156. DOI: 10.1016/J.FINEL.2011.05.003
  4. Lee J, Kitanidis PK. Bayesian inversion with total variation prior for discrete geologic structure identification. Water Resources Research. 2013;49:7658-7669. DOI: 10.1002/2012WR013431
  5. Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena. 1992;60:259-268. DOI: 10.1016/0167-2789(92)90242-F
  6. Christensen PW, Klarbring A. Two-dimensional shape optimization. In: An Introduction to Structural Optimization. Netherlands: Springer; 2008. DOI: 10.1007/978-1-4020-8666-3_7
  7. Haslinger J, Mäkinen RA. Introduction to Shape Optimization: Theory, Approximation, and Computation. Philadelphia: SIAM; 2003
  8. Koch MC, Osugi M, Fujisawa K, Murakami A. Hamiltonian Monte Carlo for simultaneous interface and spatial field detection (HMCSISFD) and its application to a piping zone interface detection problem. International Journal for Numerical and Analytical Methods in Geomechanics. 2021;45:2602-2626. DOI: 10.1002/NAG.3279
  9. Koch MC, Fujisawa K, Murakami A. Adjoint Hamiltonian Monte Carlo algorithm for the estimation of elastic modulus through the inversion of elastic wave propagation data. International Journal for Numerical Methods in Engineering. 2020;121:1037-1067. DOI: 10.1002/nme.6256
  10. Koch MC, Fujisawa K, Murakami A. Novel parameter update for a gradient based MCMC method for solid-void interface detection through elastodynamic inversion. Probabilistic Engineering Mechanics. 2020;62:103097. DOI: 10.1016/j.probengmech.2020.103097
  11. Loève M. Probability Theory II. New York: Springer-Verlag; 1978
  12. Ghanem RG, Spanos PD. Stochastic Finite Elements: A Spectral Approach. New York: Springer; 1991. DOI: 10.1007/978-1-4612-3094-6
  13. Sudret B, Der Kiureghian A. Stochastic Finite Element Methods and Reliability: A State-of-the-Art Report. UCB/SEMM-2000/08. Berkeley: University of California; 2000
  14. Betz W, Papaioannou I, Straub D. Numerical methods for the discretization of random fields by means of the Karhunen–Loève expansion. Computer Methods in Applied Mechanics and Engineering. 2014;271:109-129. DOI: 10.1016/J.CMA.2013.12.010
  15. Marzouk YM, Najm HN. Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems. Journal of Computational Physics. 2009;228:1862-1902. DOI: 10.1016/j.jcp.2008.11.024
  16. Leimkuhler B, Reich S. Simulating Hamiltonian Dynamics. Cambridge: Cambridge University Press; 2005. DOI: 10.1017/CBO9780511614118
  17. Betancourt M. A conceptual introduction to Hamiltonian Monte Carlo. arXiv preprint. 2017
  18. Andrieu C, De Freitas N, Doucet A, Jordan MI. An introduction to MCMC for machine learning. Machine Learning. 2003:5-43. DOI: 10.1023/A:1020281327116
  19. Stein K, Tezduyar T, Benney R. Mesh moving techniques for fluid-structure interactions with large displacements. Journal of Applied Mechanics. 2003;70:58-63. DOI: 10.1115/1.1530635
  20. Koch MC, Osugi M, Fujisawa K, Murakami A. Investigation of reference parameter invariance and application of HMCSISFD for identification of an eroded seepage zone. In: Proceedings of the 13th International Conference on Structural Safety and Reliability, 2021-2022, Shanghai. 2022
  21. Magnus JR. On differentiating eigenvalues and eigenvectors. Econometric Theory. 1985;1:179-191. DOI: 10.1017/S0266466600011129
