Numerical Gradient Computation for Simultaneous Detection of Geometry and Spatial Random Fields in a Statistical Framework

Michael Conrad Koch; Kazunori Fujisawa; Akira Murakami

doi:10.5772/intechopen.108363

Abstract

The target of this chapter is the evaluation of gradients in inverse problems where spatial field parameters and geometry parameters are treated separately. Such an approach can be beneficial especially when the geometry needs to be detected accurately using L2-norm-based regularization. Emphasis is laid upon the computation of the gradients directly from the governing equations. Working in a statistical framework, the Karhunen-Loève (K-L) expansion is used for discretization of the spatial random field and inversion is done using the gradient-based Hamiltonian Monte Carlo (HMC) algorithm. The HMC gradients involve sensitivities w.r.t the random spatial field and geometry parameters. Building on a method developed by the authors, a procedure is developed which considers the gradients of the associated integral eigenvalue problem (IEVP) as well as the interaction between the gradients w.r.t random spatial field parameters and the gradients w.r.t the geometry parameters. The same mesh and linear shape functions are used in the finite element method employed to solve the forward problem, the artificial elastic deformation problem and the IEVP. Analysis of the rate of convergence using seven different meshes of increasing density indicates a linear rate of convergence of the gradients of the log posterior.

Keywords

sensitivity analysis
geometry detection
random fields
Hamiltonian Monte Carlo
inverse problems

Author Information

Show +

Michael Conrad Koch*
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Kazunori Fujisawa
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan
Akira Murakami
- Graduate School of Agriculture, Kyoto University, Kyoto, Japan

*Address all correspondence to: koch.michaelconrad.5w@kyoto-u.ac.jp

1. Introduction

Accurate computation of gradients, w.r.t parameters of interest, is a key aspect of deterministic algorithms like Gauss-Newton, Levenberg–Marquardt, Occam’s inversion [1] as well as statistical algorithms like Hamiltonian Monte Carlo (HMC) [2]. Common nonintrusive methods like finite differences compute the gradient by taking differences between the response at the current model and at a perturbed model, such methods suffer from certain drawbacks. Two types of errors stand out in particular: numerical error involved with truncation of Taylor’s series and round-off error involved with finite precision arithmetic of computers [3]. For more robust analysis, this chapter focuses on methods where the gradient is computed directly from the analytical/numerical model by enlisting the sensitivity equations.

The type of solutions that can be obtained from inverse problems is guided by the regularization term. When the accurate detection of the geometry of embedded objects (or detection of discontinuities) is of interest, L1-norm-based difference priors have proven to be useful [4, 5]. However, special techniques need to be used to accommodate the nondifferentiability of the L1-norm [1]. Hence, to make use of a large library of adjoint-based inversion solvers (that use gradients w.r.t the parameters), L2-norm-based priors which readily allow for differentiation are more popular in practice. However, L2-norm regularization or Gaussian priors in a stochastic sense only admit smooth solutions. Hence, in order to still be able to capture discontinuities, this paper explicitly parameterizes the shape of the boundary (or the geometry of the domain). This approach thereby considers two sets of parameters, one related to the spatial field and the second with the geometry parameters. This approach is of course only applicable when the unknown geometry can be parameterized explicitly.

Gradients have to be computed w.r.t both spatial as well geometry parameters. While sensitivity analysis of spatial parameters is usually straightforward, computation of the gradients w.r.t geometry parameters [6, 7] needs to be done more carefully and consider aspects like mesh distortion. This chapter develops on the method presented in [8] and briefly details the simultaneous spatial field and geometry update within the HMC statistical framework. Similar to [8], the Karhunen-Loève (K-L) expansion is used for discretization of the random field, with the difference that the complete theoretical basis is considered. The complete integral eigenvalue problem or IEVP is solved and the associated gradients are computed. The interaction between the geometry and spatial parameters due to the domain of definition of the IEVP is also detailed. It should be noted that the procedure detailed above is applicable to both the direct differentiation and adjoint methods [9] of sensitivity analysis.

The entire numerical study is done with a focus on aspects related to gradients and not with the aim to solve the inverse problem. Nevertheless, a forward model is still required for the computation of gradients. The chapter begins with a description of the forward model, observation equation and the discretization of the spatial random field in Section 2. This is followed by a brief description in Section 3 of the HMC-based methods developed in [8, 10] for simultaneous spatial field and geometry detection. Section 3.3 introduces the new gradients obtained when the complete IEVP is considered. It is shown how the gradients of the eigenvectors involve the computation of a Moore-Penrose pseudoinverse. Finally, the gradient computation procedure is validated and a convergence study is done in Section 4.

2. Inversion preliminaries

2.1 Governing and observation equations

Consider a linear steady seepage flow problem defined on a domain z∈Ω⊂Rd, d∈23, where kz is a symmetric spatially varying hydraulic conductivity matrix, hz is the hydraulic head, and Qz is a source term as shown in Eq. (1) below

∇.−kz∇hz=Qz.E1

Standard Dirichlet: h=h¯ on ΓD, and Neumann: fn=f.n=−k∇h.n on ΓN boundary conditions are applied. Following a spatial discretization of the weak form of the PDE using the finite element method, the governing equation can be written as:

Kθh=q,E2

where K is the global hydraulic conductivity matrix and h and q are the nodal hydraulic head and flux vectors respectively. The parameter vector θ=θ1θ2∈RK in Eq. (2), includes all the unknowns related to the inverse problem. In this study, the unknowns are divided into two sets [8]: θ1∈RK1, related to the spatial discretization of a random field and θ2∈RK2, and related to the definition of the geometry of the domain Ω. Also, consider an observation model relating the state vector m=hq to the discrete observations y, through a map H that is independent of θ i.e.

y=Hmθ+r.E3

The error in Eq. (3) is modeled as Gaussian r∼N0R with a known covariance matrix R. Parameter estimation is done in a probabilistic sense using Bayesian inference. Starting with a Gaussian prior distribution pθ=Nθ0,Σθ and a likelihood distribution pyθ=NyHmθ,R, the posterior distribution is written as:

pθy∝pyθpθ.E4

Except for linear Gaussian observation models, the posterior cannot be computed analytically and is usually evaluated using MCMC sampling algorithms.

2.2 Karhunen-Loève (K-L) expansion

The parameter vector θ1 defined in Section 2.1 is associated with a continuous hydraulic conductivity spatial random field kzω, where z is defined on the domain Ω and ω belongs to the space of random events Θ. Let the expected value of the random field be denoted as k¯:Ω→R and the autocovariance function C:Ω×Ω→R be defined as Czz′=σzσz′ρzz′. Here σ:Ω→R is the standard deviation function and ρ:Ω×Ω→−11 is the autocorrelation coefficient function. The study in this chapter is confined to Gaussian random fields that can be defined completely by their mean and autocovariance functions.

The Karhunen-Loève (K-L) expansion method is a series expansion method for the discretization of random fields which is based on the spectral decomposition of the autocovariance function. It can be shown that a random field can be written as an infinite sum [11]:

kzω=k¯z+∑k=1∞λkθ1kϕkz,E5

where θ1k:Θ→R are standard uncorrelated random variables, λk are the eigenvalues (always non-negative) and ϕkz are the eigenfunctions of the linear operator related to the covariance kernel C. They can be obtained by solving the homogeneous Fredholm integral eigenvalue problem (IEVP) on the domain Ω:

∫ΩCzz′ϕkz′dz′=λkϕkz.E6

The autocovariance function is symmetric, bounded, and positive semi-definite and has the spectral decomposition Czz′=∑k=1∞λkϕkzϕkz′. The eigenfunctions are orthogonal, and in a normalized form satisfy the condition ∫Ωϕkzϕlzdz=δkl, where δkl is the Kronecker delta. In the case of Gaussian random fields, the random variables θ1k are also independent and follow the standard normal distribution.

In practice, the eigenvalues decay exponentially fast for smooth functions and algebraically fast for non-smooth autocovariance kernels and the K-L expansion is usually truncated after K1 terms. If the eigenvalues are arranged in descending order such that λ1>λ2>…>λK1, then accompanied by the associated eigenfunctions, the truncated K-L expansion approximation of the random field can be written as

k̂zω=k¯z+∑k=1K1λkθ1kϕkz,E7

The truncated K-L expansion approximation is optimal in the sense that, for a fixed number of terms K1, the mean square error over the domain is minimized [12]. A global error measure related to random field discretization is called the mean error variance ε¯σ and is defined as [13]:

ε¯σz=1Ω∫ΩVarkzω−k̂zωVarkzωdz.E8

It can be shown that the variance of the truncated K-L expansion k̂zω is:

Vark̂zω=∑k=1K1λkϕk2z.E9

Using the property Eθ1kθ1l=δkl, the mean error variance can be calculated as [14]:

ε¯σ,KL=1−1Ωσ2∑k=1K1λk.E10

The derivation for Eq. (10) assumes the random field to be homogeneous, i.e., σz=σ. We only consider the case where the prior random field is homogeneous and Gaussian. This assumption is for numerical convenience and does not limit the posterior, which can be non-Gaussian and non-homogeneous [15].

2.3 Galerkin finite element method to solve the integral eigenvalue problem

The Galerkin Finite Element Method (FEM) is used to solve the IEVP on Ω. The eigenfunctions are approximated with the help of the shape functions Nj:Ω→R of the FE mesh, and is represented as:

ϕkz≈∑j=1ndkjNjz,E11

where the coefficients dkj∈R are unknown and n is the number of nodes in the FE mesh. Substitution of Eq. (11) into Eq. (6), yields the residual:

rz=∑j=1ndkj∫ΩCzz′Njz′dz′−λjNjz,E12

In the Galerkin method, the unknown coefficients are determined by making the residual rz orthogonal to the space spanned by the shape functions i.e.

∫ΩrzNiz′dz′=0∀j=1,…,n.E13

This results in a generalized eigenvalue problem

Bdk=λkMdk,E14

where

Bij=∫ΩNiz∫ΩCzz′Njz′dz′dzand

Mij=∫ΩNizNjzdz.E15

Both B and M are n×n matrices that involve integrals over the domain Ω. Hence the actual geometry of the domain has to be considered for integration. The maximum number of available eigenpairs is n, but in practice, the K-L expansion can usually be truncated at K1 terms such that K1≪n. As such, for computational efficiency, it is sufficient to compute the first K1 eigenpairs only, which can be done through the Lanczos algorithm.

3. Simultaneous geometry and spatial field detection

3.1 Hamiltonian Monte Carlo

Consider a parameter space θ∈RK augmented with equidimensional momentum variables p∈RK and a joint probability distribution with density pθp defined over this augmented space. If the underlying distribution over the momentum variables is chosen to be a Gaussian: pp≡Np0,M, where M is user-specified and independent of θ, then a joint probability distribution can be defined as pθp=pppθy. The Hamiltonian H:RK×RK→R is then defined as:

Hθp=−logpp−logpθyE16

The second term on the right of Eq. (17) is generally called φ:RK→R and is given as φθ=−logpθy.

The introduction of the momentum variables allows for the generation of trajectories through conservative Hamiltonian dynamics [16], which are given as:

dθdtdpdt=01−10∂H∂θ∂H∂p=−∂φθ∂θM−1p.E17

These dynamics are exactly reversible (provided the gradient ∂φθ∂θ is one-to-one) and preserve volume as Eq. (18) is just a rotation transformation in θ−p space. Except for simple problems Eq. (18) cannot be solved analytically and is usually solved using the leapfrog method, which is a second-order accurate numerical integrator given as:

pt+ϵ2=pt−ϵ2∂φθt∂θ,E18

θt+ϵ=θt+ϵM−1pt+ϵ2andE19

pt+ϵ=pt+ϵ2−ϵ2∂φθt+ε∂θ.E20

Starting from a point θjpj, these equations are applied repeatedly for L steps, each with a step-size ϵ, to determine a transition to a new point θj+1pj+1, which lies on the same Hamiltonian level-set as θjpj. The deterministic part of Hamiltonian Monte Carlo (HMC) [2] is defined by Eqs. (19)–(21). The stochastic part of HMC comes from resampling p∼N0M. The statistical efficiency of Hamiltonian Monte Carlo stems from the fact that the gradient-guided transitions can propose new points that are “far-away” from the starting point, thereby enabling efficient sampling of the posterior. This is in contrast to the random nature of transitions, which suffer from the curse of dimensionality [17], in conventional MCMC algorithms. As shown in Eqs. (19)–(21), critical to the success of HMC, is the computation of the gradient ∂φθ∂θ. Special attention must be paid to maintaining the reversibility of the transitions to satisfy the detailed balance condition [18] for MCMC algorithms. This is detailed along with the gradient computation procedure in the following sections.

3.2 Parameter update using the mesh moving method

The leapfrog equations determine an update in θ−p space. In particular, the update from θt→θt+ϵ in Eq. (20), defines not only a new realization of the random field, but also a new domain, i.e., Ωθ2. Without loss of generality, consider a domain in 2D discretized such that Zθ2∈R2 is the nodal coordinate vector of all the nodes of the finite element mesh. Let Zvθ2∈R2 represent a subset of this vector that includes only the node coordinates at the piping zone boundary. Koch et al. [10] show that the computation of the gradient ∂φθ∂θ, by analytical methods [6], ultimately involves the computation of the gradient of the nodal coordinate vector ∂Zθ∂θ.

Computation of the nodal coordinate vector gradient requires the definition of a differentiable map which is additionally reversible and one-to-one, to satisfy the detailed balance condition of MCMC. One such map proposed in [10], is to update from an arbitrarily fixed reference domain Ωrefθ2ref, defined by an arbitrary parameter θ2ref. Let Zrefθ2ref∈Rn×2 be the nodal coordinate vector on the discretized domain Ωref and Zvrefθ2ref∈Rnv×2 represent a subset of this vector that includes only the coordinates of the nv nodes at the piping zone boundary. The nodal coordinates Zvθ2 and Zvrefθ2ref can be determined explicitly if θ2 and θ2ref are known respectively. Following the update, θ2t→θ2t+ε, Zvθ2t+ε is available, and an artificial elastic deformation problem can be set up from the arbitrary known reference domain Ωrefθ2ref to the current domain Ωθ2t+ε. The prescribed displacements uvref∈Rnv×2 for the elastic deformation problem are given as:

uvref=Zvθ2t+ε−Zvrefθ2ref.E21

The entire mesh is moved and the new nodal coordinates can be determined as:

Zθ2t+ε=Zrefθ2ref+uref,E22

where uref∈Rn×2 represents the displacement of all the nodes from the reference domain to the current domain.

The displacements in the elastic deformation step can cause distortions in the mesh, especially in regions where large deformation is expected, i.e., near the piping zone boundary. To maintain a good mesh quality for computation purposes, a mesh moving method [19] is used. The simple idea is to scale the elastic modulus Eeref of each element (in the reference domain) with the determinant of the Jacobian Jeref in the elastic deformation step:

E∼eref=Eeref1/Jerefχref,E23

where χref is an arbitrary positive scaling parameter. The Poisson’s ratio of the reference domain νref is also chosen arbitrarily. The performance of the algorithm has been shown [20] to be invariant to the choice of these reference parameters. The net effect of such scaling is that small elements become rigid and larger elements become more flexible. Hence, if the reference domain mesh is constructed carefully such that small elements are placed in regions where large distortion is expected, i.e., near the piping zone, and larger elements are placed in regions of less expected distortion, the method is expected to yield a good mesh quality in the elastic deformation stage. The map in Eq. (23) is differentiable and helps determine ∂Zθ∂θ.

3.3 Gradient computation

Analytical methods for the computation of the gradient are intrusive and involve gradients of the steady seepage flow forward solver. From the definition of φθ=−logpθy, it is apparent that the computation of ∂φθ∂θ requires the computation of ∂m∂θ=∂h∂θ∂q∂θ. These terms can be computed by taking the derivative of Eq. (2) and is given as:

∂K∂θh+K∂h∂θ=∂q∂θ.E24

Given standard boundary conditions, this equation can be solved to obtain ∂h∂θ and ∂q∂θ, provided ∂K∂θ is known.

Following the standard procedure in FEM, the hydraulic conductivity matrix at the element level Ωe in the current domain is given as:

Ke=∫ΩeGTk̂θzGJedξ,E25

where k̂ represents the hydraulic conductivity spatial field obtained from the truncated K-L expansion, G contains the derivatives (w.r.t z) of the shape functions Nj described earlier in Section 2.3, Je is the determinant of the Jacobian matrix associated with the isoparametric transformation ξ1ξ2→z1z2 and Ωe is the region occupied by the parent element related to the isoparametric transformation. The gradient of Ke with respect to the spatial parameters θ1 can be computed as:

∂Ke∂θ1=∫ΩeGT∂k̂∂θ1GJedξ,E26

where the gradient of the hydraulic conductivity field w.r.t θ1 is easily obtained by differentiating Eq. (7) and is given as:

∂k̂∂θj1=λjϕj,E27

The gradient of Ke w.r.t the geometry parameters θ2 can be written as [6]:

∂Ke∂θ2=∫Ωe∂GT∂θ2k̂GJe+GTk̂∂G∂θ2Je+GTk̂G∂Je∂θ2+GT∂k̂∂θ2GJedξ.E28

Formulas for the calculation of the gradients ∂G∂θ2 and ∂Je∂θ2 can readily be found in literature [6]. It is clear from Eq. (6) that the computation of the eigenvalues and eigenfunctions depends on the definition of the domain Ωθ2. Hence, the gradient ∂k̂θz∂θ2 will involve the gradients of the eigenvalues and eigenvectors and is written as:

∂k̂∂θ2=∑j=1K112λj∂λj∂θ2ϕj+λq∂ϕj∂θ2θ1j.E29

The gradient of the eigenvalues and eigenvectors of the generalized eigenvalue problem in Eq. (14) w.r.t. θ2 are written as [21]:

∂λj∂θ2=djT∂B∂θ2−λj∂M∂θ2dj,E30

∂dj∂θ2=λjM−B†∂B∂θ2−λj∂M∂θ2dj−12djT∂M∂θ2djdj.E31

As dj lies in the null space of λjM−B, the inverse λjM−B−1 cannot be computed and a generalized inverse called the Moore-Penrose pseudoinverse ∙† is employed. In this study, the Moore-Penrose pseudoinverse is calculated by first carrying out an SVD of the matrix λjM−B and then eliminating the smallest singular values below a tolerance level. This is then followed by taking a standard inverse of the SVD.

Considering the same mesh that is used for the solution of the steady seepage flow problem to be used for the discretization of the random field, the gradients of the matrices B and M at an elemental level in isoparametric space are given as:

∂Bij∂θ2=∫ΩeNi∫ΩeCzz′Nj∂Jez′∂θ2Jez′+Jez∂Jez′∂θ2dξ′dξ+∫ΩeNi∫Ωe∂Czz′∂θ2NjJezJez′dξ′dξ,E32

∂Mij∂θ2=∫ΩeNiNj∂Jez)∂θ2dξ.E33

In this study, the squared exponential autocorrelation coefficient function has been used, i.e.,

ρzz′=exp−z−z′2lc2,E34

where lc is the correlation length of the random field. Assuming σz=σz′=σ, the gradient of the autocovariance function Czz′ w.r.t θ2 is given as:

∂Czz′∂θ2=−2lc2∂z∂θ2−∂z∂θ2Tz−z′Czz′.E35

This completes the definition of all the terms required to compute the HMC gradient

∂φθ∂θ.

4. Numerical implementation and results

4.1 Observation data

A seepage zone containing a piping region of length l and width w, shown in Figure 1, is chosen to generate synthetic observations for inversion. A steady seepage flow problem is solved on the domain defined by l=0.15 m and w=0.05 m. The left and right boundaries marked in blue are Dirichlet boundaries and the top and bottom boundaries are no-flow Neumann boundaries. Linear shape functions are used in the FE solution of the forward problem. Ten sets of observations (hydraulic head and total outward normal flux) are made by increasing the hydraulic head at the left boundary as shown by point A in Figure 2. The hydraulic head at the right boundary is fixed at 0. The corresponding hydraulic head recorded at observation points B, C, D, and E are shown in Figure 2. The total outward normal flux from the right boundary is summed and also shown on the right in Figure 2. Observation data is generated assuming a constant hydraulic conductivity field, i.e., ktruez=0.001 m/s. The standard deviation of the observation noise for the hydraulic head data and outward normal flux data is taken to be 1 and 5% respectively.

Figure 1.
Discretized seepage domain containing piping zone, used to obtain observation data, i.e., hydraulic head h data at points B, C, D and E, and total outward normal flux data q from the boundary marked in blue on the right. The hydraulic head is constant on the left boundary. The number of nodes in the discretization is 3943.

Figure 2.
Observation data generated from numerical seepage flow experiment. Data at each time t is obtained by solving a new seepage flow experiment with an input hydraulic head at the left boundary represented by A.

4.2 Inversion setup

Considering a log-normal hydraulic conductivity field kz, inversion is carried out on the Gaussian field k∼z=logkz−k˘, where k˘ is a lower bound taken as 10−5 m/s. The log-normal construction enables a convenient choice of the standard deviation of the random field which is chosen as σz=1. The correlation length of the random field is assumed to be known and is taken to be equal to the length of the largest dimension of the domain i.e. lc=0.4m.

A preliminary assessment of the eigenvalues obtained by solving the generalized eigenvalue problem in Eq. (14) reveals that the K-L expansion can be truncated at K1=12 terms. The eigenvalues are found to decay by approximately 3×104 times. This enables the determination of the number of terms in the spatial parameter vector θ1=θ1…θ12. As mentioned in Section 3.2, once the geometry parameterization θ2=θ12θ22=θ13θ14 is known (see Figure 3(a)), an explicit function Zvθ2 can be constructed as:

Figure 3.
(a) Parameterizations of the piping zone boundary. Discretization of the reference domain defined by θ1ref2θ2ref2=0.15m0.05m with (b) 307 (c) 1476 and (d) 3018 nodes. Red dots indicate the position of observation points.

zvj1=L1−θ12+j−1θ12n1−1L2−θ22,E36

zvj2=L1−θ12L2−θ22+j−1θ22n2−1.E37

Eqs. (37) and (38) can be readily differentiated to obtain ∂Zv∂θ, which can be used to compute other gradients ∂Z∂θ, ∂φ∂θ etc. This completes the definition of the parameter vector.

As all updates are designed to take place from an arbitrary reference domain as mentioned in Eq. (23), the study of the convergence properties requires a focus on the reference mesh. The reference domain is arbitrarily chosen as the true domain used to generate the observations. Seven different reference configuration meshes with 128, 306, 497, 1038, 1476, 1860, and 3018 nodes are considered for the study of the convergence properties. Three of these meshes are shown in Figure 3. The artificial elastic properties selected for the mesh moving method are Eref=25MPa, νref=0.25 and χref=1. To reduce mesh distortion during the mesh moving stage, all the reference meshes have a unique construction, i.e., smaller elements (which behave rigidly) are placed close to the piping zone boundary and larger elements (which are more flexible) are placed further away.

4.3 Inversion and convergence analysis

There are no analytical solutions to verify the correctness of the gradient computation procedure mentioned in Section 3. Hence, for verification purposes, a short inversion analysis is done using HMC, where i=150 samples are drawn from the posterior. The number of leapfrog steps is variable and drawn from a Gaussian L∼N52. The prior for the parameters are chosen as θ1∼N0I12 and θ2∼N00.1I2. The results presented in Figure 4 correspond to inversion done considering the reference mesh shown in Figure 3(b). The potential energy φ decreases rapidly in the first 20 steps and thereafter HMC begins the exploration of the region of the high posterior probability. Prior experience leads the authors to consider the performance of HMC to be appropriate and as such, is an indirect indicator that the gradient computation procedure is correct.

Figure 4.
Results from HMC showing (a) Decrease in potential energy as HMC progresses towards the high posterior probability region and samples of (b) θ13 and (c) θ14. The dashed lines represent the true values of the parameters used to generate the observations.

In order to study the convergence properties, suitable error measures are required. The rate of convergence of the truncated KL expansion of the prior random field, for a fixed number of parameters K1=12, is determined through the relative mean error variance [14]:

εVar,rel=ε¯σ,KL−ε¯σ,analyticε¯σ,analytic.E38

The analytical mean error variance ε¯σ,analytic cannot be computed for squared exponential type autocorrelation coefficient functions (Eq. (35)) and is calculated numerically using the fine mesh containing 3943 nodes, as shown in Figure 1. The relative error is computed for the seven different meshes mentioned in Section 4.2 and plotted in a log–log plot in Figure 5. A closer look at Eq. (10) indicates that the mean error variance is dependent only on the cumulative sum of the K1 eigenvalues. As the eigenvalues are arranged in descending order, the relative error of the largest eigenvalue λ1 is also shown in Figure 5. The corresponding relative error measure can be written in a generalized manner as:

Figure 5.
Convergence of relative errors for different variables for a fixed number of terms K1=12 in the K-L expansion.

ελ1=λ1−λ1analyticλ1analytic.E39

Other error measures can be computed in a similar manner just by replacing λ1 with the relevant variable. Finally, the relative error of the gradient of the largest eigenvalue w.r.t the geometry parameters ∂λ1∂θ2 is also shown in Figure 5. To compute the error related to the gradient, one mesh moving step is carried out, where the mesh is moved from the reference domain θ1ref2θ2ref2=0.15m0.05m to an arbitrary domain defined by θ12θ22=0.16m0.04m. The same linear shape functions are used for discretization of the random field, the mesh moving method and the solution of the forward problem. The same realization of the parameter vector θ1 is used to generate the random field in all the cases. A least squares fit of the relative error plots with a first-order polynomial reveals a linear convergence rate between −2.5 and − 5. Linear rates of convergence for εVar,rel, using linear FEM, are also observed by Betz et al. [14].

Finally, the rate of convergence of the gradient of the potential energy function w.r.t the spatial parameter related to the largest eigenvalue θ1, and the two geometry parameters θ13 and θ14 is plotted in Figure 6. The convergence rates of the different gradients are almost parallel to each other. Once again, the slope of each plot is determined by a least squares fit with a first-order polynomial. Each plot shows a linear convergence rate with a slope of approximately −2.7.

Figure 6.
Convergence of relative errors of potential energy gradients for a fixed number of terms K1=12 in the K-L expansion.

5. Conclusions

This paper details the numerical procedure for the computation of gradients in probabilistic inverse problems involving the simultaneous estimation of spatial fields and geometry. The method is analytical (in the sense of [6]), intrusive and involves the computation of gradients of the forward problem. Emphasis is laid upon the calculation of gradient of eigenvalues and eigenvectors involved with the truncated K-L expansion method for discretization of the random field. The eigenvalues and eigenvectors are obtained by solving a generalized eigenvalue problem on a defined domain. This implies that as the geometry parameters are updated, the domain is updated and the generalized eigenvalues and eigenvectors change. Computation of the gradient of the eigenvectors w.r.t the geometry parameters involve the computation of a generalized inverse.

The gradients are validated through an inverse analysis using HMC. The potential energy decreases rapidly as the Markov chain related to the geometry parameters approaches the region of high posterior probability, indicating the correctness of the computed gradients. Overall the rate of convergence of various quantities is observed to be linear on a log–log plot. This means computation costs can grow exponentially to achieve better results. While the same mesh that is used to solve the forward problem can also be used for the Galerkin FE method-based discretization of the IEVP, the repeated need to solve the forward problem, the mesh moving method and the generalized eigenvalue problem at every step can be computationally prohibitive. Elimination of any one, two or even all three of these numerical problems can significantly improve the computational feasibility of simultaneous spatial field and geometry detection using gradient-based stochastic samplers like HMC.

Acknowledgments

This work was supported by JSPS KAKENHI Grant Number JP21H02304.

Conflict of interest

The authors declare no conflict of interest.

References

1. Aster RC, Borchers B, Thurber CH. Parameter Estimation and Inverse Problems. Oxford: Academic Press; 2013. DOI: 10.1016/C2009-0-61134-X
2. Neal R. MCMC Using Hamiltonian Dynamics. Handb. Markov Chain Monte Carlo. New York: CRC Press; 2011. DOI: 10.1201/b10905-6
3. Voorhees A, Millwater H, Bagley R. Complex variable methods for shape sensitivity of finite element models. Finite Elements in Analysis and Design. 2011;47:1146-1156. DOI: 10.1016/J.FINEL.2011.05.003
4. Lee J, Kitanidis PK. Bayesian inversion with total variation prior for discrete geologic structure identification. Water Resources Research. 2013;49:7658-7669. DOI: 10.1002/2012WR013431
5. Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena. 1992;60:259-268. DOI: 10.1016/0167-2789(92)90242-F
6. Christensen PW, Klarbring A. Two-Dimensional Shape Optimization. An Introd. to Struct. Optim. Netherlands: Springer; 2008. DOI: 10.1007/978-1-4020-8666-3_7
7. Haslinger J, Mäkinen RA. Introduction to Shape Optimization-Theory, Approximation, and Computation. Philadelphia: SIAM; 2003
8. Koch MC, Osugi M, Fujisawa K, Murakami A. Hamiltonian Monte Carlo for simultaneous interface and spatial field detection (HMCSISFD) and its application to a piping zone interface detection problem. International Journal for Numerical and Analytical Methods in Geomechanics. 2021;45:2602-2626. DOI: 10.1002/NAG.3279
9. Koch MC, Fujisawa K, Murakami A. Adjoint Hamiltonian Monte Carlo algorithm for the estimation of elastic modulus through the inversion of elastic wave propagation data. International Journal for Numerical Methods in Engineering. 2020;121:1037-1067. DOI: 10.1002/nme.6256
10. Koch MC, Fujisawa K, Murakami A. Novel parameter update for a gradient based MCMC method for solid-void interface detection through elastodynamic inversion. Probabilistic Engineering Mechanics. 2020;62:103097. DOI: 10.1016/j.probengmech.2020.103097
11. Loève M. Probability Theory II. New York: Springer-Verlag; 1978
12. Ghanem RG, Spanos PD. Stochastic Finite Elements: A Spectral Approach. Springer New York: New York, NY; 1991. DOI: 10.1007/978-1-4612-3094-6
13. Sudret B, Der Kiureghian A. Stochastic Finite Element Methods and Reliability: A State-of-the-Art Report. UCB/SEMM-2000/08. Berkeley: University of California; 2000
14. Betz W, Papaioannou I, Straub D. Numerical methods for the discretization of random fields by means of the Karhunen–Loève expansion. Computer Methods in Applied Mechanics and Engineering. 2014;271:109-129. DOI: 10.1016/J.CMA.2013.12.010
15. Marzouk YM, Najm HN. Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems. Journal of Computational Physics. 2009;228:1862-1902. DOI: 10.1016/j.jcp.2008.11.024
16. Leimkuhler B, Reich S. Simulating Hamiltonian Dynamics. Cambridge: Cambridge University Press; 2005. DOI: 10.1017/CBO9780511614118
17. Betancourt M. A conceptual introduction to hamiltonian Monte Carlo. ArXiv Preprint. 2017
18. Andrieu C, De Freitas N, Doucet A, Jordan MI. An introduction to MCMC for machine learning. Machine Learning. 2003:5-43. DOI: 10.1023/A:1020281327116
19. Stein K, Tezduyar T, Benney R. Mesh moving techniques for fluid-structure interactions with large displacements. Journal of Applied Mechanics. 2003;70:58-63. DOI: 10.1115/1.1530635
20. Koch MC, Osugi M, Fujisawa K, Murakami A. Investigation of reference parameter invariance and application of HMCSISFD for identification of an eroded seepage zone. Proc. 13th Int. Conf. Struct. Saf. Reliab. 2021–2022, Shanghai. 2022
21. Magnus JR. On Differentiating Eigenvalues and Eigenvectors. Econometric Theory. 1985;1:179-191. DOI: 10.1017/S0266466600011129

[1] 1. Aster RC, Borchers B, Thurber CH. Parameter Estimation and Inverse Problems. Oxford: Academic Press; 2013. DOI: 10.1016/C2009-0-61134-X

[2] 2. Neal R. MCMC Using Hamiltonian Dynamics. Handb. Markov Chain Monte Carlo. New York: CRC Press; 2011. DOI: 10.1201/b10905-6

[3] 3. Voorhees A, Millwater H, Bagley R. Complex variable methods for shape sensitivity of finite element models. Finite Elements in Analysis and Design. 2011;47:1146-1156. DOI: 10.1016/J.FINEL.2011.05.003

[4] 4. Lee J, Kitanidis PK. Bayesian inversion with total variation prior for discrete geologic structure identification. Water Resources Research. 2013;49:7658-7669. DOI: 10.1002/2012WR013431

[5] 5. Rudin LI, Osher S, Fatemi E. Nonlinear total variation based noise removal algorithms. Physica D: Nonlinear Phenomena. 1992;60:259-268. DOI: 10.1016/0167-2789(92)90242-F

[6] 6. Christensen PW, Klarbring A. Two-Dimensional Shape Optimization. An Introd. to Struct. Optim. Netherlands: Springer; 2008. DOI: 10.1007/978-1-4020-8666-3_7

[7] 7. Haslinger J, Mäkinen RA. Introduction to Shape Optimization-Theory, Approximation, and Computation. Philadelphia: SIAM; 2003

[8] 8. Koch MC, Osugi M, Fujisawa K, Murakami A. Hamiltonian Monte Carlo for simultaneous interface and spatial field detection (HMCSISFD) and its application to a piping zone interface detection problem. International Journal for Numerical and Analytical Methods in Geomechanics. 2021;45:2602-2626. DOI: 10.1002/NAG.3279

[9] 9. Koch MC, Fujisawa K, Murakami A. Adjoint Hamiltonian Monte Carlo algorithm for the estimation of elastic modulus through the inversion of elastic wave propagation data. International Journal for Numerical Methods in Engineering. 2020;121:1037-1067. DOI: 10.1002/nme.6256

[10] 10. Koch MC, Fujisawa K, Murakami A. Novel parameter update for a gradient based MCMC method for solid-void interface detection through elastodynamic inversion. Probabilistic Engineering Mechanics. 2020;62:103097. DOI: 10.1016/j.probengmech.2020.103097

[11] 11. Loève M. Probability Theory II. New York: Springer-Verlag; 1978

[12] 12. Ghanem RG, Spanos PD. Stochastic Finite Elements: A Spectral Approach. Springer New York: New York, NY; 1991. DOI: 10.1007/978-1-4612-3094-6

[13] 13. Sudret B, Der Kiureghian A. Stochastic Finite Element Methods and Reliability: A State-of-the-Art Report. UCB/SEMM-2000/08. Berkeley: University of California; 2000

[14] 14. Betz W, Papaioannou I, Straub D. Numerical methods for the discretization of random fields by means of the Karhunen–Loève expansion. Computer Methods in Applied Mechanics and Engineering. 2014;271:109-129. DOI: 10.1016/J.CMA.2013.12.010

[15] 15. Marzouk YM, Najm HN. Dimensionality reduction and polynomial chaos acceleration of Bayesian inference in inverse problems. Journal of Computational Physics. 2009;228:1862-1902. DOI: 10.1016/j.jcp.2008.11.024

[16] 16. Leimkuhler B, Reich S. Simulating Hamiltonian Dynamics. Cambridge: Cambridge University Press; 2005. DOI: 10.1017/CBO9780511614118

[17] 17. Betancourt M. A conceptual introduction to hamiltonian Monte Carlo. ArXiv Preprint. 2017

[18] 18. Andrieu C, De Freitas N, Doucet A, Jordan MI. An introduction to MCMC for machine learning. Machine Learning. 2003:5-43. DOI: 10.1023/A:1020281327116

[19] 19. Stein K, Tezduyar T, Benney R. Mesh moving techniques for fluid-structure interactions with large displacements. Journal of Applied Mechanics. 2003;70:58-63. DOI: 10.1115/1.1530635

[20] 20. Koch MC, Osugi M, Fujisawa K, Murakami A. Investigation of reference parameter invariance and application of HMCSISFD for identification of an eroded seepage zone. Proc. 13th Int. Conf. Struct. Saf. Reliab. 2021–2022, Shanghai. 2022

[21] 21. Magnus JR. On Differentiating Eigenvalues and Eigenvectors. Econometric Theory. 1985;1:179-191. DOI: 10.1017/S0266466600011129

Numerical Gradient Computation for Simultaneous Detection of Geometry and Spatial Random Fields in a Statistical Framework

Inverse Problems - Recent Advances and Applications

Abstract

Keywords

Author Information

Michael Conrad Koch*

Kazunori Fujisawa

Akira Murakami

1. Introduction

2. Inversion preliminaries

2.1 Governing and observation equations

2.2 Karhunen-Loève (K-L) expansion

2.3 Galerkin finite element method to solve the integral eigenvalue problem

3. Simultaneous geometry and spatial field detection

3.1 Hamiltonian Monte Carlo

3.2 Parameter update using the mesh moving method

3.3 Gradient computation

4. Numerical implementation and results

4.1 Observation data

Figure 1.

Figure 2.

4.2 Inversion setup

Figure 3.

4.3 Inversion and convergence analysis

Figure 4.

Figure 5.

Figure 6.

5. Conclusions

Acknowledgments

Conflict of interest

References

Solving and Algorithm for Least-Norm General Solution to Constrained Sylvester Matrix Equation

Numerical Gradient Computation for Simultaneous Detection of Geometry and Spatial Random Fields in a Statistical Framework

Inverse Problems - Recent Advances and Applications

Abstract

Keywords

Author Information

Michael Conrad Koch*

Kazunori Fujisawa

Akira Murakami

1. Introduction

2. Inversion preliminaries

2.1 Governing and observation equations

2.2 Karhunen-Loève (K-L) expansion

2.3 Galerkin finite element method to solve the integral eigenvalue problem

3. Simultaneous geometry and spatial field detection

3.1 Hamiltonian Monte Carlo

3.2 Parameter update using the mesh moving method

3.3 Gradient computation

4. Numerical implementation and results

4.1 Observation data

Figure 1.

Figure 2.

4.2 Inversion setup

Figure 3.

4.3 Inversion and convergence analysis

Figure 4.

Figure 5.

Figure 6.

5. Conclusions

Acknowledgments

Conflict of interest

References

Continue reading from the same book

Inverse Problems