Open access peer-reviewed chapter

# Matrices, Moments and Quadrature: Applications to Time- Dependent Partial Differential Equations

Written By

James V. Lambers, Alexandru Cibotarica and Elisabeth M. Palchak

Submitted: 08 October 2015 Reviewed: 14 January 2016 Published: 06 July 2016

DOI: 10.5772/62247

From the Edited Volume

## Applied Linear Algebra in Action

Edited by Vasilios N. Katsikis

Chapter metrics overview

View Full Metrics

## Abstract

The numerical solution of a time-dependent PDE generally involves the solution of a stiff system of ODEs arising from spatial discretization of the PDE. There are many methods in the literature for solving such systems, such as exponential propagation iterative (EPI) methods, that rely on Krylov projection to compute matrix function-vector products. Unfortunately, as spatial resolution increases, these products require an increasing number of Krylov projection steps, thus drastically increasing computational expense.

### Keywords

• exponential propagation iterative methods
• Krylov subspace spectral methods
• stiffness
• Lanczos iteration

## 1. Introduction

Consider an autonomous system of ordinary differential equations (ODEs) of the form

y'=Fy,yt0=y0, E1

such as one that would naturally arise from spatial discretization of a partial differential equation (PDE). Such systems are generally stiff when the underlying PDE has significant diffusive or advective terms, and this stiffness is exacerbated by increasing the spatial resolution. Stiffness leads to significantly increased computational expense for both explicit and implicit time-stepping methods. For explicit methods, the time step is severely restricted by the CFL condition, while implicit methods require the solution of ill-conditioned systems of linear equations during each time step. Such systems are generally sparse and therefore best suited for iterative methods, but for these stiff systems of ODEs, iterative methods require many iterations or a specially developed preconditioner [1].

Exponential propagation iterative (EPI) methods, introduced by Tokman et al. [1, 2], are Runge-Kutta like time-stepping methods that are designed to minimize the number of matrix function-vector products of the form w = φ()b, where φ is a smooth function, A is a square matrix, τ is a parameter determined by the time step, and b is a column vector. The approach used by EPI methods to compute w for a given symmetric matrix A is to apply the Lanczos algorithm to A with the initial vector b, until we obtain a matrix Xj with orthonormal columns and a tridiagonal matrix Tj such that XjTAXj=Tj. Then, we can compute the approximation

wj=b2XjφTjτe1, E2

where e1=100T. Since each column xk of the matrix Xj is of the form xk = pk − 1(A)b, where pn(A) is a polynomial of degree n in A, wj is the product of a polynomial in A of degree j − 1 and b. Since the matrix A arises from a stiff PDE, the eigenvalues of A are distributed over several orders of magnitude. As it is generally not possible to accurately approximate φ(λ) on such a large interval with a low-degree polynomial, a large number of Lanczos iterations is generally necessary in order to obtain a sufficiently accurate of w.

The difficulty that time-stepping methods have with stiffness stems from the fact that low- and high-frequency components of the solution, which change at widely varying speeds, are coupled and therefore must be evolved using a common time step. While the low-frequency components generally make the most significant contribution to the solution, the high-frequency components change most rapidly, and therefore constrain the time step. The greater the spatial resolution, the greater the bandwidth of the solution, thus constraining the time step even further.

This coupling, however, is not the only cause of the greater computational expense incurred by time-stepping methods applied to stiff systems. A key contributing factor is the use of the same polynomial or rational function to approximate all of these components of φ()b, when such a function cannot effectively approximate φ(λτ) on a large interval except at high degree, which increases computational expense. An alternative is to use Krylov subspace spectral (KSS) methods [3, 4], which employ a component-wise approach to these matrix function-vector products. In KSS methods, which are explicit, each Fourier coefficient of the solution is computed using an interpolating polynomial with frequency-dependent interpolation points. This individualized approach to computing each Fourier component yields high-order accuracy and stability like that of implicit methods, even though KSS methods themselves are explicit.

To date, KSS methods have been applied mostly to linear PDEs on d-dimensional boxes, for d = 1, 2, 3, with either periodic or homogeneous Dirichlet or Neumann boundary conditions. A first-order KSS method was applied to nonlinear diffusion equations for image processing by Guidotti et al. [5], but for that problem, a straightforward linearization was used during each time step. In order to compute solutions of nonlinear PDEs with higher-order accuracy in time, it is necessary to treat the nonlinearity more carefully. This can be accomplished by combining KSS methods with another high-order time-stepping method, such as EPI methods.

In this paper, this combination is presented, for the purpose of solving systems of ODEs of the form (1) that are obtained through spatial discretization of nonlinear PDEs defined on rectangular domains with periodic, homogeneous Dirichlet, or homogeneous Neumann boundary conditions. The proposed algorithm, first described in [6, 7], uses the component-wise approach of KSS methods to more efficiently compute matrix function-vector products of the form y = φ()b. The following features distinguish the combined approach from previous work on either EPI or KSS methods:

• Instead of applying Krylov projection (e.g., see [810]) with initial vector b to compute y, it is applied only to a low-frequency projection of b, in order to avoid the larger number of iterations that Krylov projection typically incurs at higher spatial resolution. Furthermore, for advection-dominated problems, denoising is applied to the Krylov subspace basis vectors, to eliminate spurious high-frequency oscillations in these vectors which slow convergence.

• For the high-frequency portion of b, a KSS method, as described in [11], is used to apply φ(). In this latest version of KSS methods, each Fourier component of the output vector y is still approximated using its own block quadrature rule, as before. However, now the quadrature nodes (i.e., frequency-dependent interpolation points for φ) are obtained through high-frequency analysis of block Lanczos iteration, which yields simple formulas for the nodes, instead of having to obtain them by solving an eigenvalue problem [3]. This greatly improves efficiency over the approach used in [3, 4], in which the nodes are obtained by explicitly applying block Lanczos iteration to A for each Fourier component, without sacrificing accuracy [11].

• Once the frequency-dependent interpolation points are obtained, it is necessary to construct and apply frequency-dependent interpolating polynomial approximations pω(λ) of φ to A, for each wave number ω, and then the Fourier coefficients of the solution are inner products of the form e^ωHpωAτb, where êω is a discretization of an appropriate Fourier basis function. This paper provides implementation details for this task, and explains how it can be accomplished with as few Fourier transforms as possible [6, 7].

The outline of the paper is as follows. Section 2 gives an overview of KSS methods. Section 3 reviews their acceleration based on asymptotic analysis of recursion coefficients, first presented in [11], and discusses extension to non-self-adjoint operators. Section 4 provides a brief overview of EPI methods and demonstrates the spurious high-frequency oscillations that can occur when using standard Krylov projection within an EPI method. Section 5 describes how KSS and EPI methods are combined. Numerical results are presented in Section 6. Concluding remarks and directions for future work are given in Section 7.

## 2. Matrices, moments, quadrature, and PDE

To provide context for the latest advancements with KSS methods, we begin with an overview of KSS methods as described in [3], applied to the 1-D parabolic PDE

ut+Lu=0,0<x<2π,t>0,Lu=pxuxx+qxu, E3
ux0=fx,0<x<2π, E4
u0t=u2π,t,t>0. E5

where p(x) > 0 and q(x) ≥ 0 on [0, 2π], and q(x) is not identically zero. In KSS methods, each Fourier coefficient of the solution ũ(xtn + 1) is obtained by applying the exact solution operator of the PDE to ũ(xtn). For a given wave number ω, such a Fourier coefficient is given by

u^ωtn+1=12πeiωx,eLΔtu˜xtn, E6

where

fg=02πfx¯gxdx E7

is the standard inner product on [0, 2π] and e− LΔt is the solution operator of the PDE (3).

### 2.1. Matrices, moments, and quadrature

The spatial discretization of (6) yields the bilinear form

uHφAv, E7

where u=12πeiωx and v = ũ(xtn) are N-vectors, A = LN, where LN is a spectral discretization of L, and φ(λ) = e− λt.

Because the Sturm-Liouville operator L is self-adjoint and positive definite, it follows that the matrix A is symmetric positive definite. As such, it has real eigenvalues b = λ1 ≥ λ2 ≥ ⋯ ≥ λN = a > 0, and corresponding orthonormal eigenvectors qj, j = 1, …, N. From the spectral decomposition of A, we obtain the following representation of (6):

uHφAv=j=1NφλjuHqjqjHv. E8

As described by Golub and Meurant in [12], (8) can be viewed as a Riemann-Stieltjes integral

uHφAv=abφλdαλ, E9

where the measure α(λ) is defined by

αλ=0j=iNujvj,j=1Nujvj,ifλ<aifλiλ<λi1,ifbλuj=uHqj,vj=qjHv. E1000

This allows approximation of (6) using Gaussian quadrature rules, where the nodes and weights are obtained by applying the Lanczos algorithm to A with initial vectors u and v [12].

Figure 1 demonstrates why integrals of the form (8) can be approximated accurately with a small number of nodes in the case where A is a discretization of a differential operator and the vector u is used to extract a particular Fourier coefficient of f(A)v. We examine the distribution (λ) in the case where u = v = e iωx for small and large values of ω, and for A discretizing a differential operator of the form − ∂x a(x)∂x, with a(x) > 0 being a smooth function or a piecewise constant function. In either case, (λ) is mostly concentrated within a portion of the interval of integration [ab]. Gaussian quadrature rules for such integrals naturally target these relevant portions [3, 13].

Block Lanczos algorithm
X0 = 0, R0=uv, R0 = X1 B0 (QR factorization)
for n = 1, 2, …, K
V = AXn
Mn=XnHV
Rn=VXn1Bn1HXnMn
Rn = Xn + 1 Bn (QR factorization)
end

In the case where u ≠ v, the presence of a negative weight would destabilize the quadrature rule [14]. Alternatively, we consider the approximation of the 2 × 2 matrix integral

uvHφAuv. E10

We use the most general K-node quadrature formula, as described in [12], to get an approximation for (9) of the form

abφλdμλ=j=12KφλjvjvjH+error, E11

where, for each j, λj is a scalar and vj is a 2-vector. Each node λj is an eigenvalue of the matrix

TK=M1B1HB1M2B2HBK1MK, E12

which is a block-tridiagonal matrix of order 2K. The vector vj consists of the first two elements of the corresponding normalized eigenvector. The matrices Mj and Bj are computed using the block Lanczos algorithm [15], described below.

### 2.2. Krylov subspace spectral methods

The block KSS method [3, 4] for (3) begins by defining

R0=e^ωun, E13

where êω is a discretization of 12πeiωx and u n is a discretization of the approximate solution u(xt) at time tn = nΔt. Next, we compute the QR factorization of R0,

R0=X1ωB0ω E14

which outputs

X1ω=e^ωuωnuωn2 E14

and

B0ω=1e^ωHun0uωn2, E15

where

uωn=une^ωe^ωHun=une^ωu^ωtn. E16

Then we apply the block Lanczos algorithm [15] to the matrix LN, which comes from the discretization of L, with initial block X1(ω). This produces a block tridiagonal matrix TK of the form (12), where every entry of TK is a function of ω. Then, at time tn + 1, each Fourier coefficient of the solution is

u^n+1ω=[B0ωHE12HexpTKωΔtE12B0(ω)]12,E12=e1e2. E17

An inverse FFT applied to the vector of Fourier coefficients û n + 1 yields the vector û n + 1, which consists of the values of the solution u(xtn + 1) at the grid points.

This algorithm has temporal accuracy O(Δt2K − 1) for parabolic problems [3]. Even higher-order accuracy, O(Δt4K − 2), is obtained for the second-order wave equation [4], due to the second-order time derivative. Furthermore, under appropriate assumptions on the coefficients of the PDE, the 1-node KSS method is unconditionally stable [3, 4]. More generally, the temporal order of accuracy is O(Δt(2K − 1)d), where d is the highest order of a time derivative in the PDE; this order of accuracy has been observed with the Schrödinger equation [16] and Maxwell’s equations [18].

## 3. Asymptotic analysis of block Lanczos iteration

For each Fourier coefficient of the solution, KSS methods use a different quadrature rule, because the measure α(λ) in (8) is defined in terms of the frequency. It follows that if S(LΔt) is the solution operator of the PDE (e.g. S(LΔt) = e− LΔt for the PDE (3)), then the function S(λΔt) is approximated by a polynomial of degree 2K that interpolates S(λΔt) at different points for each frequency, which are the block Gaussian quadrature nodes λj in (11). Taking all Fourier coefficients together, the computed solution at time tn + 1, u n + 1 can be expressed as

un+1=j=02K1DjΔtLNjun, E18

where LN is a discretization of L on an N-point grid and Dj(Δt) is a matrix that is diagonal in discrete Fourier space. The diagonal entries are the coefficients of these frequency-dependent interpolating polynomials in power form, with each row corresponding to a different frequency.

In the block KSS method described in [3, 4] and reviewed in the previous section, these interpolation points are the eigenvalues of the block tridiagonal matrix TK from (12) that is produced by block Lanczos iteration. Although each such matrix is small, computing the eigenvalues and eigenvectors for each frequency is computationally expensive. Therefore, in this section, we describe a much faster approach to obtaining estimates of these nodes, at least for high frequencies. This approach was first presented in [11] and generalized in [6, 7]. The basic idea is to examine the entries of TK as |ω| → ∞, where ω is the wave number.

### 3.1. The block case

As in the previous section, let u n be a discretization of the approximate solution u(xt) at time tn = nΔt on a uniform N-point grid. Then, KSS methods use the initial block R0=e^ωun, for each ω = − N/2 + 1, …, N/2. We start the first iteration of the block Lanczos algorithm by finding the QR-factorization of R0:

R0=X1B0, E18

where

X1=e^ωuωnuωn2andB0=1u^ωtn0uωn2. E18

with uωn is defined as in (16). We note that if the solution u is continuous, then as |ω| → ∞, |û n(ω)| → 0, so that in the limit B0 is diagonal.

The next step is to compute

M1=X1HLNX1, E19

where the matrix LN is a spectral discretization of the operator L defined by Lu = puxx + q(x)u, with p being a constant. Substituting the value of X1 from (18) into (19) yields

M1=ω2p+q¯LNuωn^ωuωn2LNuωn^ω¯uωn2RLNuωn, E20

where q¯ is the mean of q(x) on (0, 2π), LNuωn^ω=e^ωHLNuωn is the Fourier coefficient of the grid function LN u n corresponding to the wave number ω, and RLNuωn=uωn,LNuωnuωnuωn is the Rayleigh quotient of LN and uωn. As |ω| increases, the Fourier coefficients of a continuous function decay to zero; therefore, as long as the solution is sufficiently regular, the non-diagonal entries of M1 become negligible, that is,

M1ω2p+q¯00RLNun. E222

Proceeding with the iteration, and neglecting any terms that are Fourier coefficients or are of lower order in ω, we obtain

R1=LNX1X1M1q˜e^ωLNuωnuωn2RLNuωnuωnuωn2, E20

where q is a vector consisting of the value of q(x) at the grid points, q˜=qq¯, and multiplication of vectors is component-wise.

To obtain X2, we perform the QR-factorization R1 = X2 B1. We note that the (1, 2) entry of B1, modulo lower-order terms, is the Fourier coefficient v^1ω, where

v1=q˜LNuωnuωn2RLNuωnuωnuωn2. E20

It follows that if the coefficient q(x) and solution u(xt) are sufficiently regular, then B1 approaches a diagonal matrix as |ω| → ∞, just as B0 does. Continuing this process, it can be shown that given sufficient regularity of the solution and coefficients of L, each block Mj or Bj of TK from (12) becomes approximately diagonal at high frequencies.

Because TKij0 when i + j is odd in the high-frequency case, it follows that if the rows and columns of TK are permuted so that odd-numbered and even-numbered rows and columns are grouped together and in order, then the eigenvalue problem for TK approximately decouples. Therefore, approximate eigenvalues of TK can be obtained by computing the eigenvalues of the tridiagonal matrices obtained by performing “non-block” Lanczos on LN with initial vectors equal to the two columns of R0 separately, rather than applying block Lanczos with these two columns together in the initial block. In [11], it is shown that this decoupling also takes place if the leading coefficient p(x) of L is not constant.

### 3.2. The non-block case

The decoupling observed in the preceding discussion reveals that we can obtain approximations of half of the block Gaussian quadrature nodes for all Fourier coefficients by applying “non-block” Lanczos iteration to the matrix LN with initial vector u n, the computed solution, as is done in Krylov projection methods such as those described in [810]. These nodes, denoted by λ1, …, λm where m is the number of iterations, will be referred to as frequency-independent nodes. Because this iteration does not depend on the wave number ω, the frequency-independent nodes need only be computed once for each vector u for which an expression of the form φ(LN τ)u n is required. The other half of the nodes can be estimated through an asymptotic analysis of Lanczos iteration applied to LN with initial vector êω [11]; these are called frequency-dependent nodes and will be denoted by λ1,ω, …, λm,ω.

As an example, we consider the case where the matrix A comes from a spectral discretization of the operator Lu = − puxx + q(x)u, where p is a constant, and assuming periodic boundary conditions [11]. Carrying out three iterations, which corresponds to a fifth-order accurate KSS method for a parabolic PDE, yields the following recursion coefficients as functions of the wave number ω, after neglecting lower-order terms:

α1β1¯0β1α2β2¯0β2α3pω2q˜20q˜2pω22p|ω|qx2/q˜202p|ω|qx2/q˜2pω2. E20

It follows that the frequency-dependent nodes can easily be estimated as

λ1,ω=pω2,λi,ω=pω2±β12+β22,i=2,3 E20

whereas for a third-order KSS method, the frequency-dependent nodes are

λ1,ω=pω2+β1,λ2,ω=pω2β1. E21

In [6, 7], similar formulas for the nodes are derived for a PDE with homogeneous Neumann boundary conditions, and for a 2-D PDE with periodic boundary conditions.

When the matrix A is a finite-difference representation of the underlying differential operator, the block Gaussian quadrature nodes can be represented more accurately if formulas for the eigenvalues of symmetric Toeplitz matrices are used for the leading-order terms in the nodes. For example, in (20), 2 is replaced by 2p(N/π)2(1 − cos(πω/N)), where N is the number of grid points.

The theory developed in [12] applies to symmetric positive definite matrices, but this property is not essential [17, 18]. That said, care must be taken with nonsymmetric matrices, as a straightforward use of unsymmetric Lanczos to treat bilinear forms as Riemann-Stiejtjes integrals, as described in [19], can suffer from “serious breakdown” [20]. Therefore, a more robust approach for applying KSS methods to linear PDE with non-self-adjoint spatial differential operators is to use Arnoldi iteration to approximate φ()b. The algorithm for Arnoldi iteration, applied to a matrix A and initial vector z0, is as follows:

Arnoldi iteration
v1 = z0/∥z0 ∥ 2
for j = 1, 2, …
zj = Avj
for k = 1, 2, …, j
hkj=vkHzj
zj = zj − hkj vk
end
hj + 1,j = ∥ zj ∥ 2
vj + 1 = zj/hj + 1,j
end

The output of Arnoldi iteration is an upper Hessenberg matrix Hm, and a matrix Vm with orthonormal columns, such that

AVm=VmHm+hm+1,mzm+1emH. E21

By analogy with (17), to approximate u n + 1 = φ()u n, we can compute each Fourier coefficient [û n + 1]ω of u n + 1 by applying block Arnoldi iteration [21] to A, with initial block R0(ω) as defined in (13). After m iterations, this iteration produces a block upper Hessenberg matrix Hm(ω), from which we obtain

u^n+1ω=B0ωHE12HφHmωτE12B0ω12, E22

where B0(ω) and E12 are as defined in (15) and (17), respectively. As shown in [6, 7], like with block Lanczos, the eigenvalue problem for Hm(ω) approximately decouples for high frequencies, due to the decay of the Fourier coefficients of u n. It follows that we can easily estimate the frequency-dependent eigenvalues of Hm(ω), which are used as interpolation points for a polynomial approximation of φ(λ), by applying “non-block” Arnoldi iteration, as described in the above algorithm, with an initial vector z0 chosen to be a discretization of an appropriate Fourier basis function. Additional details can be found in [6, 7].

## 4. EPI methods

In this section, we give an overview of EPI methods, as developed by Tokman et al. in [1, 2]. To solve an autonomous system of ODEs of the form (1), a time step from time tn to time tn + 1 = tn + Δt is taken as follows. First, the time derivative F(y) is expressed in terms of its Taylor expansion around y(tn), which yields

dydt=Fytn+Anytytn+Ryt, E23

where An is the Jacobian matrix of F(y) evaluated at y(tn), and R(y(t)) is the Taylor remainder. Next, we multiply both sides by an integrating factor eAnt and then integrate from tn to tn + 1 to obtain

ytn+1=ytn+eAnΔtIAn1Fytn+tntn+1eAntn+1τRyτdτ. E24

The integral on the right side is then approximated using a quadrature rule. This requires the evaluation of matrix function-vector products of the form φ()b, with A = An, for various “φ-functions” and various choices of the vectors b and scaling factors τ, which are determined so as to satisfy order conditions.

Any such matrix function-vector product is computed using Krylov projection. Arnoldi iteration is applied to A (or Lanczos iteration, if A is symmetric) with initial vector b. After m iterations, we obtain (21), from which we obtain an approximation

φAτbb2VmφHmτe1, E25

where, as before, Hm is upper Hessenberg (or tridiagonal of A is symmetric), and the columns of Vm form an orthonormal basis for the Krylov subspace

KAbm=spanb,Ab,A2b,,Am1b. E26

The accuracy of this approximation is discussed in [9]. In the case where A is ill-conditioned, the number of iterations m needed for convergence of (25) can be quite large, and this is exacerbated by increasing the spatial resolution in the discretization of the underlying PDE from which (1) arises.

When the number of iterations is large, an additional issue, particularly for advection-dominated problems, is the appearance of spurious high-frequency oscillations in the columns of Vm, even if the initial vector b represents a smooth function. As can be seen in Figure 2, even after relatively few iterations, these oscillations can occur and reduce the columns of Vm to essentially noise.

This can be alleviated by filtering out high-frequency components of the columns of Vm after each matrix-vector multiplication. As can be seen in Figure 3, the spurious high-frequency oscillations are reduced, resulting in more meaningful data in Vm. Future work will include the automatic, adaptive selection of an appropriate threshold for filtering high-frequency components.

The behavior of the unfiltered Krylov vectors is not surprising, as similar behavior is displayed by the unsmoothed Fourier method applied to hyperbolic PDEs [22]. In that work, the proposed remedies were to either use filtering, or increase the number of grid points; the former remedy serves as the motivation for denoising in this context. It is worth noting that in both the unfiltered and filtered cases, no loss of orthogonality was observed in the Krylov vectors; that is, 1/mVmTVmImF was negligibly small (that is, on the order of 10− 10 or smaller) in both cases.

## 5. KSS-EPI methods

We now describe the combination of KSS and EPI methods. The EPI method itself is not changed; what is modified is the approach to computing any matrix function-vector product of the form φ()b. The main steps in the new approach are as follows:

1. First, it is necessary to determine a cutoff frequency Nc. In the numerical experiments presented in this paper, the value of Nc has been determined by experimentation; it is demonstrated in [7] that the performance is not unduly sensitive to the choice of Nc. In future work, an adaptive approach to choosing Nc will be developed.

2. Using an FFT, b is decomposed into low-frequency and high-frequency components bL and bH. Specifically, b = bL + bH, where each Fourier coefficient of bL with wave number ω is zero if ωNc.

3. φ()bL is computed using Krylov projection, as is normally done in EPI methods.

4. φ()bH is computed using a KSS method, as described in Section 3.

5. The results of the preceding two steps are added to obtain an approximation of φ()b.

We now provide details on how step 4 can be performed efficiently, by minimizing the number of FFTs. To simplify the exposition, we consider the 1-D case, with periodic boundary conditions. For each wave number ω, we obtain the frequency-dependent nodes {λ1,ωλ2,ω, …, λK,ω}, as described in Section 3, by approximating the entries of the tridiagonal matrix TKω produced by Lanczos iteration with initial vector êω. As demonstrated in Section 3, this is readily accomplished using the coefficients of the spatial differential operator. Next, we use Krylov projection with initial vector bH, as in Section 4, and compute the eigenvalues of the resulting tridiagonal (or Hessenberg, if A is nonsymmetric) matrix. These are the frequency-independent nodes {λ1λ2, …, λK}.

The Fourier coefficient of φ()bH corresponding to the wave number ω is obtained by computing the same Fourier coefficient of p2K − 1,ω()bH, where p2K − 1,ω is the polynomial interpolant of φ(λ) with interpolation points λiλi,ωi=1K. Expressing this interpolant in Newton form, we have

p2K1,ωλ=j=1Kφλ1λji=1j1λλi+j=1Kφλ1λKλ1,ωλj,ωi=1j1λλi,ωk=1Kλλk. E26

Arranging the interpolation points in the order indicated above allows us to reduce the number of FFTs needed. Using the relation from Lanczos iteration,

AXK=XKTK+rKeKT, E27

we define

v=pK1AbH=φλ1+φλ1λ2Aλ1I++φλ1λ2λKAλ1IAλK1IbH=bH2XKpK1Ae1,w=qKAbH=Aλ1IAλKIbH=β1β2βK1rK=β1βKXK+1eK+1, E28

and

p˜K1,ωλ=φλ1λKλ1,ω+φλ1λ2,ωλλ1,ω++φλ1λK,ωλλ1,ωλλK1,ω=CK1ωλK1++C1ωλ+C0ω, E28

where Cjω, for j = 0, 1, …, K − 1, are the coefficients of p˜K1,ω in power form, which can easily be computed by repeated application of nested multiplication to the last K terms of the Newton form of p2K − 1,ω.

Finally, using F to denote the discrete Fourier transform, we have

φAτbHp2K1,ωAτbH=v+p˜K1,ωAw=v+F1j=0K1CjωAjw, E28

and it can easily be seen that the solution at each time step requires K FFTs and one inverse FFT. It is worth noting that once the Krylov subspace vectors A j w, j = 0, 1, …, K − 1 are computed, the K FFTs can be performed in parallel.

## 6. Numerical results

In this section, we compare several versions of EPI methods, as applied to two test problems; additional test problems are featured in [6, 7]. The versions differ in the way in which they compute matrix function-vector products of the form φ()b:

• standard Krylov projection, as in (25), hereafter referred to as “Krylov-EPI”, either with or without denoising as in Section 4;

• using the KSS approach, as described in Section 5, hereafter referred to as “KSS-EPI”;

• Newton interpolation using Leja points [23], hereafter referred to as “LEJA”; and

• adaptive Krylov projection [24], hereafter referred to as “AKP”.

All of these approaches are used in the context of two EPI methods. The first is a third-order, two-stage EPI method [1]

Y1=yn+13ha11φ113hAFyn,yn+1=yn+hφ1hAFyn+3hb1φ2hAFY1FynAY1yn, E29

where a11 = 9/4 and b1 = 32/81, and

RY1=FY1FynAY1yn. E30

For this method,

φ1λ=eλ1λ,φ2λ=eλλ1λ2,φ3λ=eλ6λ6+5λ+2λ2λ3. E30

The second is a fifth-order, three-stage EPI method [25]

Y1=yn+ha11ψ1g11hAFyn,Y2=yn+ha21ψ1g21hAFyn+ha22ψ2g22hARY1,yn+1=yn+hb1ψ1g31hAFyn+hb2ψ2g32hARY1+hb3ψ3g33hA2RY1+RY2, E30

where

ψiz=j=1jpijφjz,i=1,2,3, E31

and the coefficients gij, aij, bj, and pij are obtained from the description of the EPIRK5s3 method in [25].

For all methods used to compute matrix function-vector products, efficiency will be measured in two ways: (1) total execution time required to integrate over the entire time interval, and (2) the average number of matrix-vector products (plus the average number of FFTs, in the case of KSS-EPI) performed for each evaluation of a matrix function-vector product of the form φ()b. The phrase “number of iterations” is used throughout this section to refer to this quantity. Throughout this section, for the purpose of discussing performance as a function of spatial resolution, N refers to the number of grid points per dimension.

### 6.1. Diffusive problem

For our first test problem, we choose a diffusion-dominated PDE: the 2-D Allen-Cahn equation,

ut=α2u+uu3,x,y01,t00.2 E31

with α = 0.1. We impose homogeneous Neumann boundary conditions, and the initial condition

u0xy=0.4+0.1cos2πxcos2πy. E32

The Laplacian is discretized using a centered finite difference. For KSS-EPI, the low-frequency portion bL consists of all components with wave numbers ωi ≤ 7, i = 1, 2. That is, for this problem, the value of Nc, as defined in the previous section, is 7. The low value of this threshold is due to the smoothness of the initial data.

The results are shown in Figures 4 and 5 for the third-order EPI method (29), and in Figures 6 and 7 for the firth-order EPI method (30). It can be seen from Figures 4 and 6 that for both EPI methods, the accuracy of all four approaches to computing matrix function-vector products are comparable. However, the efficiency and scalability of these four approaches are very different. In particular, these figures show that for KSS-EPI methods, the growth in execution time as a function of the number of grid points per dimension is much less. For example, note that for the coarsest grid, with N = 50, the speed of KSS-EPI is similar to that of Krylov-EPI, but for N = 150 and N = 300, KSS-EPI is much faster. Furthermore, as N increases, this gap in performance grows. This is due to the fact that while both methods use Krylov projection, KSS-EPI applies it to an initial vector with only low-frequency components, thus significantly reducing the number of iterations needed for convergence.

The difference in scalability is more clearly illustrated in Figures 5 and 7. It can be seen that for KSS-EPI, the number of overall iterations (matrix-vector multiplications + FFTs) shows almost no sensitivity to the grid size, compared to Krylov-EPI, AKP and Leja interpolation, all of which exhibit substantial growth as the number of grid points increases. It can also be seen that denoising results in a slightly higher number of overall iterations, so it is not beneficial for this problem. This is not surprising, as this is a diffusive problem that has a smooth solution and is therefore less susceptible to high-frequency oscillations during Lanczos iteration. Although AKP is slightly more accurate than KSS-EPI for fifth order, this is more than offset by the superior efficiency of KSS-EPI with denoising, particularly at larger grid sizes and larger time steps.

For our second test problem, we choose an advection-dominated PDE: a 1-D Burgers’ equation

ut+uux=νuxx,x01,t01 E32

with ν = 0.03. We impose homogeneous Dirichlet boundary conditions, and the initial condition

u0x=sin33πx1x3/2. E33

For KSS-EPI, the low-frequency portion bL consists of all components with wave numbers ω ≤ Nc = 40. The higher threshold for this problem, compared to the Allen-Cahn equation (31), is due to the fact that the initial data is less smooth.

The results are shown in Figures 8 and 9 for the third-order EPI method (29), and in Figures 10 and 11 for the fifth-order EPI method (30). From Figures 8 and 10, it can be seen that Krylov-EPI and KSS-EPI have comparable accuracy, but even for the coarsest grid with N = 500 points, KSS-EPI is much faster, and as with the Allen-Cahn equation, this gap in performance only grows with N. As shown in Figures 9 and 11, this is again due to the increasing number of Krylov projection steps needed for Krylov-EPI. While Leja interpolation is more accurate than KSS-EPI for both the third- and fifth-order EPI methods, and AKP is also more accurate in the fifth-order case, both of these methods are still significantly slower than KSS-EPI, and a similar scalability gap can also be observed.

The difference in scalability among the four approaches to computing matrix function-vector products is more apparent in Figures 9 and 11. Denoising applied to KSS-EPI is advantageous for this problem, unlike with the Allen-Cahn equation. As can be seen in these figures, for KSS-EPI without denoising, the number of iterations is increasing with the number of grid points, though not nearly as rapidly as with the other methods. However, with denoising included, the insensitivity of the number of iterations to the grid size is restored.

### 6.3. Discussion of efficiency

The major components of the computational cost of KSS-EPI stem from Krylov projection that is applied to low-frequency parts, and FFTs that are applied to both the low- and high-frequency parts. Specifically, suppose the EPI method is of order p, and q Krylov projection steps are needed for convergence of the low-frequency part. Then, the task of evaluating φ()b requires q + p matrix-vector multiplications, (p + 3)/2 FFTs and 2 inverse FFTs if denoising is not used, and q + p matrix-vector multiplications, q + (p + 3)/2 FFTs and q + 2 inverse FFTs if denoising is used. Denoising, therefore, is only worthwhile if the value of q can be substantially reduced, as in the case of Burgers’ equation.

## 7. Conclusions and future work

We have demonstrated that when solving stiff systems of nonlinear ODEs derived from PDEs, the growth in the computational cost that results from an increase in the number of grid points can be significantly reduced by performing a relatively low and grid-insensitive number of Krylov projection steps and FFTs on low- and high-frequency portions of the solution separately, instead of a number of Krylov projection steps on the entire solution that grows substantially with the number of grid points. The component-wise approach employed by KSS methods, in which each Fourier coefficient of a matrix function-vector product φ(τA)b is computed using a frequency-dependent interpolant of φ, allows the Krylov subspace dimension to be bounded independently of the grid size and instead determined by the desired temporal order of accuracy (for the high-frequency part) and by the number of Krylov projection steps required on a coarse grid (for the low-frequency part).

Future work on KSS-EPI methods will focus on the computation of low-frequency Fourier components of the solution, as this step accounts for the most significant part of the running time. This work requires an efficient approach to automatically selecting the threshold Nc that is used to distinguish low-frequency from high-frequency components. Such adaptivity can be based on the smoothness of the solution and the performance of Krylov projection during previous time steps. In addition, computing low-frequency components with methods other than Krylov projection, including Leja interpolation and adaptive Krylov projection, will be investigated. Such combination requires examination of error estimation and stopping criteria for these methods, to determine whether their convergence can be accelerated if it is known that the initial vector represents a bandlimited function.

It is important to note that the decomposition of the block Gaussian quadrature nodes into frequency-dependent and frequency-independent nodes is not limited to cases in which Fourier decomposition is used. In [26], the same idea is used for PDEs defined on a disk, in which a Legendre polynomial expansion is used for the radial part of the solution. The decoupling of the eigenvalue problem for TK first observed in [11], and further exploited in [6, 7], occurs due to the rapid decay of the coefficients in whatever series expansion is used to represent the solution. Future work will include taking advantage of this behavior in order to generalize KSS methods to PDEs defined on non-rectangular domains.

## References

1. 1. Tokman, M.: Efficient integration of large stiff systems of ODEs with exponential propagation iterative (EPI) methods. J. Comp. Phys. 213 (2006) 748–776.
2. 2. Loffeld, J., Tokman, M.: Comparative performance of exponential, implicit and explicit integrators for stiff systems of ODEs. Submitted.
3. 3. Lambers, J. V.: Enhancement of Krylov subspace spectral methods by block Lanczos iteration. Electron. T. Numer. Ana. 31 (2008) 86–109.
4. 4. Lambers, J. V.: An explicit, stable, high-order spectral method for the wave equation based on block Gaussian quadrature. IAENG Journal of Applied Mathematics 38 (2008) 333–348.
5. 5. Guidotti, P., Kim, Y., Lambers, J. V.: Image restoration with a new class of forward-backward diffusion equations of Perona-Malik type. SIAM Journal on Imaging Sciences 6(3) (2013) 1416–1444.
6. 6. Cibotarica, A.: Solution of Nonlinear Time-Dependent PDE Through Componentwise Approximation of Matrix Functions. Ph.D. Dissertation, The University of Southern Mississippi, 2015.
7. 7. Cibotarica, A., Lambers, J. V., Palchak, E. M.: Solution of nonlinear time-dependent PDE through componentwise approximation of matrix functions. Submitted.
8. 8. Hochbruck, M., Lubich, C.: A Gautschi-type method for oscillatory second-order differential equations. Numer. Math. 83 (1999) 403–426.
9. 9. Hochbruck, M., Lubich, C.: On Krylov subspace approximations to the matrix exponential operator. SIAM J. Numer. Anal. 34 (1996) 1911–1925.
10. 10. Hochbruck, M., Lubich, C., Selhofer, H.: Exponential integrators for large systems of differential equations. SIAM J. Sci. Comput. 19 (1998) 1552–1574.
11. 11. Palchak, E. M., Cibotarica, A., Lambers, J. V.: Solution of time-dependent PDE through rapid estimation of block Gaussian quadrature nodes. Lin. Alg. Appl. 468 (2015) 233–259.
12. 12. Golub, G. H., Meurant, G.: Matrices, moments and quadrature. Proceedings of the 15th Dundee Conference, June-July 1993, Griffiths, D. F., Watson, G. A. (eds.), Longman Scientific & Technical (1994)
13. 13. Lambers, J. V.: Explicit high-order time-stepping based on componentwise application of asymptotic block Lanczos iteration, Numerical Linear Algebra with Applications 19(6) (2012) 970–991.
14. 14. Atkinson, K.: An Introduction to Numerical Analysis, 2nd Ed. Wiley (1989)
15. 15. Golub, G. H., Underwood, R.: The block Lanczos method for computing eigenvalues. Mathematical Software III, J. Rice Ed., (1977) 361–377.
16. 16. Lambers, J. V. Krylov subspace spectral methods for the time-dependent Schrödinger equation with non-smooth potentials. Numerical Algorithms 51 (2009) 239–280.
17. 17. Lambers, J. V.: Krylov subspace methods for variable-coefficient initial-boundary value problems. Ph.D. Thesis, Stanford University SC/CM program, 2003.
18. 18. Lambers, J. V.: A spectral time-domain method for computational electrodynamics. Advances in Applied Mathematics and Mechanics 1(6) (2009) 781–798.
19. 19. Saylor, P. E., Smolarski, D. C.: “Why Gaussian quadrature in the complex plane?” Numerical Algorithms 26 (2001) 251–280.
20. 20. Golub, G. H., van Loan, C. F.: Matrix Computations, 3rd Ed., Johns Hopkins University Press (1996).
21. 21. Saad, Y.: Numerical Methods for Large Eigenvalue Problems. Halsted Press, New York (1992).
22. 22. Goodman, J., Hou, T., Tadmor, E.: On the stability of the unsmoothed Fourier method for hyperbolic equations. Numer. Math. 67(1) (1994) 93–129.
23. 23. Bergamaschi, L., Caliari, M., Martnez, A., Vianello, M.: Comparing Leja and Krylov approximations of large scale matrix exponentials. Lecture Notes in Comput. Sci., 6th International Conference, Reading, UK, May 28–31, 2006, Proceedings, Part IV, Alexandrov, V. N., van Albada, G. D., Sloot, P. M. A., Dongarra, J. (eds.), Springer (2006) 685–692.
24. 24. Rainwater, G., Tokman, M.: A new class of split exponential propagation iterative methods of Runge-Kutta type (sEPIRK) for semilinear systems of ODEs. J. Comp. Phys. 269 (2014) 40–60.
25. 25. Tokman, M.: A new class of exponential propagation iterative methods of Runge-Kutta type (EPIRK). J. Comp. Phys. 230 (2011) 8762–8778.
26. 26. Richardson, M., Lambers, J. V.: Krylov subspace spectral methods for polar and cylindrical geometries. In preparation.

Written By

James V. Lambers, Alexandru Cibotarica and Elisabeth M. Palchak

Submitted: 08 October 2015 Reviewed: 14 January 2016 Published: 06 July 2016