Fast Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering

An accurate numerical solution of Electromagnetic scattering problems is critically demanded in the simulation of many real-life applications, such as in the design of industrial processes and in the study of wave propagation phenomena. Electromagnetic (EM) scattering problems address the physical issue of computing the diffraction pattern of the EM radiation that is propagated by a complex body, illuminated by an incident wave. An explicit solution is possible only for simple targets, e.g. for spherical bodies; complicated geometries impose to use approximate numerical techniques.


Introduction
An accurate numerical solution of Electromagnetic scattering problems is critically demanded in the simulation of many real-life applications, such as in the design of industrial processes and in the study of wave propagation phenomena.Electromagnetic (EM) scattering problems address the physical issue of computing the diffraction pattern of the EM radiation that is propagated by a complex body, illuminated by an incident wave.An explicit solution is possible only for simple targets, e.g. for spherical bodies; complicated geometries impose to use approximate numerical techniques.
Until the emergence of high-performance computing in the early eighties, the analysis of scattering problems was afforded by using approximate high frequency techniques such as the shooting and bouncing ray method (SBR) (Lee et al. (1988)).Ray-based asymptotic methods like SBR and the uniform theory of diffraction are based on the idea that EM scattering becomes a localized phenomenon as the size of the scatterer increases with respect to the wavelength.In the last decades, due to impressive advances in computer technology and the introduction of innovative algorithms with limited computational and memory requirement, a more rigorous numerical solution has become possible for many practical applications.
Finite-difference (FD) (Kunz & Luebbers (1993); Taflove (1995)), finite-element (FE) (Silvester & Ferrari (1990); Volakis et al. (1998)) and finite-volume (FV) methods (Bonnet et al. (1998); Botha (2006)) can be used to discretize the Maxwell's equations into a finite volume surrounding the scatterer, giving rise to sparse systems of linear equations.Upon inversion of the system, a solution is computed for all excitations.More recently, alternative approaches based on integral equations are becoming increasingly popular for solving high-frequency EM scattering problems.They reformulate the Maxwell's equations in the frequency domain and solve for the electric and the magnetic currents induced on the surface of the object.Thus integral methods require only a simple description of the surface of the target by means of triangular facets (see an example of discretization in Figure 1).This means that a 3D problem is reduced to solving a 2D surface problem, simplifying considerably the mesh generation especially in the case of moving objects.No artificial boundaries need to be imposed and boundary conditions are automatically satisfied in the case of perfectly conducting objects.
Another interest to use surface discretizations is that they noticeably reduce the effect of grid dispersion errors.Grid dispersion errors occur when a wave has a different phase velocity on the grid compared to the exact solution; they tend to accumulate in space and may introduce spurious solutions over large 3D simulation regions (Bayliss et al. (1985); Jr. (1994); Lee & Cangellaris (1992)).For second-order accurate differential schemes, to alleviate this problem the grid density may grow up to O((kd) 3 ) unknowns in 2D and of O((kd) 4.5 ) in 3D, where k is the wavenumber and d is the approximate diameter of the simulation region.Therefore, the overall solution cost may increase considerably also for practical (i.e.finite) values of wavenumber (Chew et al. (1997)).
Boundary element discretizations are applied in many scientific and engineering areas beside electromagnetics and acoustics, e.g. in biomagnetic and bioelectric inverse modeling, magnetostatic and biomolecular problems, and many other applications (Forsman, Gropp, Kettunen & Levine (1995); Yokota, Bardhan, Knepley, Barba & Hamada (2011)).The potential drawback is that they lead, upon discretization, to large and dense linear systems to invert.Hence fast numerical linear algebra methods and efficient parallelization techniques are urged for solving large-scale boundary element equations efficiently on modern computers.In this chapter we overview some relevant techniques.In Section 2 we introduce the boundary integral formulation for EM scattering from perfectly conducting objects.In Section 4 we discuss fast iterative solution strategies based on preconditioned Krylov methods for solving the dense linear system arising from the discretization.In Section 5 we focus our attention on the design of the preconditioner, that is a critical component of Krylov methods in this context.We conclude our study in Section 5 with some final remarks.

The integral equation context
In an integral equation context, the standard EM scattering problem may be formulated in variational form as follows: Find the surface current j such that for all tangential test functions j t , we have  Γ is the boundary of the object, k the wave number and Z 0 = μ 0 /ε 0 the characteristic impedance of vacuum (ǫ is the electric permittivity and μ the magnetic permeability).Given a continuously differentiable vector field j(x) represented in Cartesian coordinates on a 3D Euclidean space as j(x 1 , x 2 , x 3 )=j , where e x 1 , e x 2 , e x 3 are the unit basis vectors of the Euclidean space, we denote by div j(x) the divergence operator defined as The EFIE formulation can be applied to arbitrary geometries such as those with cavities, disconnected parts, breaks on the surface; hence, it is very popular in industry.
For closed targets, the Magnetic Field Integral Equation (MFIE) can be used, which reads The operator R ext j is defined as and is evaluated in the domain exterior to the object.
Both formulations suffer from interior resonances which make the numerical solution more problematic at some frequencies known as resonant frequencies, especially for large objects.
The problem can be solved by combining linearly EFIE and MFIE.The resulting integral equation, known as Combined Field Integral Equation (CFIE), is the formulation of choice for closed targets.We point the reader to Gibson (2008) for a thorough presentation of integral equations in electromagnetism.
On discretizing Eqn.(1) in space by the Method of Moments (MoM) over a mesh containing n edges, the surface current j is expanded into a set of basis functions { ϕ i } 1≤i≤n with compact support (the Rao-Wilton-Glisson basis, Rao et al. (1982), is a popular choice), then the integral equation is applied to a set of tangential test functions j t .Selecting j t = ϕ j , we are led to compute the set of coefficients {λ i } 1≤i≤n such that for each 1 ≤ i ≤ n.The set of equations (2) can be recast in matrix form as The set of unknowns are associated with the vectorial flux across an edge in the mesh.The right-hand side varies with the frequency and the direction of the illuminating wave.

Fast matrix solvers for boundary element equations
Linear systems issued from boundary element discretizations may be very large in applications, although their size is typically much smaller compared to those arising from FE or FV formulations of the same problem.The number of unknowns grows linearly with the size of the scatterer and quadratically with the frequency of the incoming radiation (Bendali (1984)).A target with size of a few tens of wavelength, illuminated at O(1) GHz of frequency, may lead to meshes with several million points (Sylvand (2002)).Some efficient out-of-core dense direct solvers based on variants of Gaussian elimination have been proposed for solving blocks of right-hand sides, see e.g.Alléon, Amram, Durante, Homsi, Pogarieloff & Farhat (1997); Chew & Wang (1993).However, the memory requirements of direct methods are not affordable for solving such systems in realistic applications, even on modern parallel computers.Iterative methods can solve the problems of space of direct methods because they are based on matrix-vector (M-V) multiplications.In general terms, a modern integral equation solver is the mix of a robust iterative method, a fast algorithm for computing cheap approximate M-V products, and an efficient preconditioner to speed-up the convergence.

The choice of the iterative method
Krylov methods are among the most popular accelerators because of their ability to deliver good rates of convergence and to handle very large problems efficiently.They look for the solution of the system Ax = b in the Krylov space K k (A, b)=span{b, Ab, A 2 b, ..., A k−1 b}.This is a good space from which to construct approximate solutions for a nonsingular linear system because it is intimately related to A −1 .In fact the inverse of any nonsingular matrix A can be written in terms of powers of A with the help of the minimal polynomial q(t) of A, which is the unique monic polynomial of minimal degree such that q(A)=0.If λ 1 , ..., λ d are the distinct eigenvalues of A, m j is the index of λ j , and we define m as Writing q(t) in the form ).An asterisk " * " indicates the fastest run.The problem is shown in Figure 2 we have This shows that, if the minimal polynomial of A has degree m, then the solution of Ax = b lies in the space K m (A, b).The smaller the degree of the minimal polynomial, the faster the expected rate of convergence of a Krylov method (see Ipsen & Meyer (1998)).
One issue is the choice of the suitable Krylov algorithm.Most integral formulations for surface and hybrid surface/volume scattering give rise to indefinite linear systems that cannot be solved using the Conjugate Gradient method (see discussions in Section 2).The GMRES method by Saad & Schultz (1986) is virtually always used for solving dense non-Hermitian linear systems as it is an optimal iterative solver, in the sense that it minimizes the 2-norm of the residual over the corresponding Krylov space.It generally requires the least number of iterations to converge.However, the optimality of GMRES comes at a price.The cost of applying the method increases with the iterations, and it may sometimes become prohibitively expensive for solving practical applications.As an attempt to limit the costs of GMRES, the algorithm is often restarted.After a given number of steps k, the approximate solution is computed from the generated Krylov subspace.Then the Krylov subspace is destroyed, and a new space is reconstructed using the latest residual.
On the other hand, non-optimal methods attempt to limit the costs of GMRES while preserving its favourable convergence properties.In Table 1, we show the number of iterations required by Krylov methods to reduce the initial residual to O(10 −8 ) starting from the zero vector on the problem shown in Figure 2.For simplicity, the right-hand side of the linear system is set up so that the initial solution is the vector of all ones.We do not use preconditioning.In addition to restarted GMRES, we consider complex versions of iterative algorithms based on Lanczos biorthogonalization, such as BiCGSTAB (van der Vorst (1992)) and QMR (Freund & Nachtigal (1994)) and on the recently developed Lanczos biconjugate A-orthonormalization, such as BiCOR and CORS (Carpentieri et al. (2011); Jing, Huang, Zhang, Li, Cheng, Ren, Duan, Sogabe & Carpentieri ( 2009)).We clearly observe the importance of the choice of the iterative method.In our experiments, the CORS method is the fastest non-Hermitian solver with respect to CPU time on most selected examples except GMRES with large restart.Indeed, unrestarted GMRES may outperform all other Krylov methods and should be used when memory is not a concern.However, reorthogonalization costs may penalize the GMRES convergence in large-scale applications, so using high values of restart may not be convenient (or even not affordable for the memory) as shown in Carpentieri et al. (2005).In Table 1 we select a value of 50 for the restart parameter.solution in the Krylov subspace K m (A, r 0 ) by applying a Petrov-Galerkin approach and imposing the residual be orthogonal to the constraints subspace A H K m A H , r * 0 ; the shadow residual r * 0 is chosen to be equal to r * 0 = Ar 0 .The basis vector representations for the subspaces K m (A, r 0 ) and A H K m A H , r * 0 are computed by means of the biconjugate A-Orthonormalization procedure.Starting from two vectors v 1 and w 1 chosen to satisfy the condition ω H 1 Av 1 , the method ideally builds up a pair of biconjugate A-orthonormal bases v j , j = 1, 2, . . ., m and w i , i = 1, 2, . . ., m, respectively for the dual Krylov subspaces K m (A; v 1 ) and K m (A H ; w 1 ), satisfying the condition ω H i Av j = δ i,j ,1 ≤ i, j ≤ m.We point the reader to Jing, Carpentieri & Huang (2009) for further experiments with iterative Krylov methods for surface integral equations.
A significant amount of work has been devoted in the last years to design fast algorithms that can reduce the O(n 2 ) computational complexity for the M-V product operation required at each step of a Krylov method, such as the Fast Multipole Method (FMM) (Greengard & Rokhlin (1987); Rokhlin (1990)), the panel clustering method (Hackbush & Nowak (1989)), the H-matrix approach (Hackbush (1999)), wavelet techniques (Alpert et al. (1993); Bond & Vavasis (1994)), the adaptive cross approximation method (Bebendorf (2000)), the impedance matrix localization method (Canning (1990)), the multilevel matrix decomposition algorithm (Michielssen & Boag (1996)) and others.In particular, the combination of iterative Krylov subspace solvers and FMM is a popular approach for solving integral equations.For Helmholtz and Maxwell problems, FMM algorithms enable to speedup M-V multiplications with boundary element matrices down to O(n log n) algorithmic and memory complexity depending on the problem and on the specific implementation, see e.g.Cheng et al. (2006); Darrigrand (2002); Darve & Havé (2004); Dembart & Epton (1994); Engheta et al. (1992); Song & Chew (1995); Tausch (2004).Two-level implementations of FMM can reduce the cost of the M-V product operation from O(n 2 ) to O(n 3/2 ), a three level algorithm down to O(n 4/3 ) and the Multilevel Fast Multipole Algorithm (MLFMA) to O(n log n).
Multipole techniques exploit the rapid decay of the Green's function and compute interactions amongst degrees of freedom in the mesh at different levels of accuracy depending on their physical distance.The 3D mesh of the object is partitioned recursively into boxes of roughly equal size until the size becomes small compared with the wavelength.The hierarchical partitioning of the object is typically represented using a tree-structured data called oct-tree (see Figure 3).Multipole coefficients are computed for all boxes starting from the smallest Fig. 3.The oct-tree data structure representation in the FMM algorithm.Each cube has up to eight children and one parent box except for the largest cube which encloses the whole domain.
ones, that are the leaves, and recursively for each parent cube in the tree by summing together multipole coefficients of its children.Interactions of degrees of freedom within one observation box and its close neighboring boxes are computed exactly using MoM; depending on the frequency, they generate between 1% and 2% of the entries of A. Interactions with boxes that are not neighbors of the observation box but whose parent in the oct-tree is a neighbor of the box parent are computed using FMM (see Figure 4).All other interactions are computed Fig. 4. Interactions in the multilevel FMM algorithm.Interactions for the gray boxes are computed directly.We denote by dashed lines cubes that are not neighbors of the cube itself but whose parent is a neighbor of the cube's parent.These interactions are computed using the FMM.All other interactions are computed hierarchically on a coarser level.
hierarchically on a coarser level by traversing the oct-tree.Multiple techniques have been efficiently implemented on distributed memory parallel computers proving to be scalable to several million discretization points, see for instance the FISC code developed at University of Illinois by Song & Chew (1998); Song et al. (1997;1998), the INRIA/EADS integral equation code AS_ELFIP by Sylvand (2002;2003), the Bilkent University code by Ergül & Gürel (2007;2008) and others.

Algebraic preconditioning for boundary integral equations
Krylov methods may converge very slowly in practice, mainly due to bad spectral conditioning of the linear system.Relation (4) implies that the dimension of the solution space, and therefore the convergence properties, are mostly dictated by the eigenvalue distribution of A. The spectral properties may vary noticeably depending on the integral operator as well as on object shape and material.Problems with cavities or open surfaces typically require more iterations to converge than closed objects of the same physical size, and nonuniform meshes often produce ill-conditioned MoM matrices.On EFIE, the iteration count of Krylov solvers may increase as O(n 0.5 ) when the number of unknowns n is related to the wavenumber, see for instance experiments reported in Song & Chew (1998), whereas on CFIE the number of iterations typically increases as O(n 0.25 ).
On the other hand, if preconditioning A by a nonsingular matrix M the eigenvalues of M −1 A fall into a few clusters, say t of them, whose diameters are small enough, then M −1 A behaves numerically like a matrix with t distinct eigenvalues.As a result, we would expect t iterations of a Krylov method to produce reasonably accurate approximations.The matrix M is called the preconditioner matrix; preconditioning can be applied from the left as M −1 Ax = M −1 b as well as from the right as AM −1 y = b with x = M −1 y.
Optimal analytic preconditioners have been proposed for surface integral equations, see e.g.Antoine et al. (2004); Christiansen & Nédélec (2002); Steinbach & Wendland (1998).But they are problem-dependent.In this study, we consider purely algebraic techniques which compute the preconditioner only using information contained in the coefficient matrix of the linear system.Although far from optimal for any specific problem, algebraic methods can be applied to different operators and to changes in the geometry only by tuning a few parameters, and may often be developed from existent public-domain software implementations.
We are interested to develop techniques that have O(n log n) algorithmic and memory complexity in the construction and in the application phase like FMM, and may be implemented efficiently within multipole codes.For memory concerns, we compute the preconditioner by initially decomposing the linear system in the form where S is a sparse matrix retaining the most relevant contributions to the singular integrals and is easy to invert, while B can be dense.If the continuous operator S underlying S is bounded and the operator B underlying B is compact, then S −1 B is compact and We may expect that the preconditioned system I + S −1 B x = S −1 b has a good clusterization of eigenvalues close to one, see e.g.Chen (1994) and (Chen, 2005, pp. 182-185).
The simplest approach to compute the local matrix S is to drop the small entries of A below a threshold (Cosnau (1996); Kolotilina (1988); Vavasis (1992)).When all the entries of A are not explicitly available, it may be necessary to use information extracted from the physical mesh of the problem.In an integral equation context, the surface of the object is discretized using a triangular mesh; each degree of freedom (DOF), or equivalently each unknown of the linear system, is associated to an edge of the mesh.Therefore, the sparsity pattern of S can be defined according to the concept of level k neighbours (see Figure 5(a)).Level 1 neighbours of a DOF are the DOF plus the four DOFs belonging to the two triangles that share the edge corresponding to the DOF itself.Level 2 neighbours are all the level 1 neighbours plus the DOFs in the triangles that are neighbours of the two triangles considered at level 1, and so forth.Due to the very localized nature of the Green's function, by retaining a few (two or three) levels of neighbours for each DOF an effective approximation may be constructed.
Comparative experiments show that there is little to choose.Both matrix-and mesh-based approaches can provide very good approximations S to the dense coefficient matrix for low sparsity ratio between 1% and 2% (Carpentieri et al. (2000)).The mesh-based approach is straightforward to implement in FMM codes as the object is typically partitioned using geometric information (see Figure 5

(b)). Multipole algorithms yield a matrix decomposition
where A diag is the block diagonal part of A associated with interactions of basis functions belonging to the same box, A near is the block near-diagonal part of A associated with interactions within one level of neighboring boxes (they are 8 in 2D and 26 in 3D), and A far is the far-field part of A. Therefore, in a multipole setting a suitable choice for the local matrix may be S = A diag + A near .

Comparison of standard preconditioners
To illustrate the difficulty of finding a good preconditioner for this problem class, in Table 2 we report one experiments with the GMRES solver and various algebraic preconditioners applied to a scattering problem from an open cylindric surface illuminated at 200 MHz of frequency and modeled using EFIE.The system has n = 1299 unknowns and is a low resolution testcase than the problem in Figure 2. In connection with GMRES, we consider preconditioners M of either implicit type (which approximately factorize S) or of explicit type (which approximately invert S) at roughly the same number of nonzero entries in M. We adopt the following acronyms: • None, means that no preconditioner is used; • Diag, a simple diagonal scaling, i.e.M is the diagonal of S; • SSOR, the symmetric successive overrelaxation method M = , where we denote by D the diagonal of S and E is the strict lower triangular part of S; • ILU(0) by Saad (1996), the lower/upper incomplete LU factorization M = L U, L ≈ L, U ≈ U, S = LU, where the sparsity pattern of L (resp.U) is equal to that of the lower (resp.upper) triangular part of S; • SPAI by Grote & Huckle (1997), an approximate inverse preconditioner M ≈ S −1 computed by minimizing I − SM F .The same pattern of S is imposed to M.
• AINV by Benzi et al. (1996), a sparse approximate inverse computed in factorized form by applying an incomplete biconjugation process to S, and dropping small entries below a threshold in the inverse factors.
Precond.GMRES(30) GMRES( 80  We see that many standard methods fail.Simple preconditioners, like the diagonal of A, diagonal blocks, or a band, may be effective when the coefficient matrix has some degree of diagonal dominance (Song et al. (1997)).For ill-conditioned and indefinite matrices, more robust methods are needed.Techniques that are successful for solving partial differential equations may be successfully adopted for integral equations; in the next section, we analyse some of these methods.

Sparse approximate inverses preconditioner
Approximate inverse methods are very attractive for parallelism.They explicitly compute and store an approximation of the inverse of the coefficient matrix M ≈ S −1 , which may be used as preconditioner by performing one or more sparse M-V products operations at each step of an iterative solver.As shown in Figure 6, due to the rapid decay of the Green's function the entries of A −1 may have a very similar structure to those of A, so that a very sparse preconditioner M may effectively capture the large contributions to the inverse.The actual entries of M may be computed by minimizing the error matrix I − SM F for right preconditioning ( I − MS F resp. left preconditioning).The Frobenius-norm allows to decouple the constrained minimization problem into n independent linear least-squares problems, one for each column (resp.row) of M when preconditioning from the right (resp.from the left).The independence of the least-squares problems can be immediately seen from the identity where e j is the jth canonical unit vector and m •j is the column vector representing the jth column of M. In the case of right preconditioning, the analogous relation holds, where m j• is the column vector representing the jth row of M. The preconditioner is not guaranteed to be nonsingular in general, and additionally it does not preserve any possible symmetry of A. The condition to ensure non-singularity of M may be derived from the following estimates of the accuracy of the approximate inverse (Grote & Huckle (1997)): 165 Fast Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering www.intechopen.comTHEOREM 1.Let r j = Sm j − e j be the residual associated with column m j for j = 1, 2, . . ., n, and q = max 1≤j≤n nnz r j ≪ n.Suppose that r j 2 < t for j = 1, 2, . . ., n, then we have Owing to this result, all the eigenvalues of SM lie in the disk centered in 1 and of radius √ qt; the value of q is not known a priori, though, so that one might enforce the condition √ nt < 1 to prevent singularity or near-singularity of the preconditioned matrix.In practice it may be too costly to compute M with such a small t.For some problems, it may be observed a lack of robustness of the approximate inverse due to the clustering of small eigenvalues in the spectrum of the preconditioned matrix.Stabilization techniques based on eigenvalue deflation may be used to enhance the robustness of M, see e.g.Carpentieri et al. (2003).
The most critical component is the computation of the nonzero structure of M. From Figure 6, we see that the sparse pattern of S may be a suitable pattern for M. Denoting by the nonzero structure of the approximate inverse, we may automatically determine the pattern of the nonzero entries of the jth column of M as and compute the associated entries by solving a small size dense least-squares problem.The least-squares solution involves only those columns of S indexed by C j ; we indicate this subset by S(:, C j ).Because S is sparse, many rows in S(:, C j ) are usually null, not affecting the solution of the least-squares problems (7).Thus denoting by R j the set of indices corresponding to the nonzero rows in S(:, C j ),b y S = S(R j , C j ),b y m j = m j (C j ), and by e j = e j (C j ), the actual "reduced" least-squares problems to solve are min e j − S m j 2 , j = 1, .., n.
Usually problems (9) have much smaller size than problems (7) and can be efficiently solved by dense QR factorization.The parallel implementation of the approximate inverse is highly scalable as shown in Table 3, while the numerical performance typically tend to deteriorate for increasing matrix size as can be seen in Table 4.
Approximate inverses may be also computed in factorized form as M = G Z, where G ≈ U −1 and Z ≈ U −1 are approximation of the inverse triangular factors of S, see for instance Alléon, Benzi & Giraud (1997); Chen (1998); Rahola (1998); Samant et al. (1996).One example of such preconditioner is the AINV method by Benzi et al. (1996), a sparse approximate inverse computed in factorized form by applying an incomplete biconjugation process to S and dropping small entries below a threshold in the inverse factors.However, disappointing results with factorized approximate inverses have been reported on this problem class, see e.g.Carpentieri et al. (2004).The reason of failure is that for many integral formulations like EFIE and CFIE, the inverse factors may be totally unstructured.a priori the sparse pattern for the factors can be extremely hard and dynamic pattern selection strategies, that drop small entries below a user-defined threshold during the computation, may be very difficult to tune as they can easily discard relevant information and lead to a very poor preconditioner.For those problems, finding the appropriate threshold to enable a good trade-off between sparsity and numerical efficiency is challenging and very problem-dependent.

Incomplete LU factorization preconditioner
ILU-type methods compute an approximate triangular decomposition of S by means of an incomplete Gaussian elimination process.The ILU preconditioner writes as M = L U, L ≈ L, U ≈ U where L and U denote respectively the lower and upper triangular factors of the standard LU factorization of S. This class of methods is virtually always used for solving sparse linear systems.However, mixed success is reported on dense matrix problems, due to the indefiniteness of the systems arising from the discretization.The root of the problem is that small pivots often appear during the factorization, leading to highly ill-conditioned triangular factors and unstable triangular solves (Carpentieri et al. (2004)).
In Table 5 we show an experiment with an ILU preconditioner computed from the sparse approximation S to A, using different values of density for S. The test case is a sphere of 1 meter length illuminated at 300 MHz; the problem is modeled using EFIE and the mesh is discretized with 2430 edges.The set F of fill-in entries to be kept for the approximate lower triangular factor L is defined by where the integer ℓ denotes a user specified maximal fill-in level.The level lev(l k,i ) of the coefficient l k,i of L is computed as follows: 167 Fast Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering Observe that the larger ℓ, the higher the density of the preconditioner.We denote the resulting preconditioner by ILU(ℓ) Saad (1996).
In our results, increasing the fill-in parameter may produce much more robust preconditioners than ILU(0) applied to a denser sparse approximation of the original matrix; ILU(1) may deliver a good rate of convergence provided the coefficient matrix is not too sparse.However, the factorization of a very sparse approximation (up to 2%) of the coefficient matrix can be stable and accelerate significantly the convergence, especially if at least one level of fill-in is retained.Then, for higher values of the density of S the factors may become progressively ill-conditioned, the triangular solves unstable and consequently the preconditioner is useless.
The table also shows that ill-conditioning of the factors is not related to ill-conditioning of A.

Density of S =2%
IC(level) Density of L κ ∞ (L) GMRES( 30) GMRES( 50) IC( 0 The symbol '-' means that convergence was not obtained after 500 iterations. A complex diagonal compensation can help to compute a more stable preconditioner, by shifting along the imaginary axis the eigenvalues close to zero in the spectrum of the coefficient matrix.However, the value of the shift is not easy to tune a priori and its effect on the convergence is difficult to predict (Carpentieri et al. (2004)).Pivoting may be a more robust approach to overcome the problem according to reported experiment by Malas & Gürel (2007); in this case, the ith row of the factor is computed as soon as permtol × s ij > |s ii |, where permtol is the permutation tolerance and s ij are the entries of S.
We follow a different approach.We report on experiments with multilevel inverse-based ILU factorization methods to possibly remedy numerical instabilities.Following Bollhöfer & Saad (2006), we initially rescale and reorder the initial matrix A as which yields Â x = b for appropriate x, b.The initial step may consist of an optional maximum weight matching (Duff & Koster (1999)).By rescaling and a one-sided permutation, it attempts to improve the diagonal dominance.After that, a symmetric reordering is applied to reduce the fill-/bandwidth.The latter can also be used without an a priori matching step, only rescaling the entries and symmetrically permuting the rows and the columns.This is of particular interest for (almost) symmetrically structured problems.Next, an inverse-based ILU with static diagonal pivoting is computed.I.e., during the approximate incomplete factorization Â ≈ LDU such that L, U H are unit lower triangular factors and D is block diagonal, the norms L −1 , U −1 are estimated.If at factorization step l a prescribed bound κ is exceeded, the current row l and column l are permuted to the lower right end of the matrix.Otherwise the approximate factorization is continued.One single pass leads to an approximate partial factorization with a suitable leading block B and a suitable permutation matrix Π, where L −1 1 ≤κ, U −1 1 ≤κ.The remaining system S C approximates C − EB −1 F. From the relations at each step of an iterative solver we need to store and invert only blocks with B and S C ≈ C − EB −1 F while for reasons of memory efficieny, L E , U F are discarded and implicitly represented via When the scaling, preordering and the factorization is successively applied to S C , a multilevel variant of (10) is computed.E.g., after a one additional level we obtain The multilevel algorithm ends at some step m when either S C is factored completely or it becomes considerably dense and switches to a dense LAPACK solver.After computing an m-step ILU decomposition, for preconditioning we have to apply L  5.

Concluding remarks
We have discussed some fast iterative solution techniques for solving surface boundary integral equations.High-frequency simulations of large structures are extremely demanding for scalable solvers and large computing resources.We have reviewed recent advances for the class of Krylov subspace methods, sparse approximate inverses, incomplete LU factorizations.
Other approach have been applied in this area of research.Multigrid methods are provably optimal algorithms for solving various classes of partial differential equations.Attempts to apply these techniques to dense linear systems have obtained mixed success.Early experiments on boundary element equations are reported with geometric versions on simple model problems, typically the hypersingular and single-layer potential integral operators arising from the Laplace equation (Bramble et al. (1994); Petersdorff & Stephan (1992); Rjasanow (1987)).Multigrids require a hierarchy of nested meshes to setup the principal components of the algorithm, i.e. a coarsening strategy to decrease the number of unknowns, grid transfer operators to move from a grid to another one, coarse grid operators and smoothing procedure, see e.g.Hackbusch (1985).Thus they are difficult to implement.On the other hand, algebraic multigrid algorithms use only single grid information extracted from either the graph or the entries of the coefficient matrix and are nearly as effective as geometric algorithms in reducing the number of iterations, see e.g Braess (1995); Brandt (1999); Ruge & Stüben (1987);Vanek et al. (1996).Langer et al. propose to apply an auxiliary sparse matrix reflecting the local topology of the mesh on the fine grid to setup all the components of the multigrid algorithm in a purely algebraic setting (Langer et al. (2003)).This gray-box approach is fairly robust on model problems and maintains the algorithmic and memory complexity of the M-V product operation (Langer & Pusch (2005)), thus it is well suited to be combined with MLFMA.See also Carpentieri et al. (2007) for another multigrid-type solver.
Preconditioners based on wavelet techniques are also receiving interest.The wavelet compression of integral operators with smooth kernels yields nearly sparse matrices with at most O(n log a n) nonzero entries, where a is a small constant that depends on the operator and the wavelet used, see e.g. earlier work by Beylkin et al. (1991); Dahmen et al. (1993); Harbrecht & Schneider (2004); Hawkins et al. (2007); Lage & Schwab (1999).The compressed matrix is spectrally equivalent to the original matrix and preconditioning is often needed (Chan & Chen (2000;2002); Chan et al. (1997); Chen (1999); Ford & Chen (2001); Hawkins & Chen (2005); Hawkins et al. (2005)).Some efficient wavelet preconditioning algorithms have been proposed, based on bordered block structure (Ford & Chen (2001); Hawkins et al. (2005)), multi-level preconditioners (Chan & Chen (2002)), and sparse approximate inverses.However, most experiments with wavelet preconditioners are reported for model problems, e.g.Calderon-Zygmund type matrix, single and double layer potentials, the hyper-singular operator.For oscillatory kernels the compressed matrix may be fairly dense and wavelet techniques are less useful.For Helmholtz problems, wavelet Galerkin schemes yield matrices with approximately O(kn) (k is the wavenumber) which becomes O(n 2 ) when the number of unknowns is related to k.
Further investigations are necessary to identify the best class of methods for the given problem and the selected computer hardware.The use of more powerful (but also more complex) computing facilities should help in the search for additional speed, but it will also mean that there will be even more factors that need to be considered when attempting to identify the optimal approach in the future.

Fig. 1 .
Fig. 1.Example of surface discretization in an integral equation context.Each unknown of the problem is associated to an edge in the mesh.Courtesy of the EMC-CERFACS Group in Toulouse.

157Fast
Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering www.intechopen.comwhere A = A ij and b = [b i ] have elements, respectively,

160
Trends in Electromagnetism -From Fundamentals to Applications www.intechopen.comFast Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering 7

Fig. 5 .
Fig. 5. Mesh-based pattern selection strategies to compute local interactions in an integral equation context.

Fig. 6 .
Fig.6.Structure of the large entries of A (on the left) and of A −1 (on the right).Large to small entries are depicted in different colors, from red to green, yellow and blue.The test problem is a small sphere.

Table 2 .
Number of iterations using GMRES and various preconditioners on a test problem, a cylinder with an open surface, discretized with n = 1299 edges.The tolerance is set to 10 −8 .The symbol '-' means that no convergence was achieved after 1000 iterations.The results are for right preconditioning.

Table 3 .
Parallel scalability of the approximate inverse for solving large-scale boundary integral equations on a model problem.

Table 4 .
Numerical scalability of the approximate inverse for solving large-scale boundary integral equations.The symbol • means run on 32 processors.
Notation: m means minutes, h hours.

Table 5 .
Number of iterations of GMRES varying the sparsity level of S and the level of fill-in of the approximate factor L on a spherical model problem (n = 2430, κ ∞

Table 6 .
Monitoring the growth of these two quantities during the partial factorization 169 Fast Preconditioned Krylov Methods for Boundary Integral Equations in Electromagnetic Scattering www.intechopen.com is essential to preserve the numerical stability of the solver, as can be observed comparing results in Table5 and Table 6.Number of iterations of GMRES using a multilevel inverse-based ILU factorization as preconditioner.The model problem is the same as in Table