Wavelets for Differential Equations and Numerical Operator Calculus

Differential equations are commonplace in engineering, and a great deal of research has been carried out in developing methods, both efficient and precise, for their numerical solution. Nowadays the numerical practitioner can rely on a wide range of tools for solving differential equations: finite difference methods, finite element methods, meshless methods, and so on. Wavelets, since their appearance in the early 1990s, have attracted attention for their multiresolution nature, which allows them to act as a "mathematical zoom," a characteristic that promises an efficient description of the functions involved in a differential equation, especially in the presence of singularities. The objective of this chapter is to introduce the main concepts of wavelets and differential equations, allowing the reader to apply wavelets to the solution of differential equations and to numerical operator calculus.


Introduction
Partial differential equations (PDEs) are commonplace in science and engineering, where they are used to model the behavior of physical systems. Because of their importance, many numerical techniques for their solution have been developed: finite difference methods (FDMs), finite element methods (FEMs), spectral methods, Ritz/Galerkin approaches, meshless approaches, and so on. The main characteristic of PDEs is that the "unknown" is a function, that is, an object with an infinite number of degrees of freedom. Because of this, it is usually impossible (even in principle) to get an exact solution by numerical means. The objective of every technique for PDE solving is to get a good approximation of the solution with limited computational resources (CPU time, memory, etc.).
The first step of every PDE solution algorithm is to discretize the PDE, that is, to approximate it with a finite-dimensional problem that can be solved by numerical means. A popular discretization technique is to discretize the space where the solution is searched for by restricting the problem to a finite-dimensional vector space, that is, by writing the solution as a linear combination of several basis functions. If the original PDE was linear, discretization will map it to a linear problem (typically a linear system or an eigenvalue problem). Even techniques such as FDM (which discretizes the domain) can often be reformulated as a suitable discretization of the function space.
Intuitively, better approximations of the solution require a finer discretization (the exact meaning of finer depends on the specific approach), especially if the solution has regions of large variability. Since a finer discretization implies larger problems (and, therefore, a higher computational effort), it is of interest to be able to adapt the discretization resolution locally to the solution variability, possibly in an adaptive way.
This need for different resolutions in different regions is the idea that links PDEs with multiresolution analysis. The birth of multiresolution analysis goes back to 1990 with the works of Mallat [1] and Meyer [2]. Since then there has been a large number of papers, ranging from very theoretical ones to applications [3]. In a multiresolution analysis, a space of signals (most commonly L²(ℝ), but not only) is represented as a nesting of spaces with different levels of "resolution." This makes it possible to write a signal as the sum of a "low-resolution" version plus some higher-resolution "details." Because of the ability to change the resolution used to observe the signal (by adding or removing details), multiresolution analysis is sometimes described as a mathematical zoom. This fact has inspired many applications, including the numerical solution of PDEs, where wavelets look promising especially for problems that contain localized phenomena (e.g., shock waves) or inter-scale interaction (e.g., turbulence).
The objective of this chapter is to introduce the reader to the application of wavelets to the solution of PDEs. The chapter can be ideally divided into three parts: in the first part, we briefly recall the main concepts about PDEs and the main algorithms for solving them; then we briefly review multiresolution analysis and wavelets, including multiwavelets and second-generation wavelets, which often find application in PDE solution; finally, we illustrate a few techniques that can be found in the literature.

Notation
Ω ⊆ ℝ^d is the domain where the functions of interest are defined, and ∂Ω denotes its boundary. H(Ω) will denote a space of functions u : Ω → ℝ defined on Ω ⊆ ℝ^d. ℝ≥0 is the set of nonnegative reals.

Generalities on PDE
Most of the physical problems modeled by PDEs fall in one of the following three large classes: equilibrium, propagation, or eigenvalue problems.
• In an equilibrium problem, we are interested in finding a function u ∈ H(Ω) such that D[u] = f on Ω, subject to boundary conditions on ∂Ω (Eqs. (1)-(3)). Here D is an operator that includes derivatives and f ∈ H(Ω) is known. Boundary conditions are typically given as constraints on u or its derivatives on regions of ∂Ω. Typical examples of physical systems giving rise to this type of problem are systems in steady state (e.g., temperature distribution, potential distribution, steady flows, and so on). Typically, equilibrium problems are elliptic, that is, D is an elliptic operator (a generalization of the Laplacian).
• In a propagation problem, we are interested in modeling the time evolution of a physical system. The PDE can still be written as in (1)-(3), but the domain can typically be written as Ω = ℝ≥0 × W, where W ⊆ ℝ^{d−1} and the first coordinate represents time. Boundary conditions for t = 0 are known as initial conditions. Example physical problems are heat or wave propagation. Propagation problems are typically hyperbolic or parabolic.
• Finally, in eigenvalue problems we are interested in finding u and λ that satisfy D[u] = λu together with suitable boundary conditions (Eqs. (4)-(5)). A wide class of eigenvalue problems is represented by Sturm-Liouville problems, which can be written as

(p y′)′ + q y + λ w y = 0  on [a, b],

where the apostrophe denotes derivation, the unknowns are λ and the function y ∈ H([a, b]), while p, q, and w, all belonging to H([a, b]), are known. Sturm-Liouville problems include the Bessel differential equation (obtained by writing the Laplace, Helmholtz, or Schrödinger equation in cylindrical coordinates) and the Legendre differential equation (obtained working in spherical coordinates).

Solution of differential equations
The field of numerical solution of differential equations is very wide, and many techniques have been developed. Nevertheless, a categorization into a few large classes is possible. An important step in every solution algorithm is mapping the differential equation into a discrete version with only a finite number of degrees of freedom. A first distinction can be made between techniques that achieve this objective by discretizing the domain Ω and techniques that discretize the function space H(Ω).

Domain discretization
The best-known technique based on domain discretization is the FDM, where the unknown function is sampled at a finite number of points p_1, p_2, …, p_N ∈ Ω and the derivatives are approximated with finite differences. By writing the differential equation for every p_i, with the derivatives approximated as finite differences, one obtains a system of N equations in N unknowns that can be solved with known techniques. If the original PDE was linear, the discretized system will be linear too.
FDM is perhaps the simplest and most intuitive approach, and it can work quite well for simple problems and geometries. Moreover, in the linear case, since any approximation of a derivative at p considers only a few points around p, the matrix of the discretized linear system will be very sparse, allowing for a reduction in the computational effort. The application of FDM techniques becomes difficult, albeit possible, in the case of complex problems.
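The mechanics above can be sketched in a few lines. The following is a minimal, illustrative FDM solver (not a production implementation) for the one-dimensional model problem −u″ = f on [0, 1] with u(0) = u(1) = 0; the second derivative is replaced by the classic three-point central difference, which yields the familiar sparse tridiagonal system:

```python
import numpy as np

def fdm_poisson_1d(f, n):
    """Solve -u'' = f, u(0) = u(1) = 0, with n interior grid points."""
    h = 1.0 / (n + 1)
    x = np.linspace(h, 1 - h, n)              # interior grid points
    # Tridiagonal matrix of (-u_{i-1} + 2 u_i - u_{i+1}) / h^2
    A = (np.diag(2.0 * np.ones(n))
         - np.diag(np.ones(n - 1), 1)
         - np.diag(np.ones(n - 1), -1)) / h**2
    u = np.linalg.solve(A, f(x))
    return x, u

# f = pi^2 sin(pi x) has the exact solution u(x) = sin(pi x)
x, u = fdm_poisson_1d(lambda t: np.pi**2 * np.sin(np.pi * t), 200)
err = np.max(np.abs(u - np.sin(np.pi * x)))   # O(h^2) discretization error
```

A dense matrix is used here only for brevity; in practice one would store the tridiagonal system in sparse form, which is exactly the sparsity advantage mentioned above.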

Function space discretization
Another class of techniques discretizes the function space H(Ω) by approximating it with an n-dimensional space H_n; that is, the unknown function u is approximated as

u ≈ Σ_{i=1}^n α_i b_i,  (6)

where {b_i}_{i=1}^n is a basis of H_n. By exploiting approximation (6), one can transform PDE (1)-(3) into a finite-dimensional problem. The different solution techniques differ in how (6) is used and in the way of choosing the space H_n and its basis {b_i}_{i=1}^n. One possibility is to choose functions b_i that are infinitely differentiable and nonvanishing on the whole Ω. This gives rise to the so-called spectral methods. Typical choices for the basis functions are complex exponential/sinusoidal functions (if the solution is expected to be periodic), Chebyshev polynomials (for separable domains, e.g., d-dimensional cubes), and spherical harmonics (for systems with spherical symmetry). Spectral methods can work very well if the solution is expected to be smooth; they can even converge exponentially fast. However, their spatial localization is not good, and if the functions involved are not smooth (e.g., they are discontinuous), they lose most of their interest.
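As an illustration of the spectral idea, consider the periodic model problem −u″ = f on [0, 2π]: in the Fourier basis the operator −d²/dx² is diagonal (multiplication by k²), so the discretized problem reduces to dividing Fourier coefficients by k². The example below is a sketch under these assumptions (zero-mean periodic data):

```python
import numpy as np

n = 64
x = 2 * np.pi * np.arange(n) / n
f = np.cos(3 * x)                      # exact solution: cos(3x) / 9
k = np.fft.fftfreq(n, d=1.0 / n)       # integer wavenumbers 0, 1, ..., -1
F = np.fft.fft(f)
U = np.zeros_like(F)
nz = k != 0                            # skip the mean (k = 0) mode
U[nz] = F[nz] / k[nz]**2               # -u'' = f  =>  U_k = F_k / k^2
u = np.fft.ifft(U).real
err = np.max(np.abs(u - np.cos(3 * x) / 9))
```

Since f lives exactly in the trigonometric space, the error is at machine-precision level; this is the exponential ("spectral") accuracy mentioned above, which degrades badly if f is not smooth.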
Another, very popular, approach is FEM, which chooses the functions b_i by first partitioning the domain Ω into a set of elements (triangles and their multidimensional counterparts are a popular choice) and assigning to every element a suitable finite-dimensional vector space. The final approximation of u is constructed in a piecewise fashion by gluing, so to say, the approximations of u over the single elements.
In a typical implementation of FEM, all the elements are affine images of a single reference element T_0. This simplifies the implementation, since it suffices to choose only the vector space of the reference element. A popular choice is to take as the space associated with the elements a space of polynomials. The basis is selected by choosing a set of control points q_1, q_2, … ∈ T_0 and choosing as basis vectors b_i the polynomials that satisfy the interpolation property

b_i(q_j) = δ_{i,j}.  (7)

Remark 2.1 (generalized collocation method). A generalization of this idea is to choose a set of functionals σ_j mapping functions defined over T_0 to ℝ and requiring

σ_j(b_i) = δ_{i,j}.  (8)

Eq. (8) gives back (7) if σ_j is defined as the functional that evaluates its argument at q_j. Eq. (8) is, however, more general than (7), since it can be used, for example, to control the flow through a face of the element.
An issue with FEM is that creating the grid of elements can be expensive. This is especially true in problems where the geometry is not fixed but needs to be updated. An example of this type of system is free-surface fluid flow, where the interface between air and fluid changes with time, requiring a continuous update of the mesh. In order to solve this problem, meshless methods have been developed [4]. A typical meshless approach is to approximate u with a discrete convolution with a chosen kernel function φ, that is,

u(x) ≈ C_ρ Σ_I c_I φ((x − X_I)/ρ),  (9)

where the X_I are a set of points of Ω and ρ is a scale factor that allows changing the "resolution" of the kernel function φ. The coefficient C_ρ can be used to keep the energy constant as ρ is changed.
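The kernel representation above can be sketched numerically. In the example below the kernel (a Gaussian), the centers, the scale factor, and the target function are all illustrative choices, not prescribed by the method, and the coefficients (with C_ρ absorbed into them) are fitted by least squares, one simple option among many:

```python
import numpy as np

X = np.linspace(0.0, 1.0, 25)                 # centers X_I (here on a line)
rho = 0.1                                     # scale factor
phi = lambda r: np.exp(-(r / rho) ** 2)       # chosen kernel function
x = np.linspace(0.0, 1.0, 200)                # evaluation points
u_true = np.sin(2 * np.pi * x)                # function to represent

G = phi(x[:, None] - X[None, :])              # G[j, I] = phi((x_j - X_I)/rho)
c, *_ = np.linalg.lstsq(G, u_true, rcond=None)
err = np.max(np.abs(G @ c - u_true))          # quality of the representation
```

Note that no mesh is involved: only the scattered centers X_I and the kernel are needed, which is precisely the appeal of meshless methods when the geometry changes in time.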

Exploiting the discretization
After expressing u as a linear combination of the b_i, we are left with the problem of determining the coefficients of the linear combination. Several approaches are possible; the easiest way to present them briefly is to rewrite the differential equation in terms of the residual

R[u] = D[u] − f.  (10)

If we restrict u to be a linear combination of the b_i, most probably we will not be able to make residual (10) exactly zero; therefore, we will aim to make it as small as possible. Since the result of the residual operator is a function, there are many possible approaches to minimizing it.
With the collocation approach, we choose a number of points of the domain p_1, p_2, …, p_n ∈ Ω and ask that the residual be zero at the chosen points, that is,

R[u](p_j) = 0,  j = 1, …, n.  (11)

Eq. (11) represents a system of n equations having as unknowns the coefficients α_i; if D is linear, it can be written as

A α = f̄,  (12)

where, clearly, A_{j,i} = D[b_i](p_j) and f̄_j = f(p_j). Note that (12) is a linear system in the unknowns α_i.
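The collocation system (11)-(12) can be sketched for the model problem −u″ = f on [0, 1] with u(0) = u(1) = 0. The basis b_i(x) = sin(iπx) is an illustrative choice (it satisfies the boundary conditions automatically, and D[b_i] = (iπ)² b_i is available in closed form); equispaced interior points are used as collocation points:

```python
import numpy as np

n = 8
p = np.linspace(0, 1, n + 2)[1:-1]             # interior collocation points p_j
i = np.arange(1, n + 1)
# A[j, k] = (D b_k)(p_j) with D = -d^2/dx^2, so D b_k = (k pi)^2 sin(k pi x)
A = (i * np.pi) ** 2 * np.sin(np.outer(p, i) * np.pi)
f = lambda t: np.pi**2 * np.sin(np.pi * t)     # exact solution: sin(pi x)
alpha = np.linalg.solve(A, f(p))               # solve A alpha = f-bar
```

Since the exact solution sin(πx) lies in the chosen space H_n, the computed coefficients are α = (1, 0, …, 0) up to rounding: the collocation equations are satisfied exactly.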

Remark 2.2.
With reference to Remark 2.1, one can generalize the collocation method by using a set of linear functionals σ_j : H(Ω) → ℝ. In this case one obtains a generalized version of (12), namely,

A_{j,i} = σ_j(D[b_i]),  f̄_j = σ_j(f).  (13)

Another approach is to solve R[u] = 0 in a least-squares sense, that is, to search for the coefficients α_i that minimize

‖R[u]‖².  (14)

Standard algebra shows that (14) is minimized when R[u] is orthogonal to ∂R[u]/∂α_i for every i, that is,

⟨R[u], ∂R[u]/∂α_i⟩ = 0,  i = 1, …, n.  (15)

If D is linear, then ∂R[u]/∂α_i = D[b_i], and we get

Σ_j ⟨D[b_j], D[b_i]⟩ α_j = ⟨f, D[b_i]⟩,  i = 1, …, n,

which is still a linear system. The Galerkin method is inspired by the idea that in a least-squares approximation, the error is orthogonal to the space where the approximating function lives. We would like to approximate the solution of the PDE with a vector of H_n; however, we do not know the solution, so we ask for the residual to be orthogonal to H_n, that is,

⟨D[u] − f, v⟩ = 0  for every v ∈ H_n,  (19)

which can be interpreted as the original differential equation D[u] = f in weak form. Form (19) is often exploited by integrating by parts the left-hand scalar product, moving one differentiation from the unknown function u to the test function v. This is often useful when a piecewise-linear approximation is employed and D contains second-order differential operators (which cannot be applied to piecewise-linear functions). Eq. (19) is verified for all v ∈ H_n if and only if it is verified for every vector of a basis of H_n; that is, (19) is equivalent to

⟨D[u] − f, b_j⟩ = 0,  j = 1, …, n.  (20)

If D is linear, from (20) one can easily derive the linear system in the α_i

Σ_i ⟨D[b_i], b_j⟩ α_i = ⟨f, b_j⟩,  j = 1, …, n.  (21)

Finally, it is worth citing the method of weighted residuals, which can be seen as a generalization of the Galerkin method. The idea is that instead of asking the residual to be orthogonal to the space H_n used to approximate u, we ask the residual to be orthogonal to a different n-dimensional space K_n = span{β_1, …, β_n}, where {β_i}_{i=1}^n is clearly a basis of K_n.
One obtains

⟨D[u] − f, β_j⟩ = 0,  j = 1, …, n.  (22)

It is worth observing that from the weighted residual method, the Galerkin and least-squares methods can be derived by a suitable choice of the β_i; even the collocation method can be derived if we allow β_i to be a delta function (so that the scalar product needs to be interpreted as a distribution pairing). Moreover, since for every v the map x ↦ ⟨x, v⟩ is a functional, it is easy to recognize that every method can be considered a generalized collocation method, as described in Remark 2.1.
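The Galerkin recipe can be sketched concretely for −u″ = f on [0, 1], u(0) = u(1) = 0, using piecewise-linear "hat" basis functions as an illustrative choice of H_n. After integration by parts, ⟨D[b_i], b_j⟩ becomes the stiffness integral ∫ b_i′ b_j′ dx, which for hats on a uniform grid is the classic tridiagonal (2, −1)/h matrix; the load integrals ⟨f, b_j⟩ are approximated here by mass lumping:

```python
import numpy as np

n = 100
h = 1.0 / (n + 1)
x = np.linspace(h, 1 - h, n)                      # hat-function nodes
# Stiffness matrix K[i, j] = integral of b_i' b_j' dx for hat functions
K = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h
f = lambda t: np.pi**2 * np.sin(np.pi * t)
F = h * f(x)                    # <f, b_j> approximated by lumping
alpha = np.linalg.solve(K, F)   # for hats, alpha_i = u_n(x_i)
err = np.max(np.abs(alpha - np.sin(np.pi * x)))   # exact solution sin(pi x)
```

For this particular problem and grid, the Galerkin system coincides with the finite-difference one, a well-known coincidence; the Galerkin derivation, however, carries over unchanged to nonuniform grids and higher-order elements.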

Wavelets
The idea of multiresolution analysis is to approximate vectors of L²(ℝ) with variable degrees of resolution. This is achieved through a multiresolution analysis scheme defined by means of some axioms. The first axiom is the existence of a sequence {V_n}_{n∈ℤ} of subspaces of L²(ℝ) nested one inside the other, that is,

⋯ ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ V_2 ⊂ ⋯  (23)

The idea is that if one approximates (in a least-squares sense) a function f with vectors belonging to V_n, the approximation error gets smaller as n increases, since every vector of V_n also belongs to V_{n+1}. Note, however, that (23) does not guarantee that we will be able to approximate f with an error as small as desired; in order to guarantee this, we need another axiom,

closure(⋃_{n∈ℤ} V_n) = L²(ℝ),  (24)

where the closure is taken in the topology induced by the norm on L²(ℝ). Axiom (24) requires every vector of L²(ℝ) to be in the closure of the union on the left-hand side; this means that, given any ϵ > 0 and f ∈ L²(ℝ), it is possible to find an element of the union whose distance from f is less than ϵ. In other words, (24) means that whatever f ∈ L²(ℝ) and whatever the chosen maximum allowed approximation error ϵ, one can find a space V_n that approximates f with the required precision.
An axiom dual to (24) is

⋂_{n∈ℤ} V_n = {0},  (25)

which requires that there be only one "lowest-resolution vector," that is, the null vector.
In order to see that axiom (24) is not obvious, it is convenient to work with the Hilbert space L²([0, 1]). Recall that the functions x ↦ cos(2πnx), x ↦ sin(2πnx), n ∈ ℕ, together with the constant 1, are an orthogonal basis of L²([0, 1]). Define S_0 as the set of all the even-numbered sines, and define V_0 as the space generated by S_0. Now define the spaces V_n, n < 0, by removing one vector at a time from the basis of V_0, and the spaces V_n, n > 0, by adding one odd harmonic at a time. It is clear that the sequence of spaces defined in this way satisfies axiom (23), but not (24), since, for example, the function cos(2πt) is orthogonal to every V_n. Note that this construction can be repeated for any Hilbert space, using an orthonormal basis of the space instead of sines and cosines.

Another axiom makes more precise the idea of "increasing resolution" by asking that the vectors in V_n vary twice as fast as the vectors in V_{n−1}. In order to make this precise, define the operator S : L²(ℝ) → L²(ℝ) as (Sf)(t) = √2 f(2t) (the factor √2 makes S unitary), and require that V_{n+1} = S V_n. It follows that V_n = S^n V_0. With this position one can interpret (25) by saying that the slowest function is the constant (and the only constant in L²(ℝ) is the zero function).
The last axiom puts a constraint on the structure of V_0 by asking that it be generated by a function ϕ and its translations. In order to make this precise, define the translation operator τ_t as (τ_t f)(x) = f(x − t). Note that τ_t is also unitary and that the exponential-like notation is convenient since τ_a τ_b = τ_{a+b}. Observe also the commutation relation S τ_t = τ_{t/2} S. The last axiom can be written as

V_0 = closure(span{τ_i ϕ : i ∈ ℤ}).  (30)

Often it is also required, as part of the axioms, that ϕ be orthogonal to its translations, that is,

⟨ϕ, τ_i ϕ⟩ = δ_i,  i ∈ ℤ.  (31)

However, it is not necessary to include (31) explicitly in the axioms since, given a ϕ that satisfies (30), it is possible to orthonormalize it, so that it satisfies (31), with a well-known "Fourier trick" [3]. Therefore, we will suppose (31) satisfied.
It is worth summarizing the axioms here:

⋯ ⊂ V_{−1} ⊂ V_0 ⊂ V_1 ⊂ ⋯  (32)
closure(⋃_{n∈ℤ} V_n) = L²(ℝ)  (33)
⋂_{n∈ℤ} V_n = {0}  (34)
V_{n+1} = S V_n  (35)
V_0 = closure(span{τ_i ϕ : i ∈ ℤ})  (36)

The axioms above allow us to determine a property of ϕ. Note that ϕ ∈ V_0 ⊂ V_1 and that, since S is unitary and maps V_0 onto V_1, the set {τ_{i/2} S ϕ}_{i∈ℤ} is an orthonormal basis of V_1. It follows that one can write ϕ as a linear combination of Sϕ and its half-integer translations, that is,

ϕ = Σ_{i∈ℤ} g_i τ_{i/2} S ϕ  (37)

for some sequence g : ℤ → ℝ. Eq. (37) is known as the two-scale equation, and it is central to wavelet theory. Function ϕ is known as the scaling function.
Note that from the orthonormality of {τ_{i/2} S ϕ}_{i∈ℤ} and from (31) it follows that

⟨g, g(· − 2k)⟩ = Σ_{i∈ℤ} g_i g_{i−2k} = δ_k,  (38)

where the left-hand scalar product is the usual scalar product in ℓ²(ℤ).

Remark 3.3. Note that, starting from a ϕ that satisfies a two-scale equation like (37), it is possible to recover a full multiresolution analysis. Indeed, one defines V_0 according to (36) and V_n by repeated applications of (35). The two-scale Eq. (37) grants that the nesting axiom (32) is satisfied.
Note also that (37) shows that ϕ is a fixed point of the operator

O f = Σ_{i∈ℤ} g_i τ_{i/2} S f.  (39)

This suggests that maybe one could start from a sequence g_i and apply O repeatedly to a vector of L²(ℝ) in order to obtain ϕ. This is indeed possible, but the theoretical details are out of scope here; see [5].
Since V_{n+1} ⊃ V_n, one can consider the orthogonal complement of V_n in V_{n+1}; call it W_n, that is,

V_{n+1} = V_n ⊕ W_n.  (40)

It is possible to find, starting from the two-scale Eq. (37), a function ψ such that {τ_i ψ}_{i∈ℤ} is an orthonormal basis of W_0. Since ψ ∈ W_0 ⊂ V_1, it must be

ψ = Σ_{i∈ℤ} h_i τ_{i/2} S ϕ  (41)

for some sequence h_i, and, for all k ∈ ℤ,

⟨ψ, τ_k ψ⟩ = δ_k,  (42)
⟨ψ, τ_k ϕ⟩ = 0.  (43)

By using (41) and the orthonormality of the τ_{i/2} S ϕ, it is possible to rewrite (42) and (43) as conditions on h_i, namely,

Σ_{i∈ℤ} h_i h_{i−2k} = δ_k,  (44)
Σ_{i∈ℤ} h_i g_{i−2k} = 0.  (45)

It is easy to verify that, given g_i, a possible h_i that satisfies (44) and (45) is

h_i = (−1)^i g_{−i+1}.  (46)

This shows that the sequences h_i and g_i are the impulse responses of a two-channel orthogonal filter bank. Moreover, if g_i and ϕ are known, one can obtain ψ by choosing h_i according to (46) and computing ψ according to (41). Function ψ is known as the wavelet, and it generates the whole of L²(ℝ) with its translations and dilations.
This has also another interesting consequence. Suppose f ∈ L²(ℝ) and that

β_i = ⟨f, τ_{i/2} S ϕ⟩  (47)

are the coefficients of its projection on V_1. Suppose we need the coefficients γ of the projection on V_0. It is possible to exploit the two-scale equation to obtain

γ_k = ⟨f, τ_k ϕ⟩ = Σ_{i∈ℤ} g_{i−2k} β_i = (β * g⁻)_{2k},  (48)

where g⁻ is the time-reversed version of g_i. Eq. (48) shows that it is possible to go to the space at lower resolution by means of a filtering by g⁻ and a decimation by a factor of two. Similarly, calling η the coefficients of the projection of f on W_0, one obtains

η_k = Σ_{i∈ℤ} h_{i−2k} β_i = (β * h⁻)_{2k}.  (49)

Figure 1a shows this idea: the sequence of high-resolution coefficients is processed with a two-channel orthogonal filter bank; the coefficients relative to the lower-resolution space V_n exit from one branch, and the coefficients relative to the "missing details" space W_n exit from the other. The idea can be iterated several times; see Figure 1b. This is the basis of the well-known fast algorithm for computing wavelet coefficients, and also the origin of the minor, and very common, misnomer of calling Figure 1b a "discrete-time wavelet transform."

An interesting characteristic of wavelets is that they can be used to detect the local regularity of a function. This is similar to what happens with the Fourier transform, where a discontinuous function has a Fourier transform that decays as 1/ω; if the function is continuous but not differentiable, its Fourier transform decays as 1/ω², and so on. Something similar happens with the wavelet transform, with the scale playing the role of frequency. The interesting difference is that while a Fourier transform that decays as 1/ω tells us that there is at least one discontinuity, but not where, with the wavelet transform the slow decay with the scale is localized around the discontinuity. The precise statement of this property requires the introduction of the concept of Lipschitz regularity and would take us too far; see [6].
This suggests that when approximating the unknown function in a PDE, we can keep high-resolution coefficients only in the neighborhood of singularities, saving on computational effort.
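One analysis/synthesis level of the filter bank of Figure 1a can be sketched directly for the Haar filters g = (1, 1)/√2, h = (1, −1)/√2, for which "filter by the time-reversed filter, then decimate by two" reduces to pairwise sums and differences:

```python
import numpy as np

def haar_analysis(beta):
    """One filter-bank level: V_1 coefficients -> (V_0, W_0) coefficients."""
    gamma = (beta[0::2] + beta[1::2]) / np.sqrt(2.0)   # low-pass branch
    eta = (beta[0::2] - beta[1::2]) / np.sqrt(2.0)     # detail branch
    return gamma, eta

def haar_synthesis(gamma, eta):
    """Inverse of haar_analysis."""
    beta = np.empty(2 * len(gamma))
    beta[0::2] = (gamma + eta) / np.sqrt(2.0)
    beta[1::2] = (gamma - eta) / np.sqrt(2.0)
    return beta

beta = np.array([4.0, 2.0, 5.0, 5.0, 1.0, 1.0, 1.0, 1.0])
gamma, eta = haar_analysis(beta)
rec = haar_synthesis(gamma, eta)
```

Note how eta is nonzero only for the first pair, the only place where the signal varies within a pair: locally constant stretches produce exactly zero detail coefficients, which is the sparsity exploited above.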
We will say that the wavelet ψ has ℓ vanishing moments if

∫ x^m ψ(x) dx = 0,  m = 0, 1, …, ℓ − 1.  (50)

An interesting property of compactly supported wavelets with ℓ vanishing moments is that the corresponding scaling function (not the wavelet itself) can reproduce polynomials of degree at most ℓ − 1, in the sense that if P(x) is a polynomial with degree less than ℓ, there exist coefficients c_i such that

P(x) = Σ_{i∈ℤ} c_i ϕ(x − i).  (51)

In other words, V_0 contains all the polynomials of degree less than ℓ.

Example 3.1 (Haar wavelet). The simplest example of wavelet is the Haar wavelet, whose scaling function is the indicator of the unit interval,

ϕ_H(x) = 1 if 0 ≤ x < 1, and 0 otherwise.  (52)

It is immediate to verify that ϕ_H satisfies the two-scale equation

ϕ_H(x) = ϕ_H(2x) + ϕ_H(2x − 1),  (53)

that is, g_0 = g_1 = 1/√2 in (37). In order to create the corresponding wavelet, use prescription h_i = (−1)^i g_{−i+1} to get

ψ_H(x) = ϕ_H(2x) − ϕ_H(2x − 1).  (54)

Note that the Haar wavelet is compactly supported, but it is discontinuous. This makes it not well suited to approximating smooth functions.
Example 3.2 (Sinc wavelet). An example in some sense opposite to the Haar wavelet is the Sinc wavelet. In this case V_0 is the space of "low-pass" functions, that is, functions whose Fourier transform is zero outside the interval [−π, π]. As is well known, V_0 is generated by the Sinc function sinc(x) = sin(πx)/(πx) and its translations. This suggests using ϕ_S = sinc as the scaling function. The fact that a two-scale equation is satisfied is easily checked in the frequency domain, since V_1 is the space of functions whose Fourier transform is zero outside [−2π, 2π]; therefore, every function of V_0 is contained in V_1, as desired.
The corresponding wavelet ψ_S is easily characterized in frequency as the function whose Fourier transform is (up to a phase factor) equal to 1 for π ≤ |ω| ≤ 2π and zero otherwise. It is easy to verify that ψ_S ∈ V_1, ψ_S ⊥ V_0, and ⟨τ_k ψ_S, ψ_S⟩ = δ_k. As said above, the Sinc example is somehow the opposite of the Haar wavelet: it is arbitrarily differentiable, but it has infinite support; actually, it decays very slowly (as O(1/x)), and this introduces several practical issues. Moreover, the sequences g_i and h_i have infinite length, and they decay slowly too (they do not even have a z-transform), making them difficult to implement.
It is easy to see that V_0 can also be chosen as a spline space, generated by the translations of ϕ = rect^{*d}, that is, the rect convolved with itself d times. Function rect^{*d} has compact support, but it is not orthogonal to its own translations. It can be orthogonalized with the Fourier trick, but the result has no compact support. For more details about this case, see [3].

Compactly supported wavelets: Daubechies' wavelets
In the examples above, we found multiresolution analyses whose scaling functions had at most two out of the following three desirable characteristics: orthogonality, smoothness, and compact support. Is it possible to find a wavelet that has all three characteristics? The answer was given by Daubechies. It turns out that imposing all three characteristics is very demanding, and only a small family of wavelets exists.
An easy observation is that if ϕ is orthogonal to its own translations, the coefficients g_i in the two-scale equation can be obtained as

g_i = ⟨ϕ, τ_{i/2} S ϕ⟩.  (60)

According to (60), if ϕ has compact support, then g_i has a finite number of coefficients different from zero. Since g_i needs to be orthogonal to its even translations, its length (i.e., the number of nonzero coefficients) must necessarily be even [3].
Moreover, if g_i has a finite number of nonzero coefficients and the iteration of operator O in (39) converges, it is easy to see that the limit function has compact support. This suggests that it "suffices" to find a finite-length sequence g_i that is orthogonal to its own even translations and to iterate operator O to obtain the desired scaling function. It actually turns out that this can be done, although there are lots of technical details to be taken care of (e.g., about the convergence of O^k and the smoothness of the resulting ϕ); see [5] for details.
Every member of the Daubechies family is identified by the length 2N of the sequence g_i (remember that the length of g_i is necessarily even). It can be proven that the resulting wavelet has N vanishing moments and that its smoothness grows with N; see Table 1. See also Figure 2, which shows the results of the first three iterations of O (first row), the final scaling function (second row), and the wavelet (third row) for three different Daubechies wavelets.
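The shortest member, usually called D4 (2N = 4, so N = 2), makes these claims easy to check numerically. The sketch below takes the classic D4 scaling coefficients, verifies the orthogonality and vanishing-moment conditions on the filters, and approximates ϕ by the cascade iteration, i.e., by repeatedly convolving with upsampled copies of g (a discrete counterpart of iterating the operator O of (39)); the "alternating flip" used for h is an even shift of prescription (46), which preserves the orthogonality conditions:

```python
import numpy as np

s3 = np.sqrt(3.0)
# Classic Daubechies D4 scaling filter
g = np.array([1 + s3, 3 + s3, 3 - s3, 1 - s3]) / (4 * np.sqrt(2.0))
# Wavelet filter by alternating flip: h_i = (-1)^i g_{3-i}
h = (-1.0) ** np.arange(4) * g[::-1]

def cascade(g, levels):
    """Approximate samples phi(k / 2^levels) by iterated upsampled convolution."""
    a = np.asarray(g, dtype=float)
    for _ in range(levels - 1):
        up = np.zeros(2 * len(a) - 1)
        up[::2] = a                     # upsample the current iterate by 2
        a = np.convolve(up, g)          # convolve with the scaling filter
    return 2.0 ** (levels / 2.0) * a    # rescale to approximate phi

phi = cascade(g, 8)
mass = phi.sum() / 2.0 ** 8             # Riemann sum for the integral of phi
```

The checks in the accompanying assertions confirm: unit norm and orthogonality of g to its even translates, cross-orthogonality of g and h, zeroth and first vanishing moments of h (N = 2), the normalization Σg_i = √2 implied by the two-scale equation, and ∫ϕ = 1 from the cascade samples.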

Extensions
The construction given above is the original idea of multiresolution analysis. Since the early 1990s, many researchers have worked in this field, and many variations and extensions have been introduced. Here we briefly recall those that are of more interest in the field of differential equation solution.

Multiwavelets
Multiwavelets are a generalization of standard multiresolution analysis in the sense that scaling functions and wavelets are now vectors of functions. This means that V_n is generated not by the translations of a single function but by the translations of several functions. Every idea of standard multiresolution analysis can be reformulated without much difficulty in this case, the most notable difference being that the two-scale equation now involves a vector function and coefficients that are matrices. Multiwavelets can accommodate scaling factors different from two, and there is a larger choice of compactly supported wavelets. See [7] for more details.

Second-generation wavelets
Two-scale Eq. (37) and the resulting filter bank-based procedure work well when the data are sampled on a regular grid and/or the functions of interest are defined on ℝ^d. Since there are many applications that do not satisfy this requirement (e.g., differential equations on general manifolds), the idea of second-generation wavelets has been introduced.
The starting point is the so-called lifting form of a filter bank. It is possible to show that any two-channel filter bank (Figure 1) can be implemented as shown in Figure 3. In the lifting approach, the input signal is split into odd and even samples by a serial-to-parallel converter. The first branch is filtered, and the result is combined with the other branch; the result of this operation is filtered again and combined with the first branch, and this is iterated as long as necessary. Filter P is sometimes called the prediction step, and it is interpreted as a filter that predicts the odd samples from the even ones; filter U is sometimes called the update step.
The advantage of this form is that, being similar to the Feistel structure used in cryptography [8], it is exactly invertible even if the operations are implemented in fixed-point arithmetic. Actually, the invertibility does not depend on the details of P_i and U_i, which can be anything, even nonlinear. Another interesting advantage of this predict/update idea is that it does not require a regular domain, allowing the wavelet concept to be brought to more general contexts. For example, [9] uses this idea to solve differential equations on the sphere.
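The exact-invertibility claim is easy to demonstrate. Below is a minimal sketch of a Haar-like lifting step in integer arithmetic: predict each odd sample from its even neighbor, then update the evens; the inverse simply runs the two steps backwards with the signs flipped, so reconstruction is exact despite the (lossy-looking) integer division in the update:

```python
import numpy as np

def lifting_forward(x):
    even, odd = x[0::2].copy(), x[1::2].copy()
    d = odd - even                 # predict: odd ~ even; d is the detail
    s = even + d // 2              # update: s ~ pairwise average
    return s, d

def lifting_inverse(s, d):
    even = s - d // 2              # undo the update (same d, same rounding)
    odd = d + even                 # undo the prediction
    x = np.empty(2 * len(s), dtype=s.dtype)
    x[0::2], x[1::2] = even, odd
    return x

x = np.array([5, 7, 3, 1, 8, 8, 2, 4])
s, d = lifting_forward(x)
rec = lifting_inverse(s, d)
```

The rounding error introduced by `d // 2` never matters because the inverse subtracts exactly the same rounded quantity; this is the Feistel-like structure at work, and it would hold just as well if the predict/update operators were nonlinear.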

Some examples of application of wavelets to PDE
By now it should be clear how multiresolution analysis can be applied to the solution of differential equations: by using scaling functions and/or wavelets as basis functions in approximation (6). All the approaches described in Section 2 can be used with wavelets: collocation, Galerkin, weighted residuals, meshless methods, etc. Before describing some details of a few approaches found in the literature, it is worth making some general remarks.
What makes wavelets interesting is their multiresolution property and the fact that a wisely chosen wavelet (smooth and/or with many vanishing moments) has interesting "singularity-sensing" properties: in the neighborhood of a singularity (discontinuity, nondifferentiability, etc.), the coefficients decay as a function of scale with a speed that depends on the singularity involved (similar to what the Fourier transform does, only on a local level), but away from the singularity, they decay fast [3]. This implies that good approximations can be obtained with few coefficients, using a high-resolution decomposition only where it is necessary, reducing the size of the matrices involved in the solution of the PDE. A similar effect can be obtained, for example, in FEM by using a finer mesh around points of large variation. However, using this approach in an adaptive way would require adjusting the mesh at run time, a potentially heavy operation. Wavelets have the potential of employing an adaptive resolution in an easier way. See, for example, [10] for a few examples of adaptive techniques employing wavelets.
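The localization claim can be illustrated with a toy experiment (Haar details of a sampled signal, an illustrative stand-in for a proper wavelet transform): for a signal that is smooth except for a single jump, the detail coefficients are negligible everywhere except at the coefficient straddling the jump.

```python
import numpy as np

t = np.arange(512) / 512.0
sig = np.sin(2 * np.pi * t) + (t > 0.7)          # smooth plus one jump at t = 0.7

# Finest-level Haar detail coefficients (pairwise differences, normalized)
d = (sig[0::2] - sig[1::2]) / np.sqrt(2.0)

big = np.abs(d) > 0.02                           # "significant" coefficients
# Of the 256 detail coefficients, only the one containing the jump survives
```

The smooth part contributes details of size about 2π/512/√2 ≈ 0.009, while the jump contributes about 1/√2 ≈ 0.7; thresholding thus isolates the singularity, which is exactly the mechanism adaptive wavelet solvers exploit.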
While orthogonality is considered an important feature in many theoretical wavelet papers, in the context of differential equation solution it plays a smaller role. The reason is that the basis functions enter the scalar products associated with the various methods via the differential operator D, and it is not guaranteed that D will preserve orthogonality (which would give rise to many zero entries, that is, to sparser matrices).
Actually, orthogonality is preserved if the two basis functions have disjoint supports in space (since differential operators do not extend the support) or in frequency (since differential operators are translation-invariant and in frequency they act as a product). This suggests that in the context of differential equations, compact support and good localization in frequency are more important than orthogonality alone. In a sense, they represent a "robust" orthogonality condition.
True compact support in frequency is, however, less common than compact support in space. With the exception of a few very special and theoretical cases (e.g., Sinc), the best we can get is a rapid decay in frequency. This means that the scalar product of two basis functions separated in frequency will be very small, but not zero. Nevertheless, even this kind of "almost sparseness" can be exploited.
A general issue with wavelets is that it can be difficult to impose boundary conditions, since wavelets have no natural interpolation property that would make boundary condition handling simpler. Another problem is that many wavelets have no closed-form description; for example, Daubechies wavelets are described as the result of the iteration of operator O in (39). This can make their application to PDEs more difficult, for example, when computing the scalar products involved in weighted residual and other methods.
Finally, another common issue is that most of the known wavelets are defined on a one-dimensional domain, while many physical systems live on a multidimensional domain. The easiest way to create a multidimensional multiresolution analysis from a one-dimensional one is the separable (or tensor product, or Kronecker product) approach, which creates a multidimensional function as the product of several one-dimensional ones, e.g., Φ(x, y) = ϕ(x) ϕ(y). This kind of approach, however, produces "cube-like" wavelets, and their application to FEM schemes based on triangular elements can be difficult.
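The separable construction corresponds, at the filter-bank level, to applying the 1D analysis along each axis in turn. A minimal Haar sketch of one 2D level, producing the four standard subbands (LL: low-pass along both axes; LH/HL: mixed; HH: high-pass along both):

```python
import numpy as np

def haar_pair(a, axis):
    """One 1D Haar analysis step (low, high) along the given axis."""
    ev = a.take(np.arange(0, a.shape[axis], 2), axis=axis)
    od = a.take(np.arange(1, a.shape[axis], 2), axis=axis)
    return (ev + od) / np.sqrt(2.0), (ev - od) / np.sqrt(2.0)

img = np.arange(16.0).reshape(4, 4)     # a linear-ramp "image"
lo, hi = haar_pair(img, axis=1)         # transform the rows
LL, LH = haar_pair(lo, axis=0)          # then the columns of each branch
HL, HH = haar_pair(hi, axis=0)
```

Because the transform is orthogonal, the energy of the image is exactly redistributed among the four subbands; for this ramp image the HH band is identically zero, an instance of the "cube-like" (axis-aligned) behavior of separable wavelets.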

Some schemes from the literature
In this section we briefly summarize some interesting wavelet-based schemes found in the literature. As noted above, wavelets and scaling functions can be used as basis functions in the approximations employed by collocation, weighted residual, and other methods.
Wavelets in Galerkin and weighted residual methods bring the advantage of their multiresolution and localization properties, while suffering from difficulties in handling complex boundary conditions. Moreover, nonlinear equations can turn out to be difficult to handle. Nevertheless, there have been many successful applications to elliptic, hyperbolic, and parabolic PDEs [11][12][13][14][15][16][17][18][19][20][21][22][23][24][25][26][27]. Wavelet-based collocation methods, where wavelet functions are used as shape functions, have also registered some success. The advantage of collocation methods is that they are more easily applicable to nonlinear problems [28,29] and irregular boundary conditions [30]. A collocation method based on second-generation wavelets and lifting is applied to a nonlinear vibration problem in [30,31].
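To make the weighted residual idea concrete, here is a minimal Galerkin sketch for the model problem −u″ = f on [0, 1] with homogeneous boundary conditions. Hat functions stand in for the wavelet basis (an assumption of this example), but the assembly steps, stiffness matrix from derivative overlaps and load vector from scalar products with f, are the same for any choice of shape functions:

```python
import numpy as np

# Galerkin discretization of -u'' = f on [0, 1], u(0) = u(1) = 0.
n = 31                      # number of interior basis functions
h = 1.0 / (n + 1)
nodes = np.arange(1, n + 1) * h

f = lambda x: np.pi**2 * np.sin(np.pi * x)   # exact solution: sin(pi x)

# Stiffness matrix A[i, j] = integral of phi_i' * phi_j' (tridiagonal
# for hat functions, computed here in closed form)
A = (np.diag(np.full(n, 2.0)) + np.diag(np.full(n - 1, -1.0), 1)
     + np.diag(np.full(n - 1, -1.0), -1)) / h

# Load vector b[i] = integral of f * phi_i (one-point quadrature)
b = f(nodes) * h

u = np.linalg.solve(A, b)
err = np.max(np.abs(u - np.sin(np.pi * nodes)))
print(f"max nodal error: {err:.2e}")
```

The nodal values converge at second order in h; refining the mesh (or, with a wavelet basis, adding finer-scale functions only where needed) reduces the error accordingly.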
Much more popular seems to be the application of wavelets to FEM techniques. In this case wavelets or scaling functions are used as shape functions instead of the more traditional polynomials. Daubechies wavelets are particularly popular, most probably because of their compact support. Also of interest is the fact that Daubechies wavelets can have any number of vanishing moments, making the exact reproduction of polynomials possible. Some examples of successful applications of Daubechies wavelets to PDEs (mostly mechanical problems) are [32][33][34][35][36]. Of special interest is the proposal of Mitra [37], where a wavelet-based FEM is used to transform a wave propagation problem into ordinary differential equations that are subsequently solved.
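The vanishing-moment property is easy to check numerically. The sketch below uses the classical four-tap Daubechies filter (db2/D4), whose highpass filter annihilates constant and linear sequences, the discrete counterpart of reproducing polynomials up to degree 1:

```python
import numpy as np

# Daubechies D4 (db2) lowpass filter, in closed form.
s = np.sqrt(3.0)
h = np.array([1 + s, 3 + s, 3 - s, 1 - s]) / (4 * np.sqrt(2.0))
g = np.array([h[3], -h[2], h[1], -h[0]])   # highpass (wavelet) filter

k = np.arange(4)
print(np.sum(g))          # moment 0: ~0 (annihilates constants)
print(np.sum(k * g))      # moment 1: ~0 (annihilates linear sequences)
print(np.sum(k**2 * g))   # moment 2: nonzero (db2 has only two moments)
```

Higher-order Daubechies filters extend this pattern: dbN annihilates polynomial sequences up to degree N − 1, at the price of a longer support.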
Another popular choice for wavelet-based FEM is wavelets built on spline spaces. Although spline bases cannot have both compact support and orthogonality, in differential equations, as explained above, we gladly give up orthogonality if we can get compact support and smoothness. Another important advantage of splines is that a simple closed-form expression is known for them. Examples of spline applications can be found in [38][39][40]. Of special interest is the application of Hermite cubic splines (HCS), a kind of multiwavelet [41] that shows promise in handling boundary conditions in a numerically robust way. The HCS is a multiwavelet with four smooth (continuously differentiable) components defined on the interval [0, 1]. Some examples of application can be found in [42][43][44]. A problem with the application of wavelets to FEM techniques is the difficulty of adapting the wavelet construction to complex meshes. In this case the use of second-generation wavelets based on an extension of the lifting idea has attracted some attention [45][46][47].
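As a sketch of why Hermite-type bases make boundary conditions easy to handle, the code below evaluates the two standard cubic Hermite generators (the exact components used in [41] may differ; this is the textbook form on [−1, 1]): one interpolates function values at the nodes, the other interpolates derivatives, so end conditions on u and u′ translate directly into coefficient values.

```python
import numpy as np

# The two standard cubic Hermite generators, supported on [-1, 1].
def phi_value(x):                 # interpolates function values
    a = np.abs(x)
    return np.where(a <= 1, (1 - a) ** 2 * (1 + 2 * a), 0.0)

def phi_slope(x):                 # interpolates derivatives
    a = np.abs(x)
    return np.where(a <= 1, x * (1 - a) ** 2, 0.0)

# Interpolation property at the nodes: phi_value is 1 at its own node and
# 0 at the neighbors; phi_slope is 0 at every node but has unit slope at 0.
print(phi_value(0.0), phi_value(1.0))   # 1.0 at the center, 0.0 at the node
eps = 1e-6
print((phi_slope(eps) - phi_slope(-eps)) / (2 * eps))  # slope ~ 1 at 0
```

Both generators are continuously differentiable across the nodes, which is what gives the basis its smoothness while keeping compact support.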
Wavelets have attracted some interest also in the context of meshless methods [9,48,49]. Of particular interest is [9], which uses a wavelet approach to implement a meshless solver for differential equations defined on the sphere. A problem with applying wavelets to generic manifolds like the sphere is that it is not clear what a "rescaling by 2" should mean on a manifold that is not a Euclidean space. The idea used in [9] is a so-called diffusion wavelet, where the dilation is replaced by a diffusion operator that acts as a kind of "low-pass filtering" smearing out the details; see [9] for the precise definition.

Conclusions
This chapter introduced the reader to the field of applying wavelets to the numerical solution of differential equations. Both wavelets and differential equations are research fields with many applications, contributions, and results. Their combination gives rise to a wide variety of methods, each one suited to specific applications. Looking at the literature, we can see that wavelets can be a very powerful tool for solving PDEs, especially because their multiresolution nature allows the level of detail to be optimized where it is needed. Wavelets, however, are not a silver bullet, since some of their characteristics (multidimensional construction via tensor products, the absence of a closed-form expression, difficulty in handling some boundary conditions, etc.) can make their application nontrivial in some cases. We can say that this is a field where, more than ever, no single solution fits all, and every practitioner needs to find the solution specific to the problem at hand using knowledge in both fields and some ingenuity.

Author details
Riccardo Bernardini
University of Udine, Udine, Italy
*Address all correspondence to: riccardo.bernardini@uniud.it
© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.