The Wavelet Transform for Image Processing Applications



Introduction
In recent years, the wavelet transform has emerged in the field of image and signal processing as an alternative to the well-known Fourier Transform (FT) and its related transforms, namely the Discrete Cosine Transform (DCT) and the Discrete Sine Transform (DST). In Fourier theory, a signal (an image is considered a finite 2-D signal) is expressed as a theoretically infinite sum of sines and cosines, which makes the FT suitable for the analysis of infinite and periodic signals. For several years the FT dominated the field of signal processing. However, although it succeeded in providing the frequency information contained in the analysed signal, it failed to give any information about the time at which those frequencies occur. This shortcoming, which is not the only one, motivated scientists to scan the transform horizon for a "messiah" transform. The first step in this long research journey was to cut the signal of interest into several parts and then to analyse each part separately. At first glance the idea seemed very promising, since it allowed the extraction of time information and the localisation of the different frequency components. This approach is known as the Short-Time Fourier Transform (STFT). The fundamental question that arises here is how to cut the signal. The best solution to this dilemma was to find a fully scalable modulated window for which no signal cutting is needed. This goal was achieved successfully by the wavelet transform.
Formally, the wavelet transform is defined by many authors as a mathematical technique in which a signal is analysed (or synthesised) in the time domain by using dilated (or contracted) and translated (or shifted) versions of a basis function called the wavelet prototype or the mother wavelet. In reality, however, the wavelet transform emerged from several disciplines and was not, as stated by Mallat, totally new to mathematicians working in harmonic analysis, or to computer vision researchers studying multiscale image processing (Mallat, 1989).

Continuous Wavelet Transform
Different ways of introducing the wavelet transform can be envisaged (Starck et al., 1998). However, the traditional route remains Fourier theory (more precisely, the STFT). Fourier theory uses sines and cosines as basis functions to analyse a signal. Because these basis functions extend infinitely, the FT is more appropriate for signals of the same nature, which are generally assumed to be periodic. Hence, Fourier theory is purely a frequency-domain approach, which means that a signal $f(t)$ can be represented by its frequency spectrum $F(\omega)$ as follows:

$$F(\omega) = \int_{-\infty}^{+\infty} f(t)\, e^{-j\omega t}\, dt \tag{1}$$

The original signal can be recovered, under certain conditions, by the inverse Fourier Transform as follows:

$$f(t) = \frac{1}{2\pi} \int_{-\infty}^{+\infty} F(\omega)\, e^{j\omega t}\, d\omega \tag{2}$$

Discrete-time versions of both the direct and inverse forms of the Fourier transform also exist.
Because the basis functions of Fourier analysis are non-local and time-independent, as shown by the exponential factor of equation (1), the FT only suits signals with "time-independent" statistical properties. In other words, the FT can only provide global information about a signal and fails to deal with local patterns such as discontinuities or sharp spikes (Graps, 1995). However, in many applications the signal of concern is both time and frequency dependent, and Fourier theory is then "incapable" of providing a global and complete analysis. These shortcomings, together with the failure of the Fourier transform to deal with non-periodic signals, led the scientific community to adopt a windowed version of the transform known as the STFT. The STFT of a signal $f(t)$ is defined around a time $\tau$ through the use of a sliding window $w$ (centred at time $\tau$) and a frequency $\omega$ as (Wickerhauser, 1994; Graps, 1995; Burrus et al., 1998; David, 2002; Oppenheim & Schafer, 2010):

$$\mathrm{STFT}(\tau, \omega) = \int_{-\infty}^{+\infty} f(t)\, w(t-\tau)\, e^{-j\omega t}\, dt \tag{3}$$

As is apparent from equation (3), even though the integration limits are infinite, the analysis is always limited to the portion of the signal covered by the sliding window. The time-frequency plane of a fixed-window STFT is illustrated in Figure 1.

Fig. 1. Fourier time-frequency plane (Graps, 1995)

Although the STFT succeeds in giving both time and frequency information about a portion of the signal, it has, like its predecessor, a major drawback: the choice of the window size is crucial. As stated by Starck et al. (Starck et al., 1998): "The smaller the window size, the better the time-resolution. However, the smaller the window size also, the more the number of discrete frequencies which can be represented in the frequency domain will be reduced, and therefore the more weakened will be the discrimination potential among frequencies". This problem is closely linked to Heisenberg's uncertainty principle, which states that a signal (e.g. a very short portion of the signal) cannot be represented as a point in the time-frequency domain.
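The window-size trade-off described above can be illustrated with a minimal discrete sketch of the STFT. The rectangular window, the test signal and the function names `stft_magnitude` and `peak_bin` are assumptions made for this illustration only; they do not come from the cited references.

```python
import cmath
import math

def stft_magnitude(signal, window_len, hop):
    """Discrete STFT magnitudes using a rectangular sliding window.

    Returns one list per window position, holding |DFT| for
    frequency bins 0 .. window_len // 2.
    """
    frames = []
    for start in range(0, len(signal) - window_len + 1, hop):
        segment = signal[start:start + window_len]
        spectrum = []
        for k in range(window_len // 2 + 1):
            acc = sum(x * cmath.exp(-2j * math.pi * k * n / window_len)
                      for n, x in enumerate(segment))
            spectrum.append(abs(acc))
        frames.append(spectrum)
    return frames

def peak_bin(spectrum):
    """Index of the strongest frequency bin in one frame."""
    return max(range(len(spectrum)), key=lambda k: spectrum[k])

# Hypothetical test signal: a slow tone in the first half, a faster one after.
N = 128
signal = [math.sin(2 * math.pi * (4 if n < N // 2 else 8) * n / 32)
          for n in range(N)]

long_frames = stft_magnitude(signal, window_len=32, hop=32)
short_frames = stft_magnitude(signal, window_len=8, hop=8)

print(len(long_frames), len(long_frames[0]))    # time positions, frequency bins
print(len(short_frames), len(short_frames[0]))
print([peak_bin(f) for f in long_frames])
```

With these settings the long window yields 4 time positions and 17 frequency bins, whereas the short window yields 16 time positions but only 5 bins: better time localisation is paid for with coarser frequency discrimination.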
This shortcoming raises the fundamental question: how should the sliding window be sized? Not surprisingly, the answer leads, through a few transformations, to the wavelet transform. In fact, grouping the sliding window with the time-dependent exponential $e^{-j\omega t}$ within the integral of equation (3) gives:

$$\psi_{\tau,\omega}(t) = w(t-\tau)\, e^{-j\omega t} \tag{4}$$

Replacing the frequency $\omega$ by a scaling factor $a$, and the window position $\tau$ by a shifting factor $b$, gives the first step towards the Continuous Wavelet Transform (CWT), as represented in equation (5):

$$\psi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \psi\!\left(\frac{t-b}{a}\right) \tag{5}$$

Combining equation (5) with equation (3) leads to the CWT as defined by Morlet and Grossman (Grossman & Morlet, 1984):

$$\mathrm{CWT}(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{+\infty} f(t)\, \psi^{*}\!\left(\frac{t-b}{a}\right) dt \tag{6}$$

where $f(t)$ belongs to the space of square-integrable functions, $L^2(\mathbb{R})$, and $^{*}$ denotes complex conjugation. In the same way, the inverse CWT can be defined as (Grossman & Morlet, 1984):

$$f(t) = \frac{1}{C_\psi} \int_{0}^{+\infty}\!\!\int_{-\infty}^{+\infty} \frac{1}{a^{2}}\, \mathrm{CWT}(a,b)\, \psi_{a,b}(t)\, db\, da \tag{7}$$

The factor $C_\psi$ is needed for reconstruction purposes. In fact, reconstruction is only possible if this factor is defined. This requirement is known as the admissibility condition. More generally, the wavelet $\psi(t)$ used for reconstruction can be replaced by another function, allowing a variety of choices that can enhance certain features for particular applications (Starck et al., 1998; Stromme, 1999; Hankerson et al., 2005). However, the CWT in the form defined by equation (6) is highly redundant, which makes its direct implementation of minor interest. The time-frequency plane of a wavelet transformation is illustrated in Figure 2. The differences with the STFT are visually clear. Although wavelet transforms are described as a mathematical tool or technique, there is no consensus within the scientific community on a particular definition. This "embarrassment" has been stated by Sweldens as follows (Sweldens, 1996): "Giving that the wavelet field keeps growing, the definition of a wavelet continuously changes. Therefore it is impossible to rigorously define a wavelet". According to the same author, for a particular function to be called a wavelet system, it has to fulfil the three following properties:


- Wavelets are building blocks for general functions: they are used to represent signals and, more generally, functions. In other words, a function is represented in the wavelet space by means of an infinite series of wavelets.

- Wavelets have space-frequency localisation: most of the energy of a wavelet is confined to a finite interval, and the transform contains only frequencies from a certain frequency band.

- Wavelets support fast and efficient transform algorithms: this requirement is needed when implementing the transform. Wavelet transforms often need O(n) operations, which means that the number of multiplications and additions grows linearly with the length of the signal. This is a direct implication of the compactness property of the transform. However, more general wavelet transforms require O(n log(n)) operations (e.g. the undecimated wavelet transform).
To refine the wavelet definition, the three following characteristics have been added by Sweldens and Daubechies (Sweldens, 1996; Daubechies, 1992, 1993), as reported in (Burrus et al., 1998):


- Ability to generate lower level coefficients from the higher level coefficients. This can be achieved through the use of a tree-like structured chain of filters called a filter bank.

Multiresolution
The multiresolution concept was first introduced by Mallat (Mallat, 1989). It clearly defines the relationships between Quadrature Mirror Filters (QMF), pyramid algorithms and orthonormal wavelet bases, essentially through the definition of a set of nested subspaces and a so-called scaling function. The strength of multiresolution lies in its ability to decompose a signal into finer and finer details. Most importantly, it allows the description of a signal in terms of time-frequency or time-scale analysis.

Nested subspaces
The basic requirement for multiresolution analysis is the existence of a set of approximation subspaces of $L^2(\mathbb{R})$ (the space of square-integrable functions) with different resolutions, as represented schematically for the three intermediate subspaces in Figure 3 and stated by equation (8):

$$\cdots \subset V_{-2} \subset V_{-1} \subset V_0 \subset V_1 \subset V_2 \subset \cdots \subset L^2(\mathbb{R}) \tag{8}$$

in such a way that, if $f(t) \in V_j$, then $f(t) \in V_{j+1}$. This means that a subspace of higher resolution automatically contains those of lower resolution. More generally, if $f(t) \in V_j$, then $f(2t) \in V_{j+1}$. This implication is known as the scale invariance property.

Scaling function
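The existence of a so-called scaling function $\varphi(t)$ is essential in order to benefit from the multiresolution concept. In this context, let us define the scaling function first and then define the wavelet function through it (Burrus et al., 1998). Let the scaling function be defined by the following equation:

$$\varphi_k(t) = \varphi(t-k), \qquad k \in \mathbb{Z}, \quad \varphi \in L^2(\mathbb{R}) \tag{9}$$

which forms with its translates an orthonormal basis of the space $V_0$ (orthogonality is not strictly necessary, since a non-orthogonal basis with the shift property can always be orthogonalised (Sweldens, 1995)):

$$V_0 = \overline{\mathrm{span}_k \{\varphi_k(t)\}} \tag{10}$$

This means that any function $f(t) \in V_0$ can be expressed as a linear combination of the scaling function and its consecutive translates, weighted by a set of so-called expansion coefficients (since the $\varphi_k(t)$ are the basis functions):

$$f(t) = \sum_{k} c_k\, \varphi_k(t) \tag{11}$$

where the expansion coefficients $c_k$ (or $c(k)$) are calculated using the inner product:

$$c_k = \langle f(t), \varphi_k(t) \rangle = \int f(t)\, \varphi_k(t)\, dt \tag{12}$$

By simply scaling and translating, a two-dimensional (two-parameter) family of scaling functions is generated from the original scaling function defined in equation (9):

$$\varphi_{a,b}(t) = \frac{1}{\sqrt{a}}\, \varphi\!\left(\frac{t-b}{a}\right) \tag{13}$$

where $a$ and $b$ are the scaling and the shifting factors as defined in equation (5), respectively. To ease the implementation of a wavelet system, the scaling and translation factors are chosen as powers of two (Graps, 1995):

$$a = 2^{-j}, \qquad b = 2^{-j} k \tag{14}$$

These values are adopted for the remainder of the chapter. Thus equation (13) can be rewritten as:

$$\varphi_{j,k}(t) = 2^{j/2}\, \varphi(2^{j} t - k) \tag{15}$$

Identically, the two-dimensional scaling function forms with its translates an orthonormal basis over $k$:

$$V_j = \overline{\mathrm{span}_k \{\varphi_{j,k}(t)\}} \tag{16}$$

and as such any function $f(t)$ of this space can be expressed as:

$$f(t) = \sum_{k} c_j(k)\, \varphi_{j,k}(t) \tag{17}$$

In particular, since $V_0 \subset V_1$, the scaling function $\varphi(t)$ itself can be expressed in terms of the scaling function $\varphi(2t)$ of the higher space $V_1$:

$$\varphi(t) = \sum_{k} h(k)\, \sqrt{2}\, \varphi(2t - k), \qquad k \in \mathbb{Z} \tag{18}$$

where the coefficients $h(k)$ are the scaling function coefficients. The factor $\sqrt{2}$ ensures that the norm of the scaling function is always equal to unity. This equation is fundamental to multiresolution theory and is called the multiresolution analysis equation.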
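As a quick numerical illustration of the multiresolution analysis equation, the sketch below checks it for the Haar scaling function, for which the scaling coefficients are $h(0) = h(1) = 1/\sqrt{2}$. The Haar choice and the names `haar_phi` and `refined_phi` are assumptions made for the example; the chapter itself does not single out a particular scaling function at this point.

```python
import math

def haar_phi(t):
    """Haar scaling function: 1 on [0, 1), 0 elsewhere."""
    return 1.0 if 0.0 <= t < 1.0 else 0.0

# Haar scaling coefficients h(k) (assumed example filter)
h = [1.0 / math.sqrt(2.0), 1.0 / math.sqrt(2.0)]

def refined_phi(t):
    """Right-hand side of the multiresolution analysis equation:
    sum over k of h(k) * sqrt(2) * phi(2t - k)."""
    return sum(h[k] * math.sqrt(2.0) * haar_phi(2.0 * t - k)
               for k in range(len(h)))

# Compare both sides on a grid covering and surrounding the support [0, 1)
samples = [i / 16.0 - 0.5 for i in range(32)]
max_error = max(abs(haar_phi(t) - refined_phi(t)) for t in samples)
energy = sum(c * c for c in h)   # should be 1 for a unit-norm scaling function

print(max_error, energy)
```

Both sides agree on the whole grid up to rounding, and the squared coefficients sum to one, which is consistent with the normalising role of the $\sqrt{2}$ factor.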

Wavelet function
What has been done so far to define the scaling function, its translates and the corresponding spanned spaces can be applied in the same way to the so-called wavelet function. Let us suppose for this purpose that the subspace $V_0 \subset V_1$ has an orthogonal complement $W_0$, such that $V_1$ can be represented as a combination of $V_0$ and $W_0$ as follows:

$$V_1 = V_0 \oplus W_0 \tag{19}$$

where the complementary space $W_0$ is also spanned by an orthonormal basis:

$$W_0 = \overline{\mathrm{span}_k \{\psi_k(t)\}}, \qquad \psi_k(t) = \psi(t-k) \tag{20}$$

The function $\psi(t)$ is known as the mother wavelet, the wavelet prototype or the wavelet function. The properties that apply to the scaling function also apply to the wavelet function. In other words, a function $f(t) \in W_0$ can be expressed as:

$$f(t) = \sum_{k} d_k\, \psi_k(t) \tag{21}$$

where the expansion coefficients $d_k$ (or $d(k)$) are calculated using the inner product:

$$d_k = \langle f(t), \psi_k(t) \rangle = \int f(t)\, \psi_k(t)\, dt \tag{22}$$

Likewise, since $W_0 \subset V_1$, $\psi(t)$ can also be expressed in terms of the scaling function $\varphi(2t)$ of the higher space $V_1$:

$$\psi(t) = \sum_{k} g(k)\, \sqrt{2}\, \varphi(2t - k), \qquad k \in \mathbb{Z} \tag{23}$$

where $g(k)$ are the wavelet coefficients. This leads to a dyadic decomposition as represented by the grid of Figure 5. Equation (19) can be generalised to an arbitrary number of subspaces, such that $V_2$ is represented in terms of $V_1$ and $W_1$, $V_3$ in terms of $V_2$ and $W_2$, and so on. The whole decomposition process is illustrated in Figure 4.

Fig. 4. Space decomposition
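More generally, a subspace $V_j$ is spanned by $W_{j-1}$ and $V_{j-1}$. Thus, the $L^2(\mathbb{R})$ space can be decomposed as follows:

$$L^2(\mathbb{R}) = V_j \oplus W_j \oplus W_{j+1} \oplus W_{j+2} \oplus \cdots \tag{24}$$

The index $j$ represents the depth or the level of decomposition, which is arbitrary in this case. As for the scaling function, a two-dimensional scaled and translated wavelet function is defined as:

$$\psi_{j,k}(t) = 2^{j/2}\, \psi(2^{j} t - k) \tag{25}$$

in such a way that:

$$W_j = \overline{\mathrm{span}_k \{\psi_{j,k}(t)\}} \tag{26}$$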

Series expansions and Discrete Wavelet Transforms
According to equation (24), a function $f(t)$ belonging to the $L^2(\mathbb{R})$ space can be expanded in a series in terms of the scaling function spanning the space $V_j$ and the wavelet functions spanning the spaces $W_j, W_{j+1}, W_{j+2}, \ldots$, as follows:

$$f(t) = \sum_{k} c_j(k)\, \varphi_{j,k}(t) + \sum_{n=j}^{\infty} \sum_{k} d_n(k)\, \psi_{n,k}(t) \tag{27}$$

where $\varphi_{j,k}(t)$ is defined by equation (15) and $\psi_{n,k}(t)$ is defined by equation (25). In this case the index $j$, which is arbitrary, represents the coarsest scale, while the remaining terms are the high resolution details. Equation (27) represents the wavelet expansion series of the function $f(t)$, which plays a major role when deriving a more practical form of the wavelet transform.
The coefficients $c_j(k)$ and $d_n(k)$ (or $c(j,k)$ and $d(n,k)$) in the wavelet expansion series are the so-called discrete wavelet transform of the function $f(t)$. Since the basis functions are orthonormal, they can be calculated using equations (12) and (22), respectively. We will see later in this chapter that the orthonormality condition can be relaxed, allowing the implementation of another important basis, namely the biorthogonal basis.

Filter banks and wavelet implementations
In general, wavelet transform-based applications involve discrete coefficients rather than scaling and/or wavelet functions. For practical and computational reasons, discrete-time filter banks are required. Such structures decompose a signal into a coarse representation along with added details. To achieve this representation, the relationship between the expansion coefficients at lower and higher scale levels needs to be defined. This can be done by using a scaled and shifted version of equation (18) along with simple transformations, as reported in (Burrus et al., 1998). The relation is defined by:

$$c_j(k) = \sum_{n} h(n - 2k)\, c_{j+1}(n) \tag{28}$$

$$d_j(k) = \sum_{n} g(n - 2k)\, c_{j+1}(n) \tag{29}$$

where $n \in \mathbb{Z}$ and $k \in \mathbb{Z}$. Such equations are computed using well-established digital filtering theory. In particular, for finite length signals (which is the case for digital images), a Finite Impulse Response (FIR) filter is the most appropriate choice. However, since equations (28) and (29) compute one output for every two consecutive inputs, a modification needs to be made. The basic operation required here is derived from multirate signal processing theory (Fliege, 1994; Hankerson et al., 2005; Cunha et al., 2006; Lu & Do, 2007; Nguyen & Oraintara, 2008; Brislawn, 2010). It simply consists of using a down-sampler, or decimator, by a factor of two. In practice, a pair of FIR filters is applied, each followed by a decimator, as illustrated by Figure 6:

Fig. 6. Analysis Filter Bank
The filter bank is defined as a combination of a low pass filter and high pass filter, both followed by a factor of two decimation (Strang & Nguyen, 1996).Thus, the decomposition is reduced to two basic operations from the digital signal processing theory: a filtering and a down sampling.
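The filtering and down-sampling operations can be sketched for a single analysis stage as follows. The sketch assumes the relations $c_j(k) = \sum_n h(n-2k)\, c_{j+1}(n)$ and $d_j(k) = \sum_n g(n-2k)\, c_{j+1}(n)$ reported in (Burrus et al., 1998), Haar filters, and a periodic extension of the signal at its borders; the function name `analysis_stage` is hypothetical.

```python
import math

S = 1.0 / math.sqrt(2.0)
H = [S, S]     # Haar low pass (scaling) coefficients, assumed for the example
G = [S, -S]    # Haar high pass (wavelet) coefficients

def analysis_stage(c_fine, h, g):
    """One analysis filter bank stage.

    Filtering and decimation by two are merged: only the outputs at even
    shifts 2k are computed. Indices wrap around (periodic extension).
    """
    n_in = len(c_fine)
    approx, detail = [], []
    for k in range(n_in // 2):
        approx.append(sum(h[m] * c_fine[(m + 2 * k) % n_in]
                          for m in range(len(h))))
        detail.append(sum(g[m] * c_fine[(m + 2 * k) % n_in]
                          for m in range(len(g))))
    return approx, detail

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = analysis_stage(x, H, G)

energy_in = sum(v * v for v in x)
energy_out = sum(v * v for v in approx) + sum(v * v for v in detail)
print(approx, detail, energy_in, energy_out)
```

Each output list holds half as many samples as the input, and with orthonormal filters the signal energy is shared between the coarse and the detail coefficients without loss.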
The structure in Figure 6 is generally used to implement Mallat's algorithm. To allow further levels of decomposition, identical stages are cascaded, leading to a multiresolution analysis. This analysis scheme is known as the subband coding structure (Burrus et al., 1998) and is illustrated in the following figure. To recover the original signal from the analysed one, a reversed version of the analysis filter bank of Figure 6 is required. This can be achieved by using two basic operations: filtering and up-sampling (or interpolation). In multirate digital signal processing, up-sampling is performed by inserting a zero sample between two consecutive samples. Thus, for each input sample, we get two output samples. A three-stage synthesis subband coding structure is illustrated in Figure 9.
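The cascade of analysis stages and its reversal can be sketched as below. This is a minimal illustration under stated assumptions: Haar filters, periodic border extension, and hypothetical function names (`analysis_stage`, `upsample`, `synthesis_stage`, `decompose`, `reconstruct`); it is not the specific structure of the figures.

```python
import math

S = 1.0 / math.sqrt(2.0)
H = [S, S]     # Haar low pass coefficients (assumed example filters)
G = [S, -S]    # Haar high pass coefficients

def analysis_stage(x, h, g):
    """Filter with h and g, keeping one output for every two inputs."""
    n = len(x)
    approx = [sum(h[m] * x[(m + 2 * k) % n] for m in range(len(h)))
              for k in range(n // 2)]
    detail = [sum(g[m] * x[(m + 2 * k) % n] for m in range(len(g)))
              for k in range(n // 2)]
    return approx, detail

def upsample(x):
    """Insert a zero after every sample (up-sampling by two)."""
    out = []
    for v in x:
        out.extend([v, 0.0])
    return out

def synthesis_stage(approx, detail, h, g):
    """Up-sample both branches, filter them and add the results."""
    up_a, up_d = upsample(approx), upsample(detail)
    n = len(up_a)
    return [sum(h[m] * up_a[(i - m) % n] + g[m] * up_d[(i - m) % n]
                for m in range(len(h)))
            for i in range(n)]

def decompose(x, levels):
    """Cascade analysis stages on the low pass branch only."""
    approx, details = list(x), []
    for _ in range(levels):
        approx, detail = analysis_stage(approx, H, G)
        details.append(detail)
    return approx, details

def reconstruct(approx, details):
    """Run the synthesis stages from the coarsest level upwards."""
    for detail in reversed(details):
        approx = synthesis_stage(approx, detail, H, G)
    return approx

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, details = decompose(x, levels=3)
restored = reconstruct(coarse, details)
print(coarse, [len(d) for d in details], restored)
```

After three stages the eight input samples are represented by one coarse coefficient and detail bands of lengths 4, 2 and 1, and the synthesis cascade recovers the input up to rounding.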

Algorithms for Wavelet Transform computation
This section reviews a variety of algorithms dedicated to implementing wavelet transforms. We focus on both one-dimensional and two-dimensional systems.

Burt's Pyramid
Initially dedicated to lossless image coding, the pyramid algorithm was first introduced by Burt (Burt & Adelson, 1983). Basically, it decomposes a signal into a low-resolution signal along with some higher resolution signals through a repetition of reduction and expansion processes. At each level, the reduced and then expanded signal is compared with the original signal and the difference is stored. At the same time, the reduced signal is repeatedly decomposed by reusing the reducer block in the chain. The analysis/synthesis process is shown in Figure 10.

Fig. 10. Pyramidal analysis and synthesis
The reduction block performs two basic operations: low pass filtering and decimation by a factor of 2. The expansion block first up-samples the signal, then filters it with a synthesis low pass filter. To reconstruct the original signal, the difference signal at each level is added to the previously expanded signal. The resulting signal is then repeatedly expanded and added to the corresponding difference signal. For a 2-D signal, as in image processing, the decomposition and reconstruction processes are achieved through a 2-D filtering process. In this case, only 1/4 of the original signal is obtained at the output of the reducer (the decimation is performed twice). This scheme can be represented by the pyramidal structure of Figure 11. This type of decomposition makes the algorithm suitable for progressive image transmission.
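The reduce, expand and difference steps can be sketched in one dimension as follows. The 3-tap kernel (1/4, 1/2, 1/4), the periodic border handling and the function names are assumptions chosen to keep the example short; they are not the filters used by Burt and Adelson.

```python
def lowpass(x, kernel=(0.25, 0.5, 0.25)):
    """Centred 3-tap low pass filter with periodic borders (assumed kernel)."""
    n = len(x)
    return [sum(kernel[j] * x[(i + j - 1) % n] for j in range(3))
            for i in range(n)]

def reduce_level(x):
    """Reduction block: low pass filtering, then decimation by 2."""
    return lowpass(x)[::2]

def expand_level(x):
    """Expansion block: zero insertion, then low pass filtering.
    The gain of 2 compensates for the inserted zeros."""
    up = []
    for v in x:
        up.extend([v, 0.0])
    return [2.0 * v for v in lowpass(up)]

def build_pyramid(x, levels):
    """Store, per level, the difference between a signal and its
    reduced-then-expanded version; keep the coarsest signal."""
    differences, current = [], list(x)
    for _ in range(levels):
        reduced = reduce_level(current)
        predicted = expand_level(reduced)
        differences.append([c - p for c, p in zip(current, predicted)])
        current = reduced
    return current, differences

def rebuild(coarse, differences):
    """Expand and add the stored differences, coarsest level first."""
    current = coarse
    for diff in reversed(differences):
        current = [p + d for p, d in zip(expand_level(current), diff)]
    return current

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
coarse, differences = build_pyramid(x, levels=2)
restored = rebuild(coarse, differences)
print(len(coarse), [len(d) for d in differences], restored)
```

Because each difference signal is computed against exactly the expansion used at reconstruction, the input is recovered whatever low pass filter is chosen. The pyramid stores 8 + 4 + 2 values for 8 input samples, so it is an overcomplete representation.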

Mallat's Pyramidal algorithm
Mallat's pyramid is a direct consequence of the multiresolution concept developed by the same author and presented in section 6. To date, it is the most widely used approach, both in software and hardware, for implementing the wavelet transform (Masud, 1999). Since the one-dimensional decomposition and reconstruction schemes have already been introduced in section 6, we focus in this section on two-dimensional schemes, which are more suitable for image analysis and synthesis. The two-dimensional decomposition approach is based on the separability of the functions along the x and y directions. The first step is identical to the one-dimensional approach. However, instead of keeping one output and further processing the other, both the low pass and the high pass outputs are processed using two identical filter banks after a transposition of the incoming data. Thus, the image is scanned in both the horizontal and vertical directions. This results in an average image (or subimage) and three detail images, generated by the 2-D scaling function $\varphi(x,y) = \varphi(x)\varphi(y)$ and by the vertical, horizontal and diagonal wavelet functions $\psi^{1}(x,y) = \varphi(x)\psi(y)$, $\psi^{2}(x,y) = \psi(x)\varphi(y)$ and $\psi^{3}(x,y) = \psi(x)\psi(y)$, respectively. To recover the original image, the inverse process is applied. Figure 12 illustrates the analysis and synthesis stages, each built using three filter banks.
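A one-level separable decomposition can be sketched with three filter banks: one along the rows, then one on each of its two outputs after transposition. The Haar filters, the subband names `ll`, `lh`, `hl`, `hh` and the function names are assumptions of this example.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_1d(row):
    """One Haar analysis stage: low pass and high pass, decimated by 2."""
    low = [(row[2 * i] + row[2 * i + 1]) * S for i in range(len(row) // 2)]
    high = [(row[2 * i] - row[2 * i + 1]) * S for i in range(len(row) // 2)]
    return low, high

def transpose(matrix):
    return [list(col) for col in zip(*matrix)]

def filter_columns(block):
    """Transpose, apply the 1-D filter bank to each column, transpose back."""
    low_cols, high_cols = [], []
    for col in transpose(block):
        low, high = haar_1d(col)
        low_cols.append(low)
        high_cols.append(high)
    return transpose(low_cols), transpose(high_cols)

def haar_2d(image):
    """Rows first, then columns of both row outputs (three filter banks)."""
    row_low, row_high = [], []
    for row in image:
        low, high = haar_1d(row)
        row_low.append(low)
        row_high.append(high)
    ll, lh = filter_columns(row_low)    # low along rows: average + one detail
    hl, hh = filter_columns(row_high)   # high along rows: two more details
    return ll, lh, hl, hh

# Hypothetical 4x4 image with an intensity step between columns 0 and 1
image = [[1.0, 5.0, 5.0, 5.0] for _ in range(4)]
ll, lh, hl, hh = haar_2d(image)
print(ll, lh, hl, hh)
```

Each subband holds a quarter of the samples. For this test image only the average image and the subband that is high pass along the rows are non-zero, since the intensity changes along the rows only.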

Feauveau's non-dyadic structure
Based on Adelson's work (Adelson et al., 1987), this approach was introduced by Feauveau (Feauveau, 1990). The decomposition is also known as Quincunx. It differs from Mallat's two-dimensional approach in that only the decimated output of the low pass filter is transposed and then processed through a "similar" filter bank. The result is a low resolution average image along with two different detail images from two different resolution levels. The decomposition is not dyadic: the initial resolution factor of 2 is replaced by a factor of $\sqrt{2}$, leading to an asymmetrical support. Figure 14 shows an analysis and synthesis stage of a Quincunx structure. Due to the removal of the filter bank at the output of the high pass filter, as reported in (Starck et al., 1998), only one wavelet image is involved at each stage. Recently, this approach has been used in image compression schemes and found to give often better overall performance than other approaches (Stromme, 1999; Ebrahimi et al., 2002; Smith, 2003; Hankerson et al., 2005; Xiong & Ramchandran, 2005; Nai-Xiang et al., 2006; Raviraj & Sanavullah, 2007; Oppenheim & Schafer, 2010). The frequency bands of a Quincunx analysis are shown in Figure 15.

Sweldens' lifting scheme
Unlike the three previous methodologies, the lifting scheme follows another philosophy. Fourier theory is no longer involved, and the construction of any wavelet system lies only in the spatial domain. Although the explanation of the theory relies on the works of Sweldens (Sweldens, 1995, 1996; Valens, 2004), the lifting approach has links with many other schemes (Burrus et al., 1998; Do & Vetterli, 2003, 2005; Cunha et al., 2006; Lu & Do, 2007; Nguyen & Oraintara, 2008; Brislawn, 2010). The lifting-based wavelet transform can be seen as a succession of three operations: split, predict and update. In the first operation, the data is split into even and odd parts (also known as the lazy wavelet transform). Then, differences or details are calculated through the use of a predictor. Finally, to compute the average, the even part is updated using the details previously calculated. Figure 16 shows an analysis and synthesis lifting-based wavelet transform. The reconstruction applies the same operations in reverse: the update step is undone first, then the prediction, and finally the even and odd parts are merged. Figure 17 illustrates the split and merge operations using the polyphase property (Fliege, 1994).
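The split, predict and update operations can be sketched with the simplest possible choices: each odd sample is predicted by its even neighbour, and the even sample is updated by half of the resulting detail, which yields pairwise averages (an unnormalised Haar transform). These particular predict and update rules, and the function names, are assumptions made for the example.

```python
def lifting_forward(x):
    """Split, predict, update (unnormalised Haar lifting steps)."""
    even, odd = x[0::2], x[1::2]                          # split (lazy wavelet)
    detail = [o - e for o, e in zip(odd, even)]           # predict odd from even
    average = [e + d / 2 for e, d in zip(even, detail)]   # update the even part
    return average, detail

def lifting_inverse(average, detail):
    """Undo the update, undo the prediction, then merge."""
    even = [a - d / 2 for a, d in zip(average, detail)]
    odd = [d + e for d, e in zip(detail, even)]
    merged = []
    for e, o in zip(even, odd):
        merged.extend([e, o])
    return merged

x = [4, 6, 10, 12, 8, 6, 5, 5]
average, detail = lifting_forward(x)
restored = lifting_inverse(average, detail)
print(average, detail, restored)
```

Every step is inverted by simply changing the sign of the operation, which is why the inverse transform needs no separate filter design and can be computed in place.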

The Wavelet Transform revisited
In many practical problems, both the orthonormal basis (Daubechies, 1988, 1992, 1993) and the biorthogonal basis (Cody, 1994) can be used. The two bases (or families) present similarities and differences. Another scheme, called wavelet packets, which involves either an orthonormal or a biorthogonal basis, is also possible (Wickerhauser, 1994). The following briefly describes the main features of orthonormal and biorthogonal bases, together with the extension to the wavelet packet scheme. It is worth mentioning that other schemes, such as undecimated wavelets, adaptive wavelets and multiwavelets, exist but are beyond the scope of this brief overview.

Orthonormal basis
The orthonormal basis emerged from the work initiated by Mallat and Daubechies (Mallat, 1989; Daubechies, 1988, 1993). The orthonormality property is somewhat seen as the discrete version of the orthogonality property (Masud, 1999). However, the basis functions are further normalised. These concepts were mentioned when the multiresolution feature and the scaling function were introduced. The admissibility and the orthogonality conditions ensure the existence and the orthogonality of the scaling function defined by equation (18). This is achieved if:

$$\sum_{k} h(k) = \sqrt{2} \tag{30}$$

$$\sum_{k} h(k)\, h(k - 2n) = \delta(n) \tag{31}$$

Furthermore, using the two equations above together with equation (23), which defines the wavelet function, the orthogonality of the scaling function and the wavelet function at the same scale can be derived. This can be achieved only if the following equality is verified:

$$g(k) = (-1)^{k}\, h(1 - k) \tag{32}$$

The orthogonality between the wavelet coefficients and the scaling coefficients is then only a simple implication:

$$\sum_{k} h(k)\, g(k - 2n) = 0 \tag{33}$$

The scaling coefficients which satisfy equation (33) are called Quadrature Mirror Filters (QMF).
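To achieve perfect reconstruction, the synthesised signal has to be identical to the analysed one. In other words, the output of the synthesis filter bank must reproduce the signal fed to the analysis filter bank.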
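The conditions on the scaling and wavelet coefficients can be checked numerically for a concrete filter. The sketch below uses the 4-tap Daubechies scaling coefficients and the standard conditions reported in (Burrus et al., 1998): the coefficients sum to $\sqrt{2}$, they are orthonormal to their own even shifts, and they are orthogonal to the even shifts of the wavelet coefficients. The wavelet coefficients are built with the index-shifted alternating flip $g(k) = (-1)^k h(N-1-k)$, a common finite-length convention assumed here; the helper names are hypothetical.

```python
import math

R3 = math.sqrt(3.0)
NORM = 4.0 * math.sqrt(2.0)
# 4-tap Daubechies scaling coefficients
h = [(1 + R3) / NORM, (3 + R3) / NORM, (3 - R3) / NORM, (1 - R3) / NORM]

N = len(h)
# Alternating flip (assumed finite-length convention)
g = [(-1) ** k * h[N - 1 - k] for k in range(N)]

def tap(f, k):
    """Filter coefficient f(k), zero outside the filter support."""
    return f[k] if 0 <= k < len(f) else 0.0

def shifted_product(a, b, n):
    """Sum over k of a(k) * b(k - 2n)."""
    return sum(tap(a, k) * tap(b, k - 2 * n) for k in range(len(a)))

coefficient_sum = sum(h)
self_products = [shifted_product(h, h, n) for n in (-1, 0, 1)]
cross_products = [shifted_product(h, g, n) for n in (-1, 0, 1)]
print(coefficient_sum, self_products, cross_products)
```

Up to rounding, the sum equals $\sqrt{2}$, the products of the scaling coefficients with their even shifts give 0, 1, 0, and all cross products with the wavelet coefficients vanish.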

Biorthogonal basis
The biorthogonal wavelet basis can be seen as a generalisation of the orthogonal wavelet basis in which some of the restrictions imposed on the latter have been relaxed. Unlike the orthogonal case, the scaling and the wavelet functions need be neither of the same length nor even numbered. Hence, the quadrature mirror property is not applicable and is replaced with a dual property. For the perfect reconstruction equation to hold, the scaling and the wavelet coefficients have to fulfil the following equations, where $\tilde{h}$ and $\tilde{g}$ denote the dual (synthesis) scaling and wavelet coefficients:

$$\tilde{g}(k) = (-1)^{k}\, h(1 - k), \qquad g(k) = (-1)^{k}\, \tilde{h}(1 - k)$$

It is clear that when the analysis and the synthesis filters are the same, the system becomes orthogonal. The "orthogonality" condition in this case is defined by:

$$\sum_{k} \tilde{h}(k)\, h(k + 2n) = \delta(n)$$

Previously, in the orthogonal basis, only the analysis scaling coefficients (or wavelet coefficients) along with their shifted versions were used. In the biorthogonal case, the analysis scaling coefficients are kept unchanged, while their shifted versions are replaced by the shifted versions of the synthesis dual filter. In other words, the analysis filter is orthogonal to its synthesis dual filter. The biorthogonal denomination comes from this feature.
Greater flexibility can be achieved by using a basis and its dual basis, at the expense of the energy partitioning property stated by Parseval's equality, whose loss is a direct consequence of the lack of orthogonality (Burrus et al., 1998). One of the most "important" features of the biorthogonal basis is the linear phase property, which leads to symmetric filter coefficients when implementing a wavelet system. In addition, the difference in length between dual filters must be even, so that the low pass and the high pass filters are either both of odd length or both of even length. In general, biorthogonal wavelet systems present the following features (Daubechies, 1992):


- The coefficients of the filters are either real numbers or integers;
- The filters in this family present either even or odd orders;
- The low pass and the high pass filters used in the filter bank do not have the same length;
- The low pass filter is always symmetric;
- The high pass filter is either symmetric or antisymmetric.
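These features can be observed on a small biorthogonal pair. The sketch below uses the well-known 5-tap and 3-tap low pass filters (often called the LeGall 5/3 pair), rescaled here so that both sum to $\sqrt{2}$; this filter choice, the rescaling and the helper name `cross_correlation` are assumptions of the example. It checks that the analysis filter is orthogonal to the even shifts of its dual, while it is not orthogonal to its own shifts.

```python
import math

ROOT2 = math.sqrt(2.0)

# Filters stored as {index: coefficient}, centred on index 0
h = {k: v * ROOT2
     for k, v in {-2: -1 / 8, -1: 1 / 4, 0: 3 / 4, 1: 1 / 4, 2: -1 / 8}.items()}
h_dual = {k: v / ROOT2 for k, v in {-1: 1 / 2, 0: 1.0, 1: 1 / 2}.items()}

def cross_correlation(a, b, n):
    """Sum over k of a(k) * b(k + 2n), missing taps counted as zero."""
    return sum(v * b.get(k + 2 * n, 0.0) for k, v in a.items())

shifts = range(-2, 3)
dual_products = [cross_correlation(h_dual, h, n) for n in shifts]
self_products = [cross_correlation(h, h, n) for n in shifts]
symmetric = all(abs(h[k] - h[-k]) < 1e-15 for k in h)
length_difference = len(h) - len(h_dual)

print(dual_products, self_products, symmetric, length_difference)
```

The products with the dual filter reduce to a unit impulse, whereas the products of the 5-tap filter with its own shifts do not, so the energy partitioning of the orthonormal case is lost. Both filters are symmetric, and their lengths differ by an even number.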

Wavelet packets
In contrast to the "traditional" Mallat decomposition, which leads to narrow frequency bandwidths (low frequencies) and wide frequency bandwidths (high frequencies), the wavelet packet approach emerged first as a way of adjusting the resolution at high frequencies. To this end, Mallat's decomposition scheme is applied to both outputs of a filter bank, which splits the frequencies into progressively finer resolutions. The generic structure of a wavelet packet decomposition is shown in Figure 19, and the resulting frequency bandwidths are illustrated in the accompanying figure. In comparison to the classical wavelet approach, the wavelet packet scheme presents the following features (Daubechies, 1992):

- Possibility of using a different wavelet from one level to another. This strategy has been used in (Masud, 1999) to implement a two-level orthonormal wavelet packet and a three-level biorthogonal wavelet packet.

- Possibility of choosing a particular wavelet packet decomposition from the generic structure of Figure 19. Thus, one can choose either to preserve the orthonormality of the decomposition (Wickerhauser, 1994) or to highlight the peculiarities of the signal (Masud, 1999). A binary search for the best decomposition tree is also possible (Burrus et al., 1998).
However, there is a cost to be paid: the computational complexity of a wavelet packet structure is O(n log(n)), in contrast to the O(n) of the classical wavelet transform.
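The difference in cost can be made visible by counting the samples that pass through a filter bank in each scheme. The sketch below is an illustration under stated assumptions (Haar filters, a full packet tree, hypothetical function names), not an implementation of the structure of Figure 19.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_split(x):
    """One Haar filter bank: low and high outputs, each decimated by 2."""
    low = [(x[2 * i] + x[2 * i + 1]) * S for i in range(len(x) // 2)]
    high = [(x[2 * i] - x[2 * i + 1]) * S for i in range(len(x) // 2)]
    return low, high

def wavelet_packet(x, depth):
    """Full tree: every band, low or high, is split again at each level."""
    bands, processed = [list(x)], 0
    for _ in range(depth):
        next_bands = []
        for band in bands:
            processed += len(band)          # samples entering a filter bank
            low, high = haar_split(band)
            next_bands.extend([low, high])
        bands = next_bands
    return bands, processed

def mallat_tree(x, depth):
    """Classical tree: only the low pass band is split again."""
    approx, details, processed = list(x), [], 0
    for _ in range(depth):
        processed += len(approx)
        approx, high = haar_split(approx)
        details.append(high)
    return approx, details, processed

x = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
bands, packet_cost = wavelet_packet(x, depth=3)
approx, details, mallat_cost = mallat_tree(x, depth=3)
print(len(bands), packet_cost, mallat_cost)
```

For 8 samples and 3 levels, the full packet tree filters 8 samples at every level (24 in total, that is n log2(n)), whereas the classical tree filters 8 + 4 + 2 = 14 samples, which stays below 2n.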

Image compression
Even though wavelet transforms have been widely used in image coding since the late 80s, they only gained prominence in the field with the adoption of the first wavelet-based compression standard, known as the FBI fingerprint compression standard (Bradley et al., 1993). More recently, what Sweldens described in (Sweldens, 1996), under the heading "problems not sufficiently explored with wavelets", as the need to standardise a wavelet-based compression scheme has become a reality with the adoption of the JPEG2000 compression standard (Ebrahimi et al., 2002). The block diagram of the JPEG2000 standard does not differ greatly from that of the JPEG standard. The discrete wavelet transform, which replaces the DCT, is first applied to the source image. The transformed coefficients are then quantised. Finally, the output coefficients from the quantiser are entropy encoded (using techniques such as Huffman coding or arithmetic coding) to generate the compressed image (Smith, 2003; Do & Vetterli, 2005; Hankerson et al., 2005; Xiong & Ramchandran, 2005; Chappelier & Guillemot, 2006; Nai-Xiang et al., 2006; Raviraj & Sanavullah, 2007; Mallat, 2009; Oppenheim & Schafer, 2010). To recover the original image, the inverse process is applied. Figure 21 shows the basic JPEG2000 encoding scheme (Ebrahimi et al., 2002).

Image denoising
Image manipulation includes a wide range of operations such as digitising, copying, transmitting and displaying. Unfortunately, such manipulations generally degrade the image quality by introducing many types of noise. Hence, to recover the original structure of the image, the undesired added noise needs to be localised and then removed. In image processing, noise removal is achieved through filtering-based denoising techniques (Nai-Xiang & Yap-Peng, 2005; Chappelier & Guillemot, 2006; Firoiu et al., 2009; Mallat, 2009; Nafornita et al., 2009; Ruikar & Doye, 2010; Oppenheim & Schafer, 2010; Chen & Qian, 2011). Traditionally, image denoising or image enhancement is performed using either linear or non-linear filtering. Linear filtering is achieved either with spatial techniques, such as low pass filtering, or with frequency techniques, such as the Fast Fourier Transform (FFT). Statistical and morphological filters, on the other hand, are typical examples of non-linear filtering. However, these filtering techniques lead in some cases to harmful effects when applied indiscriminately to an image: if the whole image is not blurred, some of its important features (e.g. edges) are.
A solution to this problem was introduced by Donoho and Johnstone (Donoho & Johnstone, 1994). Instead of exploiting either linear or non-linear filtering, their technique consists of using the DWT followed by a thresholding operation. The method exploits the energy compaction ability of the wavelet transform to separate the image from the added noise. The role of the threshold is to eliminate the noise present in the image. Finally, the enhanced, "denoised" image is recovered by applying the inverse DWT. This method is also known as wavelet shrinkage denoising and is classified as a non-linear processing technique because of the thresholding operation involved, as illustrated in Figure 22. Another method, which achieves better performance than the previous one, consists of using an undecimated version of the DWT (Donoho & Johnstone, 1995). This choice is motivated by the fact that the DWT is not a shift-invariant transform, and as such the transform can introduce visual artefacts. These noise-like artefacts are more apparent around discontinuities in the image. However, in this particular case the inverse transform is not unique. As a solution, it is appropriate to take the average of the possible reconstructions. The computational complexity of this approach is O(n log(n)).
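The transform, threshold and inverse transform chain can be sketched in one dimension as follows. The example is a minimal illustration under stated assumptions: a single-level Haar DWT, soft thresholding of the detail coefficients only, a hand-picked threshold, and a deterministic alternating perturbation standing in for the noise. None of these choices is taken from the cited work.

```python
import math

S = 1.0 / math.sqrt(2.0)

def haar_forward(x):
    approx = [(x[2 * i] + x[2 * i + 1]) * S for i in range(len(x) // 2)]
    detail = [(x[2 * i] - x[2 * i + 1]) * S for i in range(len(x) // 2)]
    return approx, detail

def haar_inverse(approx, detail):
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) * S, (a - d) * S])
    return out

def soft_threshold(value, threshold):
    """Shrink towards zero: small coefficients vanish, large ones are reduced."""
    magnitude = max(abs(value) - threshold, 0.0)
    return math.copysign(magnitude, value)

def denoise(x, threshold):
    approx, detail = haar_forward(x)
    shrunk = [soft_threshold(d, threshold) for d in detail]
    return haar_inverse(approx, shrunk)

# Hypothetical piecewise-constant signal with a small alternating perturbation
clean = [10.0, 10.0, 10.0, 10.0, 2.0, 2.0, 2.0, 2.0]
noisy = [c + (0.3 if i % 2 == 0 else -0.3) for i, c in enumerate(clean)]
denoised = denoise(noisy, threshold=0.5)

error_before = max(abs(n - c) for n, c in zip(noisy, clean))
error_after = max(abs(d - c) for d, c in zip(denoised, clean))
print(error_before, error_after)
```

In this toy case the perturbation falls entirely into small detail coefficients, which the threshold removes, while the step between the two plateaus is carried by the approximation coefficients and is left untouched. Real noise is not so conveniently separated, which is why the choice of threshold matters in practice.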

Image watermarking
Image watermarking emerged in the mid 90s as a discipline, among the wide range of multidisciplinary field of data hiding, as a methodology of protecting digital images from any piracy act.It consists of embedding a watermark (a trace) within a digital image before using or publishing it.The efficiency of a watermarking method lies generally in its ability to fulfil three requirements: robustness, security and invisibility.
Watermarking techniques can be classified into two categories; spatial domain methods and transform-based methods.The wavelet-based watermarking technique falls into the latter.
In (Kundur & Dimitrios, 1997, 1998; Hernandez-Guzman et al., 2008), both the original image and the watermark are first transformed to the wavelet domain; the resulting image pyramids are then fused according to certain rules, which take into account the characteristics of the Human Visual System (HVS). The wavelet in this case facilitates a simultaneous spatial localisation and frequency spread of the watermark within the source image. It has been shown that the method is robust under compression, additive noise and filtering (Kundur & Dimitrios, 1997, 1998). To the best of our knowledge, there is no general baseline framework for a wavelet-based watermarking system. However, in most cases, the multiresolution feature of the transform is exploited to achieve robust image watermarking implementations (Kundur & Dimitrios, 1997, 1998; Tsekeridou & Pitas, 2000; Wu et al., 2000; Hernandez-Guzman et al., 2008).
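The idea of transforming both the host and the watermark and then fusing them in the wavelet domain can be illustrated with a small sketch. It is not the fusion rule of the cited works: a single fixed strength `alpha` stands in for the HVS-based rules, a one-level Haar transform on 1-D data replaces the image pyramids, and extraction assumes the original host is available. All names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_forward(x):
    """One-level orthonormal Haar DWT of an even-length list."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse of haar_forward."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

def embed(host, mark, alpha=0.05):
    """Add the watermark's wavelet coefficients to the host's detail band.

    The mark must be half as long as the host so that its coefficients
    (approximation followed by detail) match the host detail band in size.
    """
    h_approx, h_detail = haar_forward(host)
    m_approx, m_detail = haar_forward(mark)
    m_coeffs = m_approx + m_detail
    if len(m_coeffs) != len(h_detail):
        raise ValueError("mark must be half the length of the host")
    fused = [d + alpha * m for d, m in zip(h_detail, m_coeffs)]
    return haar_inverse(h_approx, fused)

def extract(watermarked, host, alpha=0.05):
    """Recover the mark by comparing detail bands (needs the original host)."""
    _, w_detail = haar_forward(watermarked)
    _, h_detail = haar_forward(host)
    m_coeffs = [(w - h) / alpha for w, h in zip(w_detail, h_detail)]
    half = len(m_coeffs) // 2
    return haar_inverse(m_coeffs[:half], m_coeffs[half:])

host = [12.0, 14.0, 13.0, 13.0, 40.0, 42.0, 41.0, 39.0]
mark = [1.0, 0.0, 0.0, 1.0]
marked = embed(host, mark)
recovered = extract(marked, host)
```

A small `alpha` keeps the change to the host small (invisibility), at the cost of making the embedded mark easier to destroy; a practical scheme would adapt the strength per coefficient.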

FPGA implementation
Quick time-to-market, low cost and high performance are typically the three goals that digital system designers wish to achieve when developing new products. Although each goal taken individually is achievable, the set of three is generally beyond the capabilities of traditional design and implementation approaches (Villasenor et al., 1995; Villasenor & Mangione-Smith, 1997; Barr, 1998; Ritter & Molitor, 2000; Chrysafis & Ortega, 2000; Lafruit et al., 2000; Russel & Wayne, 2001; Ebrahimi et al., 2002; Nibouche et al., 2000, 2001a, 2001b, 2001c, 2001d, 2002, 2003; Katona et al., 2006; Angelopoulou et al., 2008; Lande et al., 2010). Versatile hardware such as general purpose processors (GPP), for example, can perform a wide range of operations and tasks, but fails to reach the system speed of more specialised hardware. In contrast, application-specific hardware, such as Application Specific Integrated Circuits (ASICs), can perform a restricted set of operations/tasks more quickly, but at the cost of losing generality. Hence, reconfigurable computing, generally in the form of Field Programmable Gate Arrays (FPGAs), appears to be the promised land for hardware designers. This old/new paradigm combines the flexibility of software with the performance of hardware, which leads to a good trade-off between speed and generality. Unlike custom hardware in the form of ASICs, which cannot be reused for a problem slightly different from the one it was designed for, configurable hardware based on FPGAs allows modifications at almost any stage of the design process. In fact, configurable hardware is easily upgraded (due to its inherent nature) to suit any changes to an initial design. Used in a desktop, reconfigurable hardware can be tailored to accelerate applications which require a system speed superior to that offered by general purpose processors. The hardware here needs to adapt itself to continual changes in response to end users' needs. Obviously, the reconfigurable capabilities of such hardware will not eliminate the need for the general-purpose microprocessors running in today's Personal Computers (PCs). In fact, "FPGAs will never replace microprocessors for general-purpose computing tasks", as stated by Villasenor J. and Mangione-Smith W. in (Villasenor & Mangione-Smith, 1997).
The idea of reconfigurable computing was first introduced in the late 60s at the University of California at Los Angeles (UCLA) (Villasenor & Mangione-Smith, 1997; Barr, 1998). However, the real emergence of this new paradigm for hardware computation was driven by the commercialisation of the first SRAM-based FPGA by Xilinx Corporation in 1986 (Russel & Wayne, 2001). The first configurable devices from both Xilinx Corporation and Altera Corporation, typically composed of a fine-grained structure, allowed a system speed in the range of 2 MHz to 5 MHz and a chip area of fewer than 100 logic blocks (Russel & Wayne, 2001). The efforts deployed by academics and industry since then have brought to light new developments but also new challenges. In fact, the reconfigurable hardware field has matured dramatically, both through developments in microelectronic technology, which led to the emergence of a new range of devices providing more than a million system gates (e.g. the Xilinx Virtex family), and through the continual emergence of a wide range of FPGA-based systems.
In general, FPGA devices are organised as 2D arrays of configurable logic blocks or logic elements. The parallel nature of FPGA devices makes them very good targets for applications that require parallel processing, such as image and video processing. In such applications, these FPGA devices are used either as co-processors or as accelerators (real time applications). It is not the aim of this section to survey the field of wavelet-based FPGA implementation but rather to highlight some implementations of the DWT for applications in the field of image/video processing (in line with section 9).
Due to its high computational complexity, real time video compression has always been a very challenging topic for digital system designers. The implementation of such systems on FPGAs is no exception to the rule. In probably one of the earliest works in the field, Villasenor et al. (Villasenor et al., 1995) investigated wavelet transform based video compression algorithms for use in low-power wireless communications. Using this previous work as a basis, the same authors further described two implementations using a single FPGA (Schoner et al., 1995). In the first approach, the proposed video compression scheme is directed towards low-complexity implementations using a single in-system reprogrammable FPGA. The optimisation of the algorithm to fit the system results in an efficient implementation; however, the system is limited to a single compression algorithm. In the second approach, to allow more flexibility, the FPGA chip is combined with an external special purpose Video Signal Processor (VSP). The FPGA/VSP combination allows the implementation of four common compression algorithms and their execution in real time.
The proposed design schemes were both implemented on a Xilinx FPGA. The first design runs at 20 frames per second (fps) when processing 256x256 frames with a spatial precision of 8 bits. It includes a wavelet transform, a simplified quantiser and a run-length encoder. The second scheme is capable of implementing a DCT, a 2-D FIR filter, a Vector Quantisation (VQ) scheme and the wavelet transform using a single generic equation. It delivers different performances: 13.3 fps for a 7x7 mask 2-D filter, 55 fps for an 8x8 block DCT, 7.4 fps for a 4x4 VQ (at 1/2 bit per pixel) and 35.7 fps for a single wavelet stage.
Partitioning images prior to computation is a well known technique in the field of image processing. It has been widely used in DCT-based image compression schemes. In the last decade, this technique has been adopted in the new wavelet-based JPEG 2000 compression standard (Ebrahimi et al., 2002). In (Ritter & Molitor, 2000), a biorthogonal Cohen-Daubechies-Feauveau (CDF) 5/3 wavelet pair followed by an Embedded Zerotree Encoding (EZT) technique is used in two compression schemes, one lossy and one lossless. Since the 5/3 pair is an integer-to-integer wavelet, a lifting scheme based architecture is used for the implementation. In the lossless compression scheme, the image is partitioned into a set of 32x32 tiles before processing. The system is then implemented on an FPGA prototyping board, where it achieved an operating speed of 20 MHz. In the second scheme, in order to avoid an excessive increase of the internal memory, a rearrangement of the filtered and decimated outputs is proposed (interlocked external memory access). Because of its integer nature (integer to integer), as well as its adoption in the JPEG 2000 standard, the biorthogonal 5/3 wavelet is the focus of many studies. Since wavelet transform algorithms are inherently multi-level and require complex computation schedules in hardware, a comparison of different computation schedule algorithms is presented in (Angelopoulou et al., 2008). The most widely used schedule algorithms, namely the row-column based algorithm (Mallat, 1989), the line based algorithm (Chrysafis & Ortega, 2000) and the block based algorithm (Lafruit et al., 2000), are implemented on FPGA using the lifting scheme and a 2D DWT architecture. The 2D DWT FPGA implementation is fully parameterised. Based on the lifting scheme, Lande et al. (Lande et al., 2010) introduce a robust invisible watermarking method to be used with still images. The scheme is incorporated in the JPEG 2000 lossless algorithm, featuring integer-to-integer biorthogonal 5/3 CDF wavelet filters. The proposed algorithm targets the consumer electronics market. The objectives of the proposed FPGA implementation of this wavelet-based watermarking scheme include low power usage, real time performance, robustness and ease of integration.
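To make the integer-to-integer property of the 5/3 pair more concrete, the following sketch applies one level of the reversible 5/3 lifting steps (a predict step followed by an update step) to a 1-D integer signal. It is a software illustration only, not the hardware architecture of any of the cited designs; the function names are hypothetical, the input length is assumed to be even, and the borders are handled by symmetric extension.

```python
def lift53_forward(x):
    """One level of the reversible 5/3 lifting transform on integers.

    Returns (s, d): the low-pass (approximation) and high-pass (detail) bands.
    """
    n = len(x)
    if n < 2 or n % 2:
        raise ValueError("length must be even and at least 2")
    half = n // 2
    even, odd = x[0::2], x[1::2]
    # Predict: each odd sample minus the floor-average of its even neighbours.
    d = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]  # symmetric border
        d.append(odd[i] - (even[i] + right) // 2)
    # Update: each even sample plus a rounded quarter of neighbouring details.
    s = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]                # symmetric border
        s.append(even[i] + (left + d[i] + 2) // 4)
    return s, d

def lift53_inverse(s, d):
    """Undo lift53_forward exactly by reversing the two lifting steps."""
    half = len(s)
    even = []
    for i in range(half):
        left = d[i - 1] if i > 0 else d[0]
        even.append(s[i] - (left + d[i] + 2) // 4)
    x = []
    for i in range(half):
        right = even[i + 1] if i + 1 < half else even[i]
        x.append(even[i])
        x.append(d[i] + (even[i] + right) // 2)
    return x

row = [10, 12, 14, 13, 9, 7, 7, 8]
low, high = lift53_forward(row)
restored = lift53_inverse(low, high)
```

Because every step adds or subtracts an integer that the inverse can recompute from the data it already holds, the reconstruction is exact despite the rounding, which is what makes the scheme suitable for lossless coding and for hardware using only additions and shifts.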
Denoising still images and video sequences is another favoured application of the wavelet transform (see section 9). Katona et al. (Katona et al., 2006) suggest a real time wavelet-based video denoising system and its implementation on FPGA. The method adopts a parallel approach to implement an advanced wavelet domain noise filtering algorithm, which uses a non-decimated wavelet transform. The approach relies on the "à trous" wavelet algorithm and the Daubechies minimum phase wavelet (Daub4). The proposed implementation is decentralised and distributed over two FPGAs. As a proof of concept, digitised television signals are adopted as real time video sources.
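The non-decimated "à trous" idea can be sketched in a few lines. The version below is an assumption-laden illustration rather than the filter bank used by Katona et al.: it replaces the Daub4 filters with a two-tap averaging filter, works on a 1-D signal with periodic borders, and uses the additive form in which the signal equals the final smooth band plus the sum of the detail bands. The function names are hypothetical.

```python
def atrous_decompose(x, levels):
    """Non-decimated decomposition: every band keeps the input length.

    At level j the two filter taps are 2**j samples apart, i.e. the filter
    is dilated by inserting "holes" instead of downsampling the signal.
    """
    n = len(x)
    smooth = [float(v) for v in x]
    details = []
    for j in range(levels):
        step = 2 ** j
        coarser = [(smooth[i] + smooth[(i + step) % n]) / 2.0 for i in range(n)]
        details.append([a - b for a, b in zip(smooth, coarser)])
        smooth = coarser
    return smooth, details

def atrous_reconstruct(smooth, details):
    """Additive synthesis: final smooth band plus all detail bands."""
    x = list(smooth)
    for band in details:
        x = [a + b for a, b in zip(x, band)]
    return x

def rotate(x, k):
    """Circular left shift by k samples."""
    return x[k:] + x[:k]

signal = [1, 3, 2, 8, 5, 5, 0, 4]
smooth, details = atrous_decompose(signal, 2)
rebuilt = atrous_reconstruct(smooth, details)

# Shift invariance: transforming a shifted signal shifts every band.
smooth_s, details_s = atrous_decompose(rotate(signal, 3), 2)
```

Since no band is downsampled, each level can be computed by the same filtering structure with a different tap spacing, which is part of what makes this family of transforms attractive for parallel implementations.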

Conclusion
Since the late 80s, the wavelet transform has been widely used in different scientific applications including signal and image processing. This growing success, which has been marked by the adoption of some wavelet-based schemes, is due to features inherent to the transform, such as time-scale localisation and multiresolution capabilities. In this chapter, the basic concepts of the wavelet transform have been introduced. First, the historical development of the wavelet transform and its advent in the field of signal and image processing were reviewed. Then, its features and the mathematical foundations behind it were presented. To ease the understanding of wavelet theory, the related notations and terms, such as the scaling function, multiresolution, filter bank and others, were described and briefly explained.
Depending on the application at hand, different algorithms for implementing the wavelet transform have been developed. Four of these algorithms, namely Burt's pyramid, Mallat's algorithm, Feauveau's scheme and the lifting scheme, were briefly described. Finally, some wavelet-based image processing applications were also given.

References
Adelson Implementation and comparison of the 5/3 lifting 2D discrete wavelet transform

Fig. 2. Wavelet time-frequency plane ((Graps, 1995) with minor modifications)
At this stage, and after this brief introduction, it is natural to ask the question: what, then, are wavelet transforms?

Fig. 12. Two-dimensional Mallat's analysis and synthesis tree
In this case, the frequency band is reduced by a factor of four at each stage (halved along each dimension), as represented by Figure 13.

Figure 20.In this scheme, the number of filters increases by a factor of ) 2 (2 j i  at each successive subband, where i and j represent two consecutive resolutions and 1 j i   .
Fig. 20. Two-band analysis and synthesis filter bank
Fig. 21. Wavelet-based encoding scheme