1. Introduction
With the growing demand for multimedia applications, especially high-definition images, efficient storage and transmission of images have become issues of great concern [1, 2, 3, 4]. Image compression deals with reducing the number of bits used to represent an image. In addition, the resolution of an image plays an important role in many image processing applications, such as video resolution enhancement [5], feature extraction [6], and satellite image resolution enhancement [7]. In general, there are two types of super-resolution approaches: multiple-image and single-image super resolution. Multiple-image super-resolution algorithms, such as [8], [9], [10], receive several low-resolution images of the same scene as input and usually employ a registration algorithm to find the transformation between them. This transformation information is then used, along with the estimated blurring parameters of the input low-resolution images, to combine them into a higher-scale framework and produce a super-resolved output image. For multiple-image super-resolution algorithms to work properly, there should be subpixel displacements between the input low-resolution images. Furthermore, these subpixel displacements should be estimated accurately by the registration algorithm, which is usually a challenging task, especially when complicated motion of nonrigid objects, such as the human body, needs to be modeled. These algorithms are guaranteed to produce proper higher-resolution details; however, their improvement factors are usually limited to values close to 2 [11].
Single-image super-resolution algorithms, such as [12, 13, 14], cannot utilize subpixel displacements because they have only a single input. Instead, they employ a training algorithm to learn the relationship between a set of high-resolution images and their low-resolution counterparts. This learned relationship is then used to predict the missing high-resolution details of the input low-resolution images. Depending on the relationship between the training low- and high-resolution images, these algorithms can produce high-resolution images that are far better than their inputs, with improvement factors much larger than 2 [15]. Hence, compressing an image while still being able to reconstruct it with good resolution is important. Information theory plays an important role in image compression. It can also be used to reduce the dimensionality of data such as histograms [16, 17].
There are two categories of image compression techniques, namely lossless and lossy image compression [18, 19]. In lossless image compression, the original image can be perfectly recovered from the compressed image, whereas in lossy compression the original image cannot be perfectly recovered because some information is lost as a result of compression. Lossless compression is used in applications with strict fidelity requirements, such as medical imaging. Lossy compression techniques are very popular because they offer higher compression ratios. The objective of image compression is to achieve as much compression as possible with little loss of information [20, 21].
Wavelets also play a significant role in many image processing applications [12, 22, 23, 24]. The two-dimensional wavelet decomposition of an image is performed by applying the one-dimensional discrete wavelet transform (DWT) along the rows of the image first, and then decomposing the results along the columns. This operation results in four decomposed subband images, referred to as Low-Low (LL), Low-High (LH), High-Low (HL), and High-High (HH). The frequency components of these subbands cover the full frequency spectrum of the original image. Figure 1 shows the different subband images of the Lena image, where the top-left image is the LL subband and the bottom-right image is the HH subband.
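The row-then-column decomposition described above can be illustrated with a minimal sketch. The orthonormal Haar filter, the function names `haar_dwt_1d` and `haar_dwt2`, and the subband naming order are assumptions made for illustration; the text does not state which wavelet family is used.

```python
import numpy as np

def haar_dwt_1d(x, axis):
    """One-level orthonormal Haar DWT along `axis` (length must be even)."""
    x = np.moveaxis(np.asarray(x, dtype=float), axis, 0)
    even, odd = x[0::2], x[1::2]
    low = (even + odd) / np.sqrt(2.0)    # approximation coefficients
    high = (even - odd) / np.sqrt(2.0)   # detail coefficients
    return np.moveaxis(low, 0, axis), np.moveaxis(high, 0, axis)

def haar_dwt2(image):
    """Filter along the rows first, then along the columns."""
    low_r, high_r = haar_dwt_1d(image, axis=1)   # 1-D DWT of every row
    ll, lh = haar_dwt_1d(low_r, axis=0)          # columns of the low-pass part
    hl, hh = haar_dwt_1d(high_r, axis=0)         # columns of the high-pass part
    return ll, lh, hl, hh

# Toy 8 x 8 ramp standing in for an image (not one of the benchmark images).
image = np.arange(64, dtype=float).reshape(8, 8)
ll, lh, hl, hh = haar_dwt2(image)
print(ll.shape, lh.shape, hl.shape, hh.shape)
```

Each subband has half the height and half the width of the input, and with an orthonormal filter the four subbands together carry the same energy as the original image.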
In this research work, a new lossy compression technique which employs singular value decomposition (SVD) and wavelet difference reduction (WDR) is presented. SVD is a lossy image compression technique which can be regarded as a quantization process that reduces the psychovisual redundancies of the image [25, 26]. In order to enhance the resolution of the decompressed image, the stationary wavelet transform (SWT) is used. WDR is one of the state-of-the-art techniques in image compression based on the wavelet transform. It is a lossy image compression technique which achieves compression by first taking the wavelet transform of the input image and then applying the difference reduction method to the transform values [27, 28, 29, 30].
Wavelet transform based techniques also play a significant role in many image processing applications, in particular in resolution enhancement, and many novel resolution enhancement techniques using wavelet transforms have recently been proposed. Demirel and Anbarjafari [31] proposed an image resolution enhancement technique based on the interpolation of the input image and of the high-frequency subband images obtained by DWT. In their technique, SWT is used in order to enhance the edges. At the same time, the input image is decomposed into four frequency subband images by using DWT. After that, the input image as well as the high-frequency subbands are interpolated. The high-frequency subbands of SWT are used to modify the estimated high-frequency subbands. Finally, inverse DWT (IDWT) is applied to combine all frequency subbands in order to generate a high-resolution image. Figure 2 shows the block diagram of the method proposed in [31].
The authors in [32] proposed a learning-based super-resolution algorithm. In their proposed algorithm, a multi-resolution wavelet approach was adopted to perform the synthesis of local high-frequency features. Two frequency subbands, LH and HL, were estimated based on wavelet frame in order to get a high-resolution image. The LH and HL frequency subbands were used to prepare their training sets. Then, they used the training set in order to estimate wavelet coefficients for both LH and HL frequency subbands. Finally, the IDWT was used in order to reconstruct a high-resolution image.
In [33], the authors used a complex wavelet-domain image resolution enhancement algorithm based on the estimation of wavelet coefficients. Their method uses a dual-tree complex wavelet transform (DT-CWT) in order to generate a high-resolution image. First, they estimate a set of wavelet coefficients from the DT-CWT decomposition of the rough estimation of the high-resolution image. Then, the inverse DT-CWT is used to combine the wavelet coefficients and the low-resolution input image in order to reconstruct a high-resolution image. Figure 3 shows the block diagram of the proposed method in [33].
Patel and Joshi [34] proposed a new learning-based approach for super resolution using DWT. The novelty of their method lies in designing an application-specific wavelet basis (filter coefficients). First, the filter coefficients, together with high-frequency details learned in the wavelet domain, are used to obtain an initial estimate of the super-resolution image. Then, a sparsity-based regularization framework that accounts for the image degradation is used. Finally, the super-resolution image is estimated from the initial super-resolution estimate and the estimated wavelet filter coefficients. Their algorithm has several advantages: it avoids the use of registered images while learning the initial estimate, it uses a sparsity prior to preserve neighborhood dependencies in the super-resolution image, and it uses the estimated wavelet filter coefficients to represent an optimal point spread function that models the image acquisition process. Figure 4 illustrates the block diagram of the method proposed in [34].
In [35], similar to the method proposed in [30], the authors used the wavelet domain in order to generate a super-resolution image from a single low-resolution image. They proposed an intermediate stage with the aim of estimating the high-frequency subbands. The intermediate stage consists of an edge preservation procedure and mutual interpolation between the input low-resolution image and the high-frequency subband images. Sparse mixing weights are calculated over blocks of coefficients in an image, which provides a sparse signal representation in the low-resolution image. Finally, they used IDWT to combine all frequency subbands in order to reconstruct a high-resolution image. The block diagram of their proposed method is shown in Fig. 5.
In [36], the authors proposed a learning-based approach for super-resolving an image captured at low spatial resolution. They used a low-resolution test image and a database of high- and low-resolution images as inputs to the proposed method. First, they used DWT in order to obtain the high-frequency details of the database images. Then, an initial high-resolution estimate was obtained by using these high-frequency details. In their observation model, a low-resolution image was modelled as an aliased and noisy version of the corresponding high-resolution image, and the aliasing matrix entries were estimated from the initial high-resolution estimate and the test image. After that, the prior model for the super-resolved image was chosen as an inhomogeneous Gaussian Markov random field (IGMRF), and the model parameters were estimated using the same initial high-resolution estimate. They used maximum a posteriori (MAP) estimation in order to arrive at a cost function, which was minimized using a simple gradient descent approach. Figure 6 shows the block diagram of the method proposed in [36].
In the proposed compression technique, the input image is first decomposed into its different frequency subbands by using one-level DWT. The LL subband is then compressed by using WDR, and the high-frequency subbands, i.e., LH, HL, and HH, are compressed by using SVD. The proposed technique has been tested on several well-known images such as Lena, Pepper, Boats, and Airfield. The results of this technique have been compared with those of JPEG2000 and WDR with arithmetic coding. The quantitative experimental results based on PSNR show that the proposed technique outperforms the aforementioned techniques. The SVD and WDR image compression techniques are discussed in the next section.
2. Review of singular value decomposition and wavelet difference reduction
2.1. Singular value decomposition
From a mathematical point of view, an image can be represented by a matrix consisting of one layer if the image is grayscale or three layers if it is RGB. Applying SVD to a grayscale image, represented by the single-layer matrix A, yields three matrices U, Σ, and V, where U and V are orthogonal and Σ is a diagonal matrix containing the singular values of A. In what follows, the SVD procedure is briefly reviewed. The relation between the matrix A and its decomposed components U, Σ, and V is given in Eqn. (1), where the dimensions of all the matrices are shown, given that A is of size m×n [37, 38, 39]:
Eqn. (2) shows how a matrix
Some columns of U and the corresponding rows of V^T are then discarded, and the compressed image is reconstructed by multiplying the remaining parts. This is shown in Eqn. (3):
The compressed image is then obtained as shown in Eqn. (4):
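The relations described in this subsection follow the standard SVD formulation. As a hedged sketch in conventional notation, they can be written as below; the truncation rank k and the column vectors u_i and v_i are symbols assumed here, and the grouping need not match the numbering of Eqns. (1) to (4) exactly.

```latex
% Assumed standard form of the decomposition of an m x n matrix A
A_{m \times n} = U_{m \times m} \, \Sigma_{m \times n} \, V^{T}_{n \times n}

% Sigma holds the singular values in descending order, with r = \min(m, n)
\Sigma = \operatorname{diag}(\sigma_1, \sigma_2, \ldots, \sigma_r), \qquad
\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_r \ge 0

% Keeping only the first k columns of U and V and the k largest singular values
A_k = U_{m \times k} \, \Sigma_{k \times k} \, V^{T}_{k \times n}
    = \sum_{i=1}^{k} \sigma_i \, u_i v_i^{T}, \qquad k \le r
```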
Because the singular values in Σ are sorted in descending order, ignoring the low singular values will not, from a psychovisual point of view, significantly reduce the visual quality of the image. Figure 7 shows Lena’s picture reconstructed by using different numbers of singular values. The fact that an image can be reconstructed from fewer singular values makes SVD suitable for compression. Because the ignored singular values cannot be recovered after reconstruction of the image, compression by SVD is lossy [33].
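The effect of discarding low singular values can be sketched with NumPy's `numpy.linalg.svd`. The helper names `svd_compress` and `svd_reconstruct` and the random test matrix are assumptions for illustration only; the matrix is a stand-in, not one of the images used in this work.

```python
import numpy as np

def svd_compress(a, k):
    """Keep the k largest singular values and the matching singular vectors."""
    u, s, vt = np.linalg.svd(a, full_matrices=False)  # s is sorted descending
    return u[:, :k], s[:k], vt[:k, :]

def svd_reconstruct(u_k, s_k, vt_k):
    """Rebuild the rank-k approximation U_k diag(s_k) V_k^T."""
    return (u_k * s_k) @ vt_k   # scales each kept column of U by its singular value

rng = np.random.default_rng(0)
a = rng.random((64, 64))        # synthetic stand-in for a grayscale image
errors = []
for k in (1, 8, 32, 64):
    approx = svd_reconstruct(*svd_compress(a, k))
    errors.append(np.linalg.norm(a - approx))   # Frobenius norm of the residual
    print(k, errors[-1])
```

The residual shrinks as k grows and vanishes (up to rounding) when all singular values are kept, which is the trade-off between storage and quality that the compression exploits.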
2.2. Wavelet difference reduction
WDR is a compression technique based on the difference reduction method. The wavelet transform of the input image is first computed; bit plane encoding is then applied to the transform values. The bit plane encoding procedure starts with the initialization stage, where a threshold T_{0} is chosen such that T_{0} is greater than the magnitudes of all the transform values and at least one of the transform values has a magnitude of at least T_{0}/2. The next stage is the threshold update, where the threshold T = T_{k-1} is updated to T = T_{k}, with T_{k} = T_{k-1}/2. New significant transform values w(i), which satisfy T ≤ |w(i)| < 2T, are then identified at the significance pass stage. The index positions of these significant transform values are then encoded using the difference reduction method. At the refinement pass stage, already quantized values w_{Q}, which satisfy |w_{Q}| ≥ 2T, are refined in order to reduce the error [27, 29, 30].
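The threshold initialization, threshold halving, and significance pass can be sketched as follows. This is a simplified illustration under stated assumptions, not a full WDR coder: positions are 1-based in a fixed scan order, the function names and the sample values are made up, and the sign coding, refinement pass, re-indexing of the remaining insignificant values, and arithmetic coding are all omitted. Difference reduction is taken here to mean coding the gaps between successive significant positions in binary with the leading 1 dropped.

```python
import math

def initial_threshold(values):
    """Power of two T0 with |w| < T0 for all w and max|w| >= T0 / 2."""
    peak = max(abs(w) for w in values)   # assumes at least one nonzero value
    return 2.0 ** (math.floor(math.log2(peak)) + 1)

def significance_pass(values, threshold, known):
    """1-based positions of newly significant values: T <= |w| < 2T."""
    return [pos for pos, w in enumerate(values, start=1)
            if pos not in known and threshold <= abs(w) < 2 * threshold]

def difference_reduction(positions):
    """Binary expansion of each gap between positions, leading 1 removed."""
    codes, previous = [], 0
    for pos in positions:
        codes.append(bin(pos - previous)[3:])   # strip '0b' and the leading 1
        previous = pos
    return codes

# Made-up transform values in scan order (illustrative only).
values = [-3.0, 21.5, 0.5, -9.0, 14.0, 2.0, 5.5, -30.0]
threshold = initial_threshold(values)
known, passes = set(), []
for _ in range(3):
    threshold /= 2                       # T_k = T_{k-1} / 2
    new = significance_pass(values, threshold, known)
    passes.append((threshold, new, difference_reduction(new)))
    known.update(new)
for entry in passes:
    print(entry)
```

Because every gap is at least 1, its binary expansion always starts with a 1, so that bit carries no information and can be dropped; in a complete coder the sign symbols of the significant values act as separators between the shortened codes.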
3. The proposed lossy image compression technique
The proposed image compression technique is a lossy compression technique. First, the image is decomposed into its frequency subbands by using DWT. Among these subbands, the LL subband is compressed by using WDR, and the high-frequency subband images are compressed by using SVD. The number of singular values used to reconstruct the high-frequency subbands can be reduced to 1, i.e., the largest singular value is enough to reconstruct the high-frequency subbands. If only one singular value is used to reconstruct a matrix, only one column of each of the U and V matrices is needed. The loss in quality is not psychovisually noticeable up to some point. The compression ratio of the proposed technique is obtained by dividing the total number of bits required to represent the original image by the total number of bits in the compressed representation, i.e., the WDR bit stream for LL plus the SVD data for LH, HL, and HH.
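The compression ratio bookkeeping can be illustrated with a short sketch. The 16 bits per stored SVD value and the WDR stream length below are hypothetical numbers chosen only to make the arithmetic concrete; neither is reported in the text, and the function names are likewise assumptions.

```python
def rank_k_storage_bits(m, n, k, bits_per_value):
    """Bits for k columns of U (m values each), k columns of V (n values each)
    and k singular values."""
    return k * (m + n + 1) * bits_per_value

def compression_ratio(original_bits, wdr_bits, svd_bits):
    """Original size divided by the WDR stream (LL) plus the SVD data."""
    return original_bits / (wdr_bits + sum(svd_bits))

original_bits = 256 * 256 * 8     # 256 x 256 pixels, 8-bit grayscale
side = 128                        # subband side length after one-level DWT
# Rank-1 SVD for each of LH, HL and HH; 16 bits per value is an assumption.
svd_bits = [rank_k_storage_bits(side, side, 1, 16)] * 3
wdr_bits = 14000                  # hypothetical length of the WDR bit stream
ratio = compression_ratio(original_bits, wdr_bits, svd_bits)
print(round(ratio, 2))
```

With a single singular value per subband, the SVD part costs only m + n + 1 stored values per subband, so in this sketch the WDR bit stream of the LL subband ends up determining most of the final ratio.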
Decompression is carried out by taking the inverse WDR (IWDR) of the bit streams in order to reconstruct the LL subband; in parallel, the matrix multiplications are carried out in order to reconstruct the LH, HL, and HH subbands. Because of the losses caused by ignoring low singular values, the high-frequency subbands need to be enhanced. For this purpose, the stationary wavelet transform (SWT) is applied to the LL subband image, which results in new low- and high-frequency subbands. These high-frequency subbands have the same orientations as the ones obtained by DWT (i.e., horizontal, vertical, and diagonal), so they are added to the respective subbands reconstructed by matrix multiplication. The LL subband image obtained by IWDR and the enhanced LH, HL, and HH subbands are then combined by using inverse DWT (IDWT) in order to reconstruct the decompressed image. The enhancement of the high-frequency subbands by using SWT results in a sharper decompressed image. The block diagram of the proposed lossy image compression technique is shown in Fig. 8. The experimental qualitative and quantitative results are presented and discussed in the next section.
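The enhancement and recombination step can be sketched as follows, assuming an orthonormal Haar filter, periodic border handling, and unweighted addition of the SWT subbands; none of these choices is specified in the text, and the function names are illustrative. The inputs stand in for the IWDR output and the SVD-reconstructed subbands.

```python
import numpy as np

S = np.sqrt(2.0)

def haar_idwt2(ll, lh, hl, hh):
    """Inverse one-level orthonormal Haar DWT (undo columns, then rows)."""
    def merge(low, high, axis):
        low, high = np.moveaxis(low, axis, 0), np.moveaxis(high, axis, 0)
        out = np.empty((2 * low.shape[0],) + low.shape[1:])
        out[0::2] = (low + high) / S     # even-indexed samples
        out[1::2] = (low - high) / S     # odd-indexed samples
        return np.moveaxis(out, 0, axis)
    low_r = merge(ll, lh, axis=0)
    high_r = merge(hl, hh, axis=0)
    return merge(low_r, high_r, axis=1)

def haar_swt2(x):
    """One-level undecimated Haar transform; subbands keep the size of x."""
    def split(a, axis):
        nxt = np.roll(a, -1, axis=axis)  # periodic extension at the border
        return (a + nxt) / S, (a - nxt) / S
    low_r, high_r = split(x, axis=1)
    ll, lh = split(low_r, axis=0)
    hl, hh = split(high_r, axis=0)
    return ll, lh, hl, hh

def enhance_and_reconstruct(ll, lh, hl, hh):
    """Add the SWT details of LL to the reconstructed details, then apply IDWT."""
    _, s_lh, s_hl, s_hh = haar_swt2(ll)
    return haar_idwt2(ll, lh + s_lh, hl + s_hl, hh + s_hh)

rng = np.random.default_rng(1)
ll = rng.random((4, 4)) * 255        # stand-in for the IWDR output
zeros = np.zeros_like(ll)            # stand-in for heavily truncated SVD subbands
plain = haar_idwt2(ll, zeros, zeros, zeros)
sharp = enhance_and_reconstruct(ll, zeros, zeros, zeros)
print(plain.shape, sharp.shape)
```

Because the SWT is undecimated, its subbands have the same size as the LL subband and therefore match the DWT detail subbands element by element, which is what makes the addition possible without any resampling.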
4. Experimental results and discussion
As mentioned in the Introduction, the proposed lossy image compression technique was tested, for comparison purposes, on several benchmark images, namely Lena, Pepper, Boats, Airfield, and Goldhill. All input images were 8-bit grayscale with a resolution of 256 × 256 pixels. Tables 1, 2, and 3 provide a quantitative comparison between the proposed technique, JPEG2000, and WDR [40, 41] based on PSNR values, in dB, for compression ratios of 20:1, 40:1, and 80:1, respectively.
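For reference, PSNR for 8-bit images is commonly computed from the mean squared error as sketched below. The text does not spell out its PSNR formula, so the standard definition with a peak value of 255 is assumed, and the function name `psnr` is illustrative.

```python
import numpy as np

def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB (standard definition, assumed here)."""
    diff = np.asarray(original, dtype=float) - np.asarray(reconstructed, dtype=float)
    mse = np.mean(diff ** 2)             # mean squared error over all pixels
    if mse == 0:
        return float("inf")              # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

# Toy example: every pixel is off by 5 gray levels, so the MSE is 25.
reference = np.zeros((4, 4))
degraded = np.full((4, 4), 5.0)
print(psnr(reference, degraded))
```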
Tables 1, 2, and 3 show that the proposed method yields higher PSNR values than WDR and JPEG2000 for every test image and compression ratio. The size of this improvement is easier to appreciate when one keeps in mind that PSNR is expressed in dB, i.e., on a logarithmic scale, so that a difference of a few dB corresponds to a considerable difference in the underlying error. More specifically, if one calculates the difference between the PSNR values obtained using WDR and JPEG2000, and then the difference between those of JPEG2000 and the proposed method, the latter is much larger than the former, although JPEG2000 is generally regarded as performing better than WDR. Thus, it can be concluded that the proposed method brings a substantial improvement in PSNR over both WDR and JPEG2000.
Table 1. PSNR (dB) at a compression ratio of 20:1
Image | WDR | JPEG2000 | Proposed
Lena | 35.72 | 35.99 | 39.14
Pepper | 34.21 | 35.07 | 40.07
Boats | 32.42 | 33.18 | 35.97
Airfield | 27.02 | 27.32 | 31.43
Goldhill | 31.76 | 32.18 | 38.05
Table 2. PSNR (dB) at a compression ratio of 40:1
Image | WDR | JPEG2000 | Proposed
Lena | 32.44 | 32.75 | 35.98
Pepper | 31.67 | 32.40 | 36.45
Boats | 29.32 | 29.76 | 32.03
Airfield | 24.72 | 24.88 | 29.62
Goldhill | 29.43 | 29.72 | 34.19
Table 3. PSNR (dB) at a compression ratio of 80:1
Image | WDR | JPEG2000 | Proposed
Lena | 29.71 | 29.62 | 32.46
Pepper | 28.93 | 29.54 | 33.07
Boats | 26.96 | 26.76 | 30.19
Airfield | 22.71 | 22.64 | 27.32
Goldhill | 27.72 | 27.69 | 32.64
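The size of the PSNR gaps can be made concrete with a short calculation on the 20:1 results. The sketch below assumes the standard PSNR definition, under which a gain of Δ dB corresponds to a reduction of the mean squared error by a factor of 10^(Δ/10); the variable names are illustrative.

```python
# PSNR values (dB) at 20:1, copied from the table: (WDR, JPEG2000, proposed).
psnr_20_to_1 = {
    "Lena": (35.72, 35.99, 39.14),
    "Pepper": (34.21, 35.07, 40.07),
    "Boats": (32.42, 33.18, 35.97),
    "Airfield": (27.02, 27.32, 31.43),
    "Goldhill": (31.76, 32.18, 38.05),
}

gaps = {}
for name, (wdr, jpeg2000, proposed) in psnr_20_to_1.items():
    gap_jpeg_vs_wdr = jpeg2000 - wdr             # dB gained by JPEG2000 over WDR
    gap_prop_vs_jpeg = proposed - jpeg2000       # dB gained by the proposed method
    mse_factor = 10 ** (gap_prop_vs_jpeg / 10)   # implied MSE reduction factor
    gaps[name] = (gap_jpeg_vs_wdr, gap_prop_vs_jpeg, mse_factor)
    print(f"{name}: {gap_jpeg_vs_wdr:.2f} dB, {gap_prop_vs_jpeg:.2f} dB, x{mse_factor:.2f}")
```

For Lena, for example, the gap between JPEG2000 and the proposed method is 3.15 dB, which under this definition corresponds to roughly halving the mean squared error, whereas the gap between WDR and JPEG2000 is only 0.27 dB.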
In order to assess the visual quality of the output of the proposed technique, the images produced by the proposed approach were compared with those produced by JPEG2000 and WDR. Figure 9 shows a magnified part of the Lena image compressed by each of these approaches at a compression ratio of 40:1. The proposed method maintains the quality of the image while compressing it and, at the same time, achieves a better PSNR. This indicates that it strikes a reasonable trade-off between the amount of data that needs to be transferred or stored and the visibility and fidelity of the details in the image, which is probably the most difficult criterion to satisfy when devising image compression algorithms. As Fig. 9 illustrates, the overall quality of the Lena image compressed by the proposed method is satisfactory, which is consistent with its much higher PSNR value compared with JPEG2000 and WDR, and the details are clear and visible, even more so than in the output of WDR.
5. Conclusion
In this research work, a new lossy image compression technique was proposed, which uses singular value decomposition and wavelet difference reduction, followed by resolution enhancement using the discrete wavelet transform and the stationary wavelet transform.
As the first step in the proposed image compression technique, the input image was decomposed into four different frequency subbands using the discrete wavelet transform. The low-frequency subband was compressed using wavelet difference reduction, and in parallel, the high-frequency subbands were compressed using singular value decomposition. The compression ratio was obtained by dividing the total number of bits required to represent the input image by the total number of bits produced by wavelet difference reduction and singular value decomposition.
Reconstruction was carried out by using inverse wavelet difference reduction to obtain the low-frequency subband and by reconstructing the high-frequency subbands using matrix multiplications. The high-frequency subbands were enhanced using the high-frequency subbands obtained by applying the stationary wavelet transform to the reconstructed low-frequency subband. The reconstructed low-frequency subband and the enhanced high-frequency subbands were then combined using the inverse discrete wavelet transform to generate the reconstructed image.
The visual and quantitative experimental results showed that the proposed image compression technique outperformed the wavelet difference reduction and JPEG2000 techniques.