Optimized Scalable Image and Video Transmission for MIMO Wireless Channels

In this chapter, we propose new strategies to efficiently transfer compressed image/video content through wireless links using multiple antenna technology. The proposed solutions can be considered application layer/physical layer (APP-PHY) cross-layer design methods, as they involve optimizing both the application and physical layers. After a wide state-of-the-art study, we present two main solutions. The first uses a new precoding algorithm that takes into account the image/video content structure when assigning transmission powers. We show that it outperforms the existing conventional precoders. Second, a link adaptation process is integrated to efficiently assign coding parameters as a function of the channel state. Simulations over a realistic channel environment show that link adaptation activates a dynamic process that results in good image/video reconstruction quality even if the channel is varying. Finally, we incorporate soft decoding algorithms at the receiver side and show that they can bring further improvements. In fact, peak signal-to-noise ratio (PSNR) improvements of almost 5 dB are demonstrated in the case of transmission over a Rayleigh channel.


Introduction
During the past decade, there has been exponential growth in the demand for visual multimedia applications on wirelessly connected devices. Maintaining good visual quality for these applications is the central concern of service providers and system designers. Hence, there is a critical need for efficient algorithms that guarantee good visual quality for the user after transmission over corrupted, bandwidth-limited, and non-static wireless links.
The conventional communication model is based on layered components where the application layer (APP) focuses on how to efficiently compress the visual content and the physical layer (PHY) aims at transmitting the compressed stream with low residual error rates. In this context, the efforts of the multimedia research community have led to new algorithms and standards for image and video compression [1]. On the other hand, wireless communication experts have proposed new error correction and modulation methods combined with multiple antenna techniques (multiple-input multiple-output, MIMO) to decrease the error rates and enhance the system transmission capacity [2].
The aim of this chapter is to demonstrate that a joint optimization of the APP and PHY layers can substantially improve the system performance. Before presenting the main contributions, we give an overview of scalable image/video encoding and MIMO wireless communications, which are necessary to understand the rest of the chapter.

Scalable image and video coding
In communication theory, source coding is a basic operation that compresses data because the channel capacity is limited. In general, lossless compression algorithms are applied to reduce the redundancy and the correlation in the original data. Lossy compression can then be applied to remove information that is of little use to human perception. In this chapter, we mainly focus on visual content delivery, and we treat the case of image and video source content.

Image and video source coding fundamentals
A static image contains many pixels that are correlated and redundant; this is referred to as spatial correlation. Such an image can be efficiently compressed based on three components: frequency domain transform, quantization, and entropy encoding. The best-known frequency transforms used in image compression are the discrete cosine transform (DCT) and the discrete wavelet transform (DWT). The quantization then assigns a single representation to each range of transform coefficient values. Finally, to compress the remaining redundancy between the quantized coefficients, entropy encoding performs lossless compression based on variable-length coding (VLC) or arithmetic coding (AC).
Video content can be seen as a time-evolving sequence of images. Spatial correlation is therefore still present, and the three-phase image compression mechanism described previously is used for intra encoding, where an image is compressed independently of the others. However, neighbouring pictures in a video also have many similarities, which we call temporal correlation. To reach better compression rates, we can then encode only the differences in the image with respect to a reference one. This is called inter compression, and the encoding process involves motion estimation and compensation. Motion estimation delivers the motion vectors between the current and the reference images, while motion compensation helps to compute the prediction error matrix.
To provide efficient delivery for different users with different quality requirements while maintaining a single compression operation, scalable image/video encoders build a progressive stream with many quality layers. Every correctly decoded layer refines the quality of the reconstructed visual data. However, such a hierarchical structure induces different degrees of importance between the layers. In fact, if a layer is corrupted by errors, all the following layers are useless for the reconstruction even if they are correctly received. The importance of a layer therefore depends on its position within the stream: the first layers are more important than the last ones. In the following, we present some key image and video compression standards and focus on the scalable versions used in the rest of the chapter.
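As a rough illustration of the first two components, the following Python sketch applies an orthonormal 2D DCT to a square block and quantizes the coefficients with a uniform step. The helper names (`dct_matrix`, `transform_and_quantize`, `dequantize_and_inverse`), the 8 × 8 block size, and the single uniform step are assumptions made for illustration only; real codecs use standard-specific transforms and quantization rules, and the entropy coding stage is omitted here.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    i = np.arange(n).reshape(1, -1)   # sample index (columns)
    C = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)        # DC row has its own normalization
    return C

def transform_and_quantize(block, step):
    """2D DCT of a square block followed by uniform quantization."""
    C = dct_matrix(block.shape[0])
    coeffs = C @ block @ C.T
    return np.round(coeffs / step).astype(int)

def dequantize_and_inverse(q, step):
    """Rescale the quantized coefficients and apply the inverse 2D DCT."""
    C = dct_matrix(q.shape[0])
    return C.T @ (q * step) @ C

# A perfectly flat block is fully redundant: only the DC coefficient survives.
flat_block = np.full((8, 8), 100.0)
q_flat = transform_and_quantize(flat_block, step=10.0)
print(np.count_nonzero(q_flat))
```

The flat block shows why the transform helps: the spatial redundancy is concentrated into a single coefficient, which the entropy coder can then represent compactly.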
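The dependency between layers can be made concrete with a small Python sketch. The function name `usable_layers` and the boolean list describing which layers arrived intact are hypothetical; the sketch only encodes the rule stated above, namely that decoding stops at the first corrupted layer.

```python
def usable_layers(received_ok):
    """Count the leading quality layers that can be used for reconstruction.

    received_ok[i] is True when layer i arrived without errors. Because each
    layer refines the previous ones, decoding stops at the first corrupted layer.
    """
    count = 0
    for ok in received_ok:
        if not ok:
            break
        count += 1
    return count

# Layer 2 is corrupted, so layer 3 is useless even though it arrived intact.
print(usable_layers([True, True, False, True]))  # 2
```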

Image compression standards
The first and most widely used image compression standard is Joint Photographic Experts Group (JPEG). The main advantage of JPEG is its simplicity; however, it remains inefficient at high compression rates, and its compressed bitstream is very sensitive to transmission errors. In 2000, the JPEG committee proposed a new image compression standard called JPEG2000 [3]. This standard uses the DWT and delivers a scalable stream with rate-distortion optimized quality layers. After the DWT, the different sub-bands are quantized and split into precincts, which are in turn split into code-blocks. The scalable content is generated based on the DWT resolution level and the bit-plane level using the embedded block coding with optimal truncation (EBCOT) algorithm. The latter involves two steps: Tier 1, which performs the bit-plane processing and entropy encoding based on binary AC, and Tier 2, which organizes the final bitstream. JPEG2000 includes segment and synchronization markers, which reduce the quality loss in the presence of transmission errors, but this is still not sufficient for severe wireless environments.
An extension of JPEG2000 is proposed in Part 11 for image compression dedicated to wireless multimedia applications: JPEG2000 Wireless (JPWL) [4]. This standard offers many tools to make the compressed bitstream more resilient against errors. One of these tools is the use of APP layer error-correcting codes to protect the compressed data and the headers. The standard proposes the use of Reed-Solomon (RS) codes over GF($2^8$) and gives a choice between many RS codes with different error-correction capabilities. JPWL also allows the use of unequal error protection (UEP), which assigns different RS codes to the different code-blocks according to their importance for image reconstruction. JPWL also provides error-resilience features such as describing the sensitivity of a code-stream and localizing residual errors. In this chapter, we focus on image transmission in mobile wireless environments; a resilient image compression method is therefore needed, which justifies the use of JPWL.
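To give a feel for the trade-off between RS codes, the Python sketch below computes the symbol correction capacity and the code rate of an RS(n, k) code. The function names are hypothetical. RS(37,32) is the code used later in this chapter; the other two parameter pairs are illustrative values chosen here, not a list taken from the standard.

```python
def rs_correction_capacity(n, k):
    """Number of symbol errors an RS(n, k) code can correct: floor((n - k) / 2)."""
    if not 0 < k < n <= 255:   # codeword length is at most 255 over GF(2^8)
        raise ValueError("expected 0 < k < n <= 255")
    return (n - k) // 2

def rs_code_rate(n, k):
    """Fraction of the codeword that carries source data."""
    return k / n

# Stronger codes correct more symbol errors but spend more bits on redundancy.
for n, k in [(37, 32), (64, 32), (128, 32)]:
    print(n, k, rs_correction_capacity(n, k), round(rs_code_rate(n, k), 3))
```

Under UEP, a stronger code (lower rate) would be assigned to the most important parts of the stream, and a weaker one to the rest.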

Video compression standards
H264 Advanced Video Coding (H264/AVC) is one of the standards proposed by the Joint Video Team to enhance the rate-distortion performance and to form a bitstream structure suitable for network transport. Compared to its predecessors, H264/AVC keeps the same concept (intra coding and inter prediction) but adds new tools such as more precise motion estimation, image prediction based on multiple references, spatial prediction for intra pictures, and a more efficient entropy encoding called context-adaptive binary arithmetic coding (CABAC). For better interoperability, three profiles are proposed in this standard. The extended profile includes error resilience techniques that make it more suitable for wireless applications. More recently, the Joint Collaborative Team on Video Coding (JCT-VC) developed a successor to H264/AVC. The new standard is called High Efficiency Video Coding (HEVC). Compared to H264/AVC, HEVC doubles the data compression ratio at the same level of video quality.
However, H264/AVC and HEVC do not support scalability, and their application to variable rate systems is not guaranteed. In this context, scalable extensions were proposed. The main expectation of H264 Scalable Video Coding (H264/SVC) [5] is to support temporal, spatial, and quality scalability with a coding efficiency similar to that of H264/AVC. Temporal scalability is achieved by the SVC codec by introducing the concept of hierarchical B-picture coding. Interlayer prediction mechanisms are also introduced for spatial scalability, and quality scalability is achieved using medium-grain quality scalability (MGS). Since this chapter mainly focuses on scalable multimedia coding techniques, we consider the H264/SVC codec with quality and temporal scalability.

MIMO for wireless communications
As described, image and video codec designers aim at the maximum compression rate under a given quality constraint. Some resilience techniques are applied to improve the reconstruction quality when errors occur. However, in wireless communications, the transmitted data are subject to many phenomena such as noise and large- and small-scale fading. The corrupted environment degrades the reception quality and consequently the quality reconstructed for the user. The second constraint imposed by the wireless channel is the limited bandwidth, which is sometimes not enough to transfer multimedia content. Multiple antenna techniques, also called MIMO, exploit the time, frequency, and spatial diversity inherent to the wireless channel to increase the transmission rate. The gain induced by the MIMO technology motivated its integration in recent wireless communication standards such as 802.11n WiFi, 4G LTE, and 802.16e WiMAX. This section is devoted to the description of the basics of the MIMO technology.

MIMO channel modelling
Let us consider a MIMO system with $n_T$ transmitting antennas and $n_R$ receiving antennas. The link between a transmitting antenna $i$ and a receiving antenna $j$ is characterized by its complex gain, denoted $h_{ji}$. Every receiving antenna $j$ then collects the contribution of the signals transmitted by the $n_T$ antennas as:

$$y_j = \sum_{i=1}^{n_T} h_{ji}\, s_i + n_j$$

where $s_i$ is the symbol transmitted by the antenna $i$, and $n_j$ is the noise component. Finally, the general system can be formulated by a matrix operation as:

$$y = Hs + n$$

where the vectors $y$, $s$, and $n$ represent, respectively, the received signals, the transmitted signals, and the noise components. The matrix $H$ is called the channel matrix because it describes the gain of all the links between the transmitting and receiving antennas. Estimating the channel matrix $H$ delivers the channel state information (CSI) that can be exploited by the encoder or the decoder, as detailed in the next paragraph.
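The channel model can be checked numerically with a short NumPy sketch. Everything specific in it is an assumption made for illustration: the function name `simulate_mimo_channel`, the 4 × 4 antenna configuration, BPSK symbols, i.i.d. complex Gaussian channel gains, and the way the SNR sets the noise variance. In the code, `H[j, i]` holds the gain between transmitting antenna `i` and receiving antenna `j`.

```python
import numpy as np

def simulate_mimo_channel(n_t, n_r, snr_db, rng):
    """Draw one realization of y = H s + n for a flat-fading MIMO link."""
    # H[j, i]: complex gain between transmitting antenna i and receiving antenna j.
    H = (rng.standard_normal((n_r, n_t))
         + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
    # One unit-energy BPSK symbol per transmitting antenna.
    s = rng.choice([-1.0, 1.0], size=n_t).astype(complex)
    # Complex Gaussian noise whose variance is set by the per-symbol SNR.
    noise_var = 10 ** (-snr_db / 10)
    n = np.sqrt(noise_var / 2) * (rng.standard_normal(n_r)
                                  + 1j * rng.standard_normal(n_r))
    y = H @ s + n
    return H, s, n, y

rng = np.random.default_rng(0)
H, s, n, y = simulate_mimo_channel(n_t=4, n_r=4, snr_db=10.0, rng=rng)

# Receiving antenna 0 sees every transmitted symbol, weighted by row 0 of H.
y_0 = sum(H[0, i] * s[i] for i in range(4)) + n[0]
print(np.isclose(y_0, y[0]))
```

The per-antenna sum and the matrix product give the same received sample, which is the equivalence between the two formulations of the model.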

Open-loop and closed-loop MIMO systems
The MIMO system is called open-loop when the CSI is only exploited at the decoder to improve the demodulation and decoding performance. We distinguish two main categories in this context. The first is spatial multiplexing, where the information is multiplexed on the different antennas, which helps improve the system capacity. The second aims at improving the system resiliency by exploiting the space and time diversity. This can be achieved if the antennas transmit different versions of the same information at different times. We note that these two strategies imply a quality-capacity trade-off. Hybrid optimized versions were therefore proposed to reach the best compromise [6].
Closed-loop MIMO (CL-MIMO) systems take advantage of a feedback channel that makes it possible to use the CSI at the transmitter. In fact, the channel information enables precoding that jointly optimises the transmitter and the receiver operations. Hence, the system can reach the best resilience and capacity improvements, since multiplexing and diversity are now optimized at the transmitter. Moreover, the CL-MIMO precoder virtually subdivides the multiple antenna channel into independent parallel single antenna channels.
Many precoding techniques can be found in the literature [6]. However, the most widely used ones are linear precoders that optimize the transmitting power according to a quality criterion (maximum capacity, minimum error rate, maximum signal-to-noise ratio, etc.). If the precoding matrix is diagonal, we call it a diagonal precoder. The optimization process [7] generates an equivalent diagonal matrix for H, which means an equivalent virtual channel with multiple independent single antenna channels. Moreover, based on precoding, we can assign different powers to these virtual subchannels. This strategy is called unequal power allocation (UPA) and cannot be applied in the case of open-loop MIMO (OL-MIMO), since the transmitter does not have access to the CSI. These advantages motivated the use of CL-MIMO technology with linear precoding in this chapter.
We note that MIMO systems require a multipath propagation environment for spatial diversity. However, multipath propagation introduces inter-symbol interference (ISI), which justifies the use of orthogonal frequency-division multiplexing (OFDM) modulation. The latter divides the wide band into many narrowband subchannels, which reduces the ISI. All the MIMO systems described in the following rely on OFDM modulation.
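One common way to obtain such an equivalent diagonal channel is the singular value decomposition (SVD) of H; whether the process in [7] uses exactly this construction is an assumption here. The NumPy sketch below, with the hypothetical helper `svd_precode`, precodes with the right singular vectors, decodes with the left ones, and applies a per-subchannel power vector. Noise is left out so that the diagonal structure is easy to see.

```python
import numpy as np

def svd_precode(H, s, powers):
    """Send symbols s through H with SVD-based precoding and decoding (noise-free).

    powers[k] is the transmit power given to virtual subchannel k.
    """
    U, singular_values, Vh = np.linalg.svd(H, full_matrices=False)
    x = Vh.conj().T @ (np.sqrt(powers) * s)   # precoding at the transmitter
    y = H @ x                                 # propagation through the channel
    z = U.conj().T @ y                        # decoding at the receiver
    return z, singular_values

rng = np.random.default_rng(0)
H = (rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))) / np.sqrt(2)
s = np.array([1.0, -1.0, 1.0, 1.0])
powers = np.array([0.4, 0.3, 0.2, 0.1])       # unequal allocation, total power 1

z, gains = svd_precode(H, s, powers)
# Each output depends on a single input symbol: z[k] = gains[k] * sqrt(powers[k]) * s[k].
print(np.allclose(z, gains * np.sqrt(powers) * s))
```

Since `np.linalg.svd` returns the singular values in decreasing order, subchannel 0 is the strongest one, which is where a content-aware scheme would naturally place the most important data.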

MIMO receivers
In MIMO systems, each receiving antenna collects different interfering signals coming from the transmitting antennas. To reconstruct the source symbols, we need to separate them. Many decoding techniques can be applied, such as zero-forcing, successive interference cancellation, minimum mean-square error estimation, and maximum likelihood (ML). The latter delivers the best performance with the minimum possible error rates. In the case of Gaussian distributed noise, ML delivers the estimate $\hat{s}$ as:

$$\hat{s} = \arg\min_{s} \| y - Hs \|^2$$

The problem with ML decoding is its prohibitive complexity, which grows exponentially with the number of transmitting antennas. However, in this work, we consider it with linear MIMO precoding generating independent single antenna channels, which significantly reduces its complexity.
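The exhaustive search behind ML detection can be sketched in a few lines of Python. The function name `ml_detect`, the BPSK alphabet, and the 2 × 2 noise-free test case are assumptions chosen for illustration. The loop visits every candidate vector, so the number of distance evaluations is the constellation size raised to the number of transmitting antennas.

```python
import itertools
import numpy as np

def ml_detect(y, H, constellation):
    """Brute-force ML detection: the candidate s minimizing ||y - H s||^2."""
    n_t = H.shape[1]
    best, best_metric = None, np.inf
    # Exhaustive search over len(constellation) ** n_t candidate vectors.
    for candidate in itertools.product(constellation, repeat=n_t):
        s = np.array(candidate)
        metric = np.linalg.norm(y - H @ s) ** 2
        if metric < best_metric:
            best, best_metric = s, metric
    return best

rng = np.random.default_rng(1)
H = (rng.standard_normal((2, 2)) + 1j * rng.standard_normal((2, 2))) / np.sqrt(2)
s_true = np.array([1.0, -1.0])
y = H @ s_true                      # noise-free observation
print(ml_detect(y, H, [-1.0, 1.0]))
```

With a diagonalized channel, each subchannel can be searched on its own, so the cost becomes the sum of the per-subchannel searches instead of their product.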

Context and outline of the chapter
In the previous subsections, we described the developments made by two different research communities in the fields of image/video compression and MIMO wireless communications. We also justified the standards and technologies chosen in this work and emphasized the need for a joint optimization of the compression and transmission operations. Indeed, compressed multimedia data are very sensitive to transmission errors, and the source encoder cannot take into account the fluctuating behaviour of the wireless channel. On the other hand, MIMO transmission strategies do not take the content into account, i.e. important and less important streams are transmitted in the same way. In recent years, APP layer image/video compression and PHY layer wireless communications began to converge to guarantee dynamic access to multimedia services over corrupted channels. However, the convergence of the multimedia world and mobile communications raised new questions. How can we satisfy users with heterogeneous scenarios who use wireless channels varying in space and time? How can we improve the end-user visual quality without affecting the wireless system rate? The aim of this chapter is to address these questions by proposing APP-PHY cross-layer algorithms based, respectively, on link adaptation and soft decoding. The details and the contributions of the chapter are provided in the next section after the description of the state-of-the-art.
The chapter is organized as follows. Section 2 provides a state-of-the-art study of the main contributions dealing with the joint design of image/video compression and transmission. Section 3 provides the main contributions of the chapter. After presenting the channel models used, we give the simulation results for optimized image and video transmission over MIMO channels. Then, a soft-input decoding method is presented and investigated for JPWL image transmission. Finally, Section 4 concludes the chapter and gives open directions for future work.

APP-PHY cross-layer design: state-of-the-art
In the previous section, we demonstrated the need for a joint optimization of the image/video compression and wireless communication operations. This requirement motivated researchers to develop new APP-PHY cross-layer algorithms. Some of them optimized the error-correcting coding process; others focused on the modulation or on the MIMO precoding.
In this chapter, we aim to optimize all these operations for a better reconstruction quality. To better illustrate the framework of our contribution, we describe the main state-of-the-art algorithms in the context of APP-PHY cross-layer design.

Joint source-channel coding (JSCC)
Motivated by the well-known Shannon separation theorem [8], communication system designers have conceived source and channel coding separately. However, in most practical applications, it is impossible to fulfil the theorem requirements, such as unconstrained block lengths and unconstrained coding and decoding delays. Joint source/channel (JSC) coding and decoding techniques have emerged as a pragmatic approach. The first solutions in this context tried to integrate some resilience modes into the source coding, which results in a loss of compression efficiency [9]. Other solutions tried to exploit the residual redundancy remaining after source coding to improve the decoding performance. Since entropy encoding is the last block in every image/video compression scheme, many works focused essentially on entropy encoding operations such as variable-length coding (VLC) and arithmetic coding (AC).
Motivated by the efficiency of error-correcting codes, many researchers focused on developing new entropy decoding algorithms that exploit the code properties to enhance the system decoding performance, as in [10,11] for VLC and in [12] for AC. A change was then marked by the development of soft-input soft-output (SISO) channel decoders used in the very efficient turbo codes. The JSC research community focused on developing SISO decoders for entropy codes. The first solutions, inspired by convolutional codes, modelled the entropy encoder by a finite-state machine or a trellis in order to apply conventional SISO channel decoding algorithms. The case of VLC was treated in the studies of Wang et al. [13] and Park and Miller [14], and decoding methods for AC were considered in the studies of Grangetto et al. [15] and Bi et al. [16]. Later contributions took advantage of the existing SISO decoders and applied iterative JSC decoding [17-19]. In Refs. [17] and [18], the authors considered specific trellis constructions, but their complexity becomes intractable for long source sequences. More recently, a new SISO entropy decoding method was proposed for VLC [19] and AC [20]. The proposed algorithm was inspired by the Chase II decoding first used in turbo block codes and showed a good complexity-performance trade-off compared to trellis-based methods. In the present chapter, a soft-input decoding method is needed to enhance the reconstruction quality of a JPWL image codec. Hence, we consider the Chase decoder [19,20] for soft-input arithmetic decoding. More details are provided in Section 3.4.

Unequal error protection (UEP)
In many studies, authors focused on optimizing the source and channel coding operations. This can still be considered a JSC coding method, but one where the source and channel encoding and decoding algorithms remain unchanged. In fact, only their parameters are optimized to meet a target constraint. We showed in Section 1.1 that the scalable image/video compressed bitstream contains information with different levels of importance. Hence, applying equal error protection implies the same correction performance for important and less important information, which is not appropriate. It is more suitable to apply unequal error protection, where important parts are more protected than less important ones.
Many solutions focused on applying UEP to JPEG2000 compressed streams since this standard provides scalability. The first quality layers then have to be more protected than the last ones. In Refs. [21] and [22], the authors proposed different strategies for protecting the JPEG2000 stream headers and for applying UEP using RS codes to the different JPEG2000 quality layers. Substantial improvements in terms of image peak signal-to-noise ratio (PSNR) were demonstrated, which motivated their integration in the JPWL standard. The application of UEP to video transmission was investigated in [23], where different rate-compatible punctured convolutional (RCPC) codes were used to protect the MPEG-2 compressed video packets. We note that these schemes assign different codes to the streams without guaranteeing rate-distortion optimality. To reach such a property, an optimization process has to be included. In this context, a rate allocation process was introduced in [24] to minimize the distortion of a reconstructed JPEG2000 image. In Ref. [25], the authors derived an optimal wireless JPEG2000 compliant error correction rate allocation scheme for robust streaming of images and videos. In Ref. [26], the authors used the scalability property of the H264/SVC encoder to apply a suitable UEP strategy. Although efficient, the proposed strategies considered a static channel with fixed parameters; there is thus no guarantee of obtaining the same results in a varying MIMO wireless environment. Some works [27,28] focused on optimizing the JPEG2000 compression and protection processes for open-loop MIMO systems. Extending these UEP results to closed-loop MIMO systems can be very advantageous. Moreover, it provides another degree of freedom that can be optimized to guarantee an efficient rate-distortion trade-off. This is the topic of the next subsection, where unequal power allocation methods are described.

Unequal power allocation (UPA)
We can also improve the transmission quality of the more important packets in the compressed stream by improving their signal-to-noise ratio (SNR). For a given channel, this can be achieved by allocating more transmission power to them. Under a maximum power constraint, low transmission power is then assigned to the less important packets. This unequal power allocation (UPA) strategy can also significantly improve the image/video reconstruction quality.
In this framework, many researchers [29-32] proposed to dynamically allocate the power to the different packets according to their contribution to the quality improvement. While the contributions proposed in Refs. [29,30] focused on JPEG2000 image content, the authors in Ref. [32] designed a UPA strategy for scalable video content. These contributions were proposed for single antenna systems, and their extension to high capacity MIMO systems can be beneficial. Hence, in Refs. [33] and [34], the authors focused on optimizing OL-MIMO systems exploiting space-time diversity to improve the image reconstruction quality. A novel precoding scheme capable of integrating both channel and source characteristics in order to achieve the desired prioritized spatial multiplexing was proposed in [35] for H264/SVC compressed video transmission.
We recall that in CL-MIMO systems, we can construct, based on the channel matrix, equivalent independent single antenna subchannels with different propagation properties. The image/video transmission can then be optimized more efficiently to achieve the best quality at the receiver. In this chapter, we consider that the encoder knows the CSI, and we present a quality-constrained precoding method.

Hierarchical and adaptive modulations
While some works focused on optimizing error correction in UEP or transmit power in UPA, many other proposals addressed the modulation process. The principle is the same: efficiently assign modulation methods to the different compressed information according to their contribution to the quality improvement.
In hierarchical modulation [36], each constellation point carries bits from both a base layer stream and an enhancement layer stream. The bit-to-symbol mapping is designed so that the distance between points with different base layer bits is larger than the distance between points that differ only in their enhancement layer bits. Hence, in bad channel conditions, the base layer stream is more resilient and can be decoded with few errors, so the base quality is guaranteed. If the channel conditions are good, we are also able to decode the enhancement layer stream and obtain a better quality.
Another simple strategy for assigning different resiliency levels to the compressed data is adaptive modulation. Using a high-order modulation results in a better spectral efficiency but higher error rates. Hence, if low error rates are needed to transmit very important information over a noisy channel, we can simply use a low-order modulation. While UEP and UPA strategies tend to guarantee a required user quality for a low channel SNR, adaptive modulation can help to improve the spectral efficiency for a channel with high SNR. This motivated the development of hybrid strategies where the described schemes are jointly optimized for a better system efficiency.
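The distance property can be illustrated with a hierarchical 16-QAM constellation, sketched below in Python. The mapping is an assumption made for illustration and is not taken from [36]: two base layer bits select the quadrant (offset `d1` from the origin) and two enhancement layer bits select the point inside the quadrant (offset `d2` around the quadrant centre), with `d1 > d2`.

```python
import itertools

def hierarchical_16qam(base_bits, enh_bits, d1=2.0, d2=0.5):
    """Map 2 base layer bits and 2 enhancement layer bits to one constellation point."""
    def sign(bit):
        return 1.0 - 2.0 * bit            # bit 0 -> +1, bit 1 -> -1
    real = sign(base_bits[0]) * d1 + sign(enh_bits[0]) * d2
    imag = sign(base_bits[1]) * d1 + sign(enh_bits[1]) * d2
    return complex(real, imag)

def min_distances(d1=2.0, d2=0.5):
    """Smallest distance protecting the base bits and the enhancement bits."""
    pairs = list(itertools.product([0, 1], repeat=2))
    points = {(b, e): hierarchical_16qam(b, e, d1, d2) for b in pairs for e in pairs}
    # Closest pair of points whose base layer bits differ.
    base_dist = min(abs(p - q)
                    for (b1, _), p in points.items()
                    for (b2, _), q in points.items() if b1 != b2)
    # Closest pair of points with the same base bits but different enhancement bits.
    enh_dist = min(abs(p - q)
                   for (b1, e1), p in points.items()
                   for (b2, e2), q in points.items() if b1 == b2 and e1 != e2)
    return base_dist, enh_dist

print(min_distances())  # (3.0, 1.0)
```

With these example offsets, the base layer bits are separated by a distance three times larger than the enhancement layer bits, which is why they survive at lower SNR.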

Contributions: hybrid optimized strategies
When the target is to deliver image or video content over a wireless link, the system designer should take into account that the channel is varying. Hence, we should move from a static design to an adaptive one. To this end, many link adaptation algorithms based on a hybrid optimization of the previously described methods were proposed. UEP was combined with hierarchical modulation in [37] in the case of equal power OFDM-MIMO wireless transmission. In Refs. [38] and [39], the authors proposed UPA strategies combined with adaptive modulation for non-compressed and compressed image transmission, respectively, but the error correction scheme remains static. In Ref. [40], a link adaptation strategy is proposed where coding, modulation, and power are all optimized according to the CSI; however, the strategy does not take into account the content structure. The link adaptation strategy proposed by Houas et al. [41] considers an OFDM single antenna system where the modulation and error protection tasks are optimized based on the channel subcarrier status under a constrained JPEG2000 image quality.
In this chapter, we introduce a new paradigm where we develop a core system optimizing all the operations based on UEP, UPA, and adaptive modulation to achieve the best reconstructed image/video quality given the channel status. Moreover, we provide soft-input source and channel decoding to reduce the error rates at the receiver and consequently improve the user experience.
The chapter contributions are threefold. First, we show how the optimization of the system parameters (image/video compression and wireless transmission) can be very advantageous to guarantee a good quality-of-service (QoS) even if the channel is varying. We also demonstrate that this is possible by using scalable techniques such as UEP, UPA, adaptive modulation, and scalable compression. Second, we demonstrate the efficiency of the proposed optimization procedure for JPWL compressed image data and for H264/SVC compressed video content. Third, we show that using soft decoding methods for the demodulator, the channel decoder, and the JPWL image decoder can provide significant quality gains while keeping the same throughput. The results are based on simulations and are assessed in terms of objective and subjective visual quality. This chapter summarizes many results [42,43] developed in the XLIM-RESYST team of the University of Poitiers, France, some of which [44] were obtained in cooperation with the SYSCOM laboratory in Tunisia.

Channel model
In this chapter, we consider two different channel models to run simulations: a statistical channel and a realistic channel. Broadly, the statistical channel model is used to emphasize the performance as a function of a given SNR, whereas the realistic channel model gives more details about the propagation environment of a given scenario with fluctuating transmission conditions.
The wireless communication channel induces random disturbances due to additive thermal noise and to large- and small-scale fading caused, respectively, by obstacles and multipath propagation. These random phenomena can be described by statistical models. In the following, we consider a Rayleigh fading channel where the elements of the channel matrix follow a Rayleigh distribution. The noise is modelled by the well-known additive white Gaussian noise (AWGN). In the case of a single antenna system, if we transmit a binary phase-shift keying modulated sequence $x = (x_1, \dots, x_n)$ over a Rayleigh channel, we receive a sequence $y = (y_1, \dots, y_n)$ whose elements are $y_i = a_i x_i + n_i$, where the $a_i$ are random Rayleigh distributed coefficients and the $n_i$ are random AWGN samples.
To have a more realistic approximation of the propagation environment, we use a three-dimensional (3D) ray-tracing simulator [45] to provide the impulse responses of a realistic channel. The transmission environments used in this chapter take into account the user mobility and the existence of obstacles, and are presented later. The channel then alternates between bad, medium, and good states. In the case of realistic channel simulations, the CSI is obtained from an estimate of the channel with a training sequence.
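The single antenna statistical model can be simulated with a few lines of NumPy. The sketch below is only illustrative: the function name `simulate_bpsk_rayleigh` is hypothetical, the SNR is interpreted as the average energy per bit over the noise spectral density, the fading coefficients are normalized to unit mean power, and the receiver is assumed to detect coherently from the sign of the received sample.

```python
import numpy as np

def simulate_bpsk_rayleigh(num_bits, snr_db, rng):
    """Estimate the BER of BPSK over a flat Rayleigh fading channel with AWGN."""
    bits = rng.integers(0, 2, size=num_bits)
    x = 1.0 - 2.0 * bits                                   # bit 0 -> +1, bit 1 -> -1
    # Rayleigh coefficients with unit mean power: E[a^2] = 2 * scale^2 = 1.
    a = rng.rayleigh(scale=np.sqrt(0.5), size=num_bits)
    # Real AWGN samples with variance N0/2 = 1 / (2 * SNR).
    noise_std = np.sqrt(0.5 * 10 ** (-snr_db / 10))
    n = noise_std * rng.standard_normal(num_bits)
    y = a * x + n                                          # received samples
    detected = (y < 0).astype(int)                         # sign detector
    return np.mean(detected != bits)

rng = np.random.default_rng(0)
for snr_db in (0.0, 10.0, 20.0):
    print(snr_db, simulate_bpsk_rayleigh(100_000, snr_db, rng))
```

Under these assumptions, the estimate can be compared with the classical closed-form average BER of coherent BPSK over Rayleigh fading, $\frac{1}{2}\left(1 - \sqrt{\bar{\gamma}/(1+\bar{\gamma})}\right)$, where $\bar{\gamma}$ is the average SNR in linear scale.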

Compressed image transmission application
In this section, we focus on image transmission over CL-MIMO wireless systems. As previously specified, we consider the error-resilient scalable JPWL image compression standard.
Two contributions are presented. The first considers equal error protection and a fixed modulation scheme with a new CL-MIMO context-based precoder (CBP) optimized to reach the best reconstruction quality. In the second, the precoder is introduced into a link adaptation scheme where UEP, adaptive modulation, UPA, and source coding are optimized given the channel status to reach a better quality.

Context-based precoding for JPWL image transmission
In this part, we suppose that the channel coding and modulation are static, while the power allocation is optimized. The system model treated in this part is depicted in Figure 1 and aims at transmitting the compressed data using a precoded MIMO system. After diagonalization, the equivalent channel matrix, given by the CSI, has $b$ virtual independent single antenna channels with different SNRs. The scalable JPWL encoder is then asked to generate $b$ quality layers to be transmitted across the $b$ virtual channels. Naturally, the first quality layer, which is the most important, is assigned to the highest SNR subchannel. Then, following their order of importance, every quality layer is assigned to a specific subchannel. The JPWL standard includes RS($N$, $K$) error-correcting codes that assign an encoded $N$-symbol codeword to every $K$-symbol input message, resulting in a correction capacity of $t = \lfloor (N - K)/2 \rfloor$ symbols. The encoded bit sequence is then modulated based on $M$-QAM, which assigns to every $n$ bits one of the $M = 2^n$ modulation symbols. Finally, the CBP power allocation operation has to compute the power assigned to every quality layer $i$ to reach a target source bit error rate, given the RS code and modulation parameters and the channel CSI.
With scalable image coding, a quality layer enhances the reconstruction quality only if it and all previous quality layers are correctly received; otherwise, it is useless. The proposed power allocation strategy therefore works hierarchically: every quality layer i, transmitted on subchannel i, should receive enough power to satisfy the target bit error rate (BER) constraint. However, the total power must not exceed a maximum $P_0$, and the error rates depend on the modulation and channel coding parameters. For every subchannel $i \in \{1, \ldots, b\}$, the precoding coefficient is evaluated in four steps. First, the target source BER, $BER_t$, is defined after RS error correction. Second, we compute the corresponding BER before correction, denoted B, which depends on the correction capacity t of the RS code. Third, given the modulation order M and the noise power $\sigma_i^2$ of subchannel i, we determine the power precoding coefficient $f_i^2$ needed to achieve the BER B. Finally, we check whether the remaining power $P$, initially equal to $P_0$, can satisfy the precoding result. If $P \ge f_i^2$, there is enough power to transmit quality layer i with the requested BER target, and the remaining power is updated to $P \leftarrow P - f_i^2$. The same process is then iterated for the next layers. However, if this condition does not hold, i.e., $P < f_i^2$, all the remaining power is assigned to the current quality layer, $f_i^2 = P$, which is then transmitted with no guarantee of reaching the target BER.
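The hierarchical allocation described above can be sketched in a few lines of Python. This is a minimal illustration, not the chapter's implementation: the function names are hypothetical, the pre-correction BER B is taken as an input, and the required power is derived from a standard BER approximation for square M-QAM, which is an assumption since the chapter's exact expression is not reproduced here.

```python
import math
from statistics import NormalDist


def required_power_mqam(ber, m, noise_power):
    """Power needed on one subchannel to reach an uncoded BER with square M-QAM.

    Assumption: BER ~ (4 / log2 M) * (1 - 1/sqrt(M)) * Q(sqrt(3 * snr / (M - 1))),
    with snr = power / noise_power.
    """
    scale = 4.0 * (1.0 - 1.0 / math.sqrt(m)) / math.log2(m)
    q_inv = -NormalDist().inv_cdf(ber / scale)  # inverse of the Gaussian Q-function
    return noise_power * (m - 1) / 3.0 * q_inv ** 2


def cbp_allocate(required, p0):
    """Serve layers in order of importance until the power budget runs out."""
    remaining = p0
    allocation = []
    for need in required:
        if remaining >= need:
            # Enough power left: the layer meets its BER target.
            allocation.append(need)
            remaining -= need
        else:
            # Not enough power: give the layer whatever is left, with no guarantee.
            allocation.append(remaining)
            remaining = 0.0
    return allocation


# Hypothetical power needs for four quality layers and a unit power budget.
needs = [0.25, 0.25, 0.75, 0.5]
print(cbp_allocate(needs, 1.0))  # [0.25, 0.25, 0.5, 0.0]
```

In this example, the third layer receives only the leftover power and the fourth receives none, which mirrors the behaviour reported for poor channel states, where most of the power goes to the first quality layers.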
To investigate the efficiency of the CBP precoder, we run simulations where an image is transmitted for every receiver position along the trajectory given in Figure 2, where the red blocks are buildings. The receiver is assumed to move along a path of 138 m at a speed of 5 m/s. The channel gain shows four different areas with different channel states. We consider a 4 × 4 MIMO channel where the diagonalization results in a maximum of $b = 4$ subchannels. The JPWL source encoder compresses the test image "Monarch" and delivers four quality layers with 0.25 bits per pixel (bpp) each. We use an equal error protection method with a fixed RS(37,32) code and a fixed M-QAM modulation with $M = 4$.

Figure 3 shows the PSNR results as a function of the receiver position for different precoding strategies. For better readability, we present the mean value over a sliding window of 20 samples for all the schemes. The presented precoding methods share the same principle but use different optimization criteria. Water filling (WF) aims at maximizing the channel capacity, minimum mean square error (MMSE) minimizes the mean square error, and MBER minimizes the BER. E-$d_{\min}$ is a non-diagonal precoder that maximizes the minimum distance between the constellation points. All these precoders are compared to the proposed CBP precoder, which aims at maximizing the image quality by exploiting the hierarchical structure of the JPWL compressed stream. The PSNR results show that the precoders perform differently depending on the channel status. In the first area, where the channel is very corrupted, all the conventional precoders spread the power across the different layers, which induces a quality loss. CBP is the best in this area, since it allocates almost all the power to transmitting the first quality layer correctly, which results in a mean PSNR of almost 28 dB. For medium channel states (areas 2 and 4), CBP remains more efficient than the other precoders overall, although E-$d_{\min}$ is better at certain positions. The gain of E-$d_{\min}$ over CBP is clearer for the lightly corrupted channel (area 3). This is explained by the non-diagonal structure of this precoder, which allows transmitting the four quality layers, whereas CBP assigns almost all the power to the first three quality layers. In the following, we consider link adaptation strategies that substantially improve the CBP performance in this area.
To better show the gains induced by the CBP precoder for highly to moderately corrupted channel states, Figure 4 presents the reconstructed images and the corresponding PSNRs at position index 2259, so that the visual quality can be assessed. The results confirm the efficiency of the CBP precoder compared to the conventional MIMO precoders for JPWL image transmission. However, the CBP performance can still be improved in area 3, where the channel is fair. To this aim, we propose in the next part a link adaptation process based on optimized UEP and adaptive modulation.

Optimized CBP and link adaptation for JPWL image transmission
In the previous part, we presented a system where the source rate, modulation, and error correction are static and independent of the channel state, which is not efficient when the channel varies. In this part, we use link adaptation techniques as presented in Figure 5. The main difference compared to Figure 1 lies in the new joint optimization block, which uses the CSI as input to deliver the number of subchannels to use (at most b) and, for each i-th quality layer, the best configuration for the source coding rate $R_{S_i}$, the correction capacity of the RS code (set through the codeword length $N_i$), and the corresponding modulation order $M_i$. The optimization process [42] performs an independent tree-based exhaustive search for every subchannel. The objective is to minimize distortion under three main constraints: the rate constraint, the target BER to guarantee, and the maximum power not to exceed.

We now run simulations to investigate the gains induced by the link adaptation technique for a 4 × 4 MIMO system over the same realistic channel. We also assume the same channel estimation process. The reference static configuration for the 768 × 512 pixel test image "Monarch" considers a JPWL encoder generating four quality layers, each with a constant rate of 0.125 bpp. The error-correcting code is RS(37,32), and the modulation is 4-QAM. The optimized link adaptation system also considers a 4 × 4 MIMO channel and can dynamically choose the number of subchannels to use, from 1 to 4, the modulation order for each subchannel $M_i \in \{4, 16, 64\}$, the corresponding RS codeword length $N_i \in \{37, 38, 40, 43, 45\}$, and the source coding rate $R_{S_i}$. The power limit is set to $P_0 = 1$, the target BER is $BER_t = 10^{-9}$, and the rate is constrained by a maximum of 512 OFDM symbols per subchannel. We recall that the CBP precoding process of the previous part is always applied, and that it is activated after all the system configurations are fixed.
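To make the per-subchannel search concrete, the following Python sketch enumerates every pair of modulation order and RS codeword length from the sets quoted above and keeps the feasible pair that carries the most source bits. It is only a sketch under several assumptions that are not taken from the chapter: the source payload is used as a proxy for distortion, each subchannel has a fixed power budget, RS symbols are bytes decoded by a bounded-distance decoder, and the M-QAM BER follows the standard square-constellation approximation. All function names are hypothetical.

```python
import math
from itertools import product
from statistics import NormalDist

K = 32              # RS message length in symbols, as in RS(37,32)
SYMBOL_BITS = 8     # assumption: one RS symbol is one byte
OFDM_SYMBOLS = 512  # rate constraint per subchannel, from the text


def rs_output_error(p_sym, n):
    """Approximate symbol error rate after bounded-distance RS(n, K) decoding."""
    t = (n - K) // 2
    return sum(j / n * math.comb(n, j) * p_sym ** j * (1 - p_sym) ** (n - j)
               for j in range(t + 1, n + 1))


def channel_ber_for_target(target, n):
    """Largest pre-correction BER keeping the post-correction error below target."""
    lo, hi = -12.0, -0.5  # bisection on log10(BER)
    for _ in range(60):
        mid = (lo + hi) / 2.0
        p_sym = 1.0 - (1.0 - 10.0 ** mid) ** SYMBOL_BITS
        if rs_output_error(p_sym, n) <= target:
            lo = mid
        else:
            hi = mid
    return 10.0 ** lo


def required_power(ber, m, noise_power):
    """Power for uncoded square M-QAM at a given BER (standard approximation)."""
    scale = 4.0 * (1.0 - 1.0 / math.sqrt(m)) / math.log2(m)
    q_inv = -NormalDist().inv_cdf(ber / scale)  # inverse Gaussian Q-function
    return noise_power * (m - 1) / 3.0 * q_inv ** 2


def best_config(noise_power, budget, target=1e-9,
                orders=(4, 16, 64), lengths=(37, 38, 40, 43, 45)):
    """Exhaustive search: largest source payload whose power need fits the budget."""
    best = None
    for m, n in product(orders, lengths):
        power = required_power(channel_ber_for_target(target, n), m, noise_power)
        payload = OFDM_SYMBOLS * math.log2(m) * K / n  # source bits carried
        if power <= budget and (best is None or payload > best[2]):
            best = (m, n, payload, power)
    return best  # None means the subchannel cannot meet the target at all


# A clean subchannel supports 64-QAM with the lightest code; a noisy one does not.
print(best_config(noise_power=1e-4, budget=1.0)[:2])
print(best_config(noise_power=0.05, budget=1.0)[:2])
```

The sketch shows the trade-off that the optimization exploits: a longer codeword tolerates a higher pre-correction BER and thus needs less power, but it leaves fewer symbols for source data, while a higher modulation order does the opposite.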
Figure 6 provides the PSNR results for the static and the optimized CBP strategies as a function of the receiver position index. Link adaptation induces remarkable improvements in terms of reconstructed image quality. A mean PSNR gain of almost 1 dB is achieved for the intermediate channel states in areas 2 and 4. The improvements are even more significant when the channel is in good condition, where the mean PSNR gain can reach 5 dB. This is explained by the fact that link adaptation allows the use of high-order modulations with low redundancy, and consequently higher source coding rates and better quality. We thus demonstrated that an APP-PHY cross-layer design, which optimizes all the system blocks according to the channel status, can remarkably enhance the reconstructed image quality. We now investigate whether this efficiency holds for scalable video transmission.

Compressed video transmission application
In this section, we extend the link adaptation with the CBP MIMO precoding algorithm to scalable H.264/SVC video transmission. This standard provides three types of scalability. The first is temporal scalability, where bi-directional (B) frames are adaptively appended to a base-layer group of pictures (GOP), which yields variable frame rates. The second is spatial scalability, which aims at satisfying users with different display capabilities by generating several resolutions for every frame. The last is quality scalability, which transports complementary data in different layers to produce videos with distinct quality levels. This scalability is mainly based on using distinct quantization parameters for each layer. H.264/SVC supports three quality scalability modes: fine, medium, and coarse grain scalability (GS). While coarse GS (CGS) performs a prediction process for each quality layer, medium GS (MGS) increases efficiency by using a flexible prediction unit in which both base and enhancement layers can be referenced. In all cases, the compressed stream has a hierarchical structure, so we can apply the CBP with link adaptation as described previously.
The studied system model is the same as in Figure 5, with an H.264/SVC source encoder. Simulations are run on the "Foreman" test video at 176 × 144 resolution. The source encoder generates a base quality layer with three CGS quality enhancement layers. For finer scalability, we also consider sub-quality layers computed by the MGS process. The compressed bit streams are protected using a rate-compatible punctured convolutional (RCPC) channel code and then modulated with M-QAM, where M is adaptively optimized. Finally, CBP precoding is applied for the UPA over the 4 × 4 MIMO channel. For realism, we consider the channel depicted in Figure 7, shown with the corresponding channel gain. The receiver is assumed to move along a path of 20 m at a speed of 5 m/s. There are two channel states: the first part of the path has a poor non-line-of-sight (NLOS) channel, while the end of the path has a relatively reliable channel with line-of-sight (LOS) propagation.

For each subchannel, the link adaptation selects the RCPC code rate, the modulation order $M \in \{4, 16, 64\}$, and the source coding rate, and applies CBP precoding with a target $BER_t = 10^{-9}$ to guarantee reliability for the more important quality layers. This process aims at maximizing the user video quality under three constraints: a maximum power $P_0 = 1$, an equivalent overall transmission rate, and a minimum required QoS. Figure 8 shows that applying CBP with link adaptation (red curve) is always better than using static CBP, whatever the modulation order and coding rate. All the presented results have the same maximum rate. Using static CBP with $M = 16$ and an RCPC code rate of 1/4 results in very high error rates, especially in the NLOS area, which remarkably degrades the video quality. The system using the lowest-order modulation with an RCPC code rate of 1/2 is more robust and corrects errors better, which explains its efficiency in a very noisy channel state. However, when the channel is fair, this configuration loses its efficiency because of its low spectral efficiency and high redundancy level. CBP with link adaptation delivers a minimum PSNR of 34 dB, which is very acceptable, and can reach 37 dB for a good channel state. The results confirm that joint optimization of the system parameters is also advantageous for scalable video transmission. Finally, sample frames are provided in Figure 9 to show these video quality improvements.
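The statement that the static schemes share the same maximum rate can be checked with a short calculation. The symbols $\eta$ (information bits per modulation symbol) and $R$ (RCPC code rate) are introduced here for illustration only, and puncturing or termination overhead is ignored.

```latex
\eta = R \log_2 M, \qquad
\eta_{16\text{-QAM},\,R=1/4} = \tfrac{1}{4}\cdot 4 = 1, \qquad
\eta_{4\text{-QAM},\,R=1/2} = \tfrac{1}{2}\cdot 2 = 1 .
```

Both static configurations carry one information bit per symbol, so they differ only in how that rate is split between constellation size and redundancy.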

Soft-decoding methods for image transmission
As specified in Section 2.1, joint source-channel decoding is also a good solution to improve the image/video reconstruction quality. While CBP and link adaptation focus on how to transmit the compressed image/video information efficiently, soft decoding algorithms can enhance the quality of the reconstructed data without introducing extra redundancy. The aim of this section is to improve the performance of JPWL image transmission over highly corrupted noisy channels.
We focus on a system where the image is compressed with a JPEG2000 encoder, then protected by an RS code, and finally modulated and transmitted over a single-antenna noisy channel. Unlike conventional hard decoding, no hard decision is made on the received samples. The soft sample values carry extra information about the reliability of each decision, which can be exploited to further improve the decoding performance. Algorithms that exploit such extra information are called soft decoding methods. To achieve a maximum gain, soft decoding methods are used for both RS decoding and JPWL arithmetic decoding.
Let us consider a statistical Rayleigh fading channel with binary phase-shift keying (BPSK) modulation. Based on the channel soft outputs, we can apply soft-input RS decoding using the well-known Chase decoder. Since we also need soft-input arithmetic decoding, the RS decoder should deliver an estimate of the probabilities of its decoded sequence elements. This can be done using a soft-input soft-output (SISO) RS decoder. The most widely used RS SISO decoder is the Chase II algorithm [46], where different test sequences are built by trying all the binary combinations over the least reliable bits. After decoding all the test sequences, the decoded sequence d is the one with the minimum Euclidean distance to the received sequence. Finally, soft outputs are computed bit by bit based on the difference between d and the valid competing sequence. Increasing the number of least reliable bits considered results in better decoding performance and more accurate soft outputs, but also in higher complexity.

We recall that the JPEG2000 encoder uses two main components after the wavelet transform and quantization. Tier 2 arranges the compressed stream, and Tier 1 performs lossless arithmetic compression using the MQ encoder. We therefore have to rearrange the reliabilities delivered by the RS SISO decoder to reconstruct the soft inputs of the MQ decoder. Soft-input decoding can then be applied based on a modified Chase decoder. The main difference between the RS and the MQ Chase decoding operations lies in the error detection mechanism. While the RS Chase decoder identifies valid codewords based on redundancy, the MQ decoder uses the variable-length encoding property to detect invalid sequences. Finally, the decoded sequences are used to reconstruct the original image.
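The Chase II procedure described above can be illustrated with a small, self-contained Python sketch. To keep it short, a toy Hamming(7,4) hard decoder stands in for the algebraic RS decoder; this substitution, the function names, the BPSK mapping (bit 0 to +1, bit 1 to -1), the scaling of the soft outputs, and the fallback reliability `beta` are all assumptions made for illustration, not the chapter's implementation.

```python
from itertools import product


def hamming74_decode(bits):
    """Toy hard decoder: corrects one bit error and returns a valid codeword."""
    syndrome = 0
    for pos, b in enumerate(bits, start=1):
        if b:
            syndrome ^= pos  # parity-check columns are the binary indices 1..7
    out = list(bits)
    if syndrome:
        out[syndrome - 1] ^= 1
    return out


def sq_dist(received, codeword):
    """Squared Euclidean distance between samples and a codeword's BPSK image."""
    return sum((r - (1 - 2 * c)) ** 2 for r, c in zip(received, codeword))


def chase2(received, hard_decoder, q=3, beta=1.0):
    """Chase II: test all flips of the q least reliable bits, keep the closest codeword."""
    hard = [1 if r < 0 else 0 for r in received]
    weakest = sorted(range(len(received)), key=lambda i: abs(received[i]))[:q]
    candidates = {}
    for flips in product((0, 1), repeat=q):
        trial = list(hard)
        for pos, f in zip(weakest, flips):
            trial[pos] ^= f
        cw = tuple(hard_decoder(trial))
        candidates[cw] = sq_dist(received, cw)
    best = min(candidates, key=candidates.get)
    soft = []
    for j, bit in enumerate(best):
        sign = 1.0 if bit == 0 else -1.0
        # Competing codewords are those that disagree with the decision on bit j.
        rivals = [d for cw, d in candidates.items() if cw[j] != bit]
        if rivals:
            soft.append(sign * (min(rivals) - candidates[best]) / 4.0)
        else:
            soft.append(sign * beta)  # no competitor found: fixed reliability
    return list(best), soft


# All-zero codeword sent; two weak samples have flipped sign.
rx = [0.9, -0.1, 1.1, -0.2, 0.8, 1.0, 0.7]
decoded, reliabilities = chase2(rx, hamming74_decode)
print(decoded)  # [0, 0, 0, 0, 0, 0, 0]
```

In this example, hard decoding alone would miscorrect, because two bit errors exceed the single-error capacity of the toy code, whereas the Chase search recovers the transmitted codeword by testing flips of the least reliable positions.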
To investigate the performance of the proposed decoder, we use the 512 × 512 pixel grayscale test image Lenna. The image is compressed with a JPEG2000 encoder, and the obtained packets are then protected using an RS(37,32) code, with the source rate chosen to achieve an overall rate of 1 bpp. When received with no errors, the image reconstruction leads to a maximum PSNR of 39.3 dB. In the case of corrupted channels, however, the results depend on the SNR.
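For reference, and assuming the usual PSNR definition for 8-bit grayscale images (the chapter does not restate it), the error-free figure of 39.3 dB corresponds to a mean square error of roughly 7.6:

```latex
\mathrm{PSNR} = 10 \log_{10}\!\left(\frac{255^2}{\mathrm{MSE}}\right)\ \text{dB},
\qquad
\mathrm{MSE} = \frac{255^2}{10^{39.3/10}} \approx 7.6 .
```

Under the same definition, every 3 dB of PSNR gain reported below corresponds to roughly halving the mean square error.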
Figure 10 presents the evolution of the mean reconstructed image PSNR as a function of the channel SNR for different decoders with an equivalent overall rate of 1 bpp. As expected, the PSNR increases with the SNR for all schemes, but the curves differ markedly. If no RS channel coding is used (green and red curves), the system performance is very low; even so, soft-input arithmetic decoding induces a PSNR improvement of almost 3 dB. Using the error-correcting RS code with hard decoding (blue curve) improves the system performance remarkably, and soft decoding algorithms bring further improvements. When applying soft-input RS decoding while keeping hard MQ decoding, we achieve a gain of almost 12 dB at an SNR of 10 dB. Furthermore, using soft decoding algorithms for both RS and MQ decoding further improves the image quality by 5 dB at the same SNR. Finally, sample reconstructed images are provided in Figure 11.

In this section, we investigated the main improvements that can be reached by using soft decoding algorithms for a single-antenna system. The next steps include the extension of these results to a CL-MIMO system using CBP precoding with link adaptation, which is under investigation.

Conclusions and future work
In this chapter, we showed that cross-layer APP-PHY design can be very advantageous for guaranteeing good image/video quality even after transmission over a highly corrupted and varying MIMO wireless channel. We demonstrated that this is possible through accurate optimization of different scalable techniques such as UEP, UPA, adaptive modulation, and scalable image/video compression. Moreover, we established the efficiency of the proposed optimization procedure for JPWL compressed image data and for H.264/SVC compressed video content. Finally, we showed that soft decoding methods can provide remarkable quality gains while keeping the same throughput.
Future work includes the extension of the optimization process to more general networks, including cooperative communication and wireless multimedia sensor networks. In addition, the proposed contributions are validated for JPEG2000 and H.264 encoders, and they can be generalized to newer standards such as HEVC.

Figure 1. System model for JPWL image transmission over CL-MIMO system using UPA based on the CBP algorithm. Source coding, error correction, and modulation are not optimized.

Figure 2. The wireless propagation environment (left) and the corresponding variation of the MIMO channel gain by position (right) for JPWL image transmission.

Figure 3. PSNR evolution as a function of the receiver position for different 4 × 4 CL-MIMO precoding strategies in the case of "Monarch" image with 4-QAM modulation and RS(37,32) channel code.

Figure 4. Visual quality for the 4 × 4 CL-MIMO precoding strategies in the case of "Monarch" image with 4-QAM modulation and RS(37,32) channel code.The results correspond to the position index 2259 in area 4.

Figure 5. System model for scalable content transmission over CL-MIMO system using UPA based on the CBP algorithm and link adaptation techniques based on UEP, adaptive modulation, and variable rate source encoder.

Figure 6. PSNR evolution as a function of the receiver position for 4 × 4 CL-MIMO with CBP precoding and link adaptation strategies in the case of "Monarch" image.

Figure 7. The wireless propagation environment (left) and the corresponding variation of the MIMO channel gain by position (right) for H.264/SVC video transmission.

Figure 8 presents the simulation results for CBP precoding with and without link adaptation. The considered system uses an APP-PHY cross-layer design based on UEP, adaptive modulation, UPA, and variable-rate source coding. The optimization core selects, for each subchannel, the best RCPC code rate from the set {4/5, 2/3, 1/2, 1/3, 1/4}.

Figure 8. PSNR evolution as a function of the receiver position for 4 × 4 CL-MIMO with CBP precoding and link adaptation strategies in the case of "Foreman" video.

Figure 9. Example of reconstructed frames for Foreman video for CBP precoding with link adaptation ((a) and (c)) compared to a static scheme with 16-QAM modulation and rate 1/2 channel code ((b) and (d)).

Figure 10. PSNR evolution as a function of the signal-to-noise ratio for the JPWL image transmission over a Rayleigh channel with soft decoding algorithms.

Figure 11 provides sample reconstructed images at a signal-to-noise ratio of 10 dB for conventional hard decoding and the presented soft decoding methods. The PSNR gain is accompanied by a remarkable improvement in visual quality.

Figure 11.Examples of the reconstructed images in the case of hard and soft decoding.