Robust delivery of compressed video in wireless packet-switched networks is still a challenging problem. Video packets transmitted in wireless environments are often corrupted by random and burst channel errors caused by multipath fading, shadowing, noise, and congestion in the physical wireless channel.
To achieve optimum transmission over a noisy wireless channel, the source coding and the network should be jointly adapted. Acceptable video quality in a wireless environment can be obtained by adjusting parameters in both the video codec and the wireless network. For the former, many error resilient video encoding algorithms have been proposed to improve the robustness of the compressed video stream in wireless networks. These algorithms can be divided into three categories: 1) error detection and error concealment algorithms used at the video decoder of the wireless receiver; 2) error resilient video encoding algorithms located at the video encoder of the wireless transmitter; 3) robust error control between the video encoder and decoder based on 1) and 2). Fig.1 summarizes the techniques used at different parts of a wireless video transmission system.
Since error concealment algorithms are used only at the video decoder in the wireless receiver, they do not require any modification of the video encoder or channel codec. Hence, they increase neither the coding complexity nor the transmission rate, and they can be easily deployed in existing wireless video transmission systems. However, error concealment algorithms rely on the spatial and temporal correlation in the video stream to estimate the corrupted region of a video frame. When the correlation between the corrupted region and the correctly received frames is weak, they cannot achieve a good result, and apparent distortion remains in the repaired video frames. In addition, although error concealment algorithms can reduce the intensity of temporal error propagation, they cannot reduce its length. The human visual system (HVS) is not very sensitive to short-term error propagation, even when it is obvious, whereas long-term error propagation is annoying even when it is slight. Therefore, a desirable error repair scheme should minimize both the intensity and the length of error propagation.
In a practical wireless video transmission system, one entire frame is normally encapsulated into one video packet in order to make full use of the limited wireless bandwidth. In this situation, the loss of a single video packet noticeably degrades the image quality of successive frames at the video decoder, since existing video standards rely on inter-frame prediction to achieve high compression efficiency. Hence, many error resilient methods have been developed in recent years to reduce the impact of errors and improve video quality in wireless video transmission [1-4]. However, most of the previously developed algorithms sacrifice coding efficiency by adding redundancy to the video stream to enhance error resilient performance. Real time wireless video applications are very sensitive to any increase of coding overhead, which may not only result in additional delay that makes correctly received video packets invalid, but also deteriorate the quality of service in the wireless environment, especially in ad hoc networks. Therefore, it is necessary to make the compressed video stream more resilient to errors at minimum expense of coding overhead.
To overcome the error propagation caused by video packet losses, long-term memory motion-compensated prediction is a reasonable way to suppress error propagation in the temporal domain, at the cost of reduced coding efficiency. In earlier work, reference frame selection in long-term motion-compensated prediction was proposed for H.263 video based on the rate-distortion optimization (RDO) criterion. Extending the original RDO model for the error free condition, an error robust RDO (ER-RDO) method was later proposed for H.264 video in packet loss environments by redefining the Lagrange parameter and the error-prone RD model. However, the ER-RDO method requires very high computational complexity to accurately determine the expected decoder distortion. To reduce the computational burden, Zhang et al. developed a simplified version of the ER-RDO method that uses a block-based distortion map to estimate the end-to-end distortion. Since the Lagrange parameters selected in these two methods are not precise enough for the corresponding rate distortion optimization, their coding overhead is undesirably high for real time wireless video communication systems.
In the periodic frame method , a periodic frame is only predicted by previous
In this chapter, we develop a new channel based rate distortion (RD) model for an error resilient H.264 video codec, which aims to minimize the increase in coding overhead while maintaining good error resilience performance. In the new RD model, practical channel conditions such as the packet loss rate (PLR) and packet loss burst length (PLBL), together with the error propagation and error concealment effects in different reference frames, are taken into consideration when analyzing the expected macroblock (MB) based distortion at the encoder. For each reference frame, the corresponding Lagrange parameter is adjusted according to the variation of the channel based RD model, which describes the relationship between coding rate and expected decoder distortion in a packet loss environment more accurately than existing methods. The proposed RD model also considers a proper intra-coded mode for error resilient performance. Therefore, a more appropriate reference frame and encoding mode can be selected for each MB with the proposed method.
In the remainder of this chapter, a brief review of the error robust rate distortion optimization (ER-RDO) method is given in Section 2, followed by the derivation of our proposed error resilient rate distortion (RD) optimization in the same section. In Section 3, the error resilient performance of the proposed method and some existing methods is evaluated using computer simulations on an H.264 video codec. Finally, some concluding remarks are given in Section 4.
2. The proposed error resilience optimization method
As the latest video coding standard, H.264 achieves superior coding performance by adopting many advanced techniques. With the rate distortion optimization (RDO) operation, H.264 achieves very good coding efficiency and a high PSNR simultaneously in the error free condition. For encoding
However, the cost in (1) does not consider the distortion caused by error propagation and error concealment. Therefore, it cannot be used directly to find the best reference frame and encoding mode in an error prone wireless packet-switched network, where the channel condition must be taken into consideration.
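As a point of reference, an error free mode decision can be sketched as minimizing a Lagrangian cost J = D + λR over the candidate modes. The snippet below is a minimal sketch and not a reproduction of (1): the function names and the candidate numbers are hypothetical, and the multiplier 0.85 · 2^((QP − 12)/3) is the value commonly used for mode decision in H.264 reference software, assumed here for illustration.

```python
def lambda_mode(qp: int) -> float:
    """Mode-decision Lagrange multiplier commonly used in H.264 reference
    software (an assumption here): 0.85 * 2^((QP - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Error free Lagrangian cost J = D + lambda * R."""
    return distortion + lam * rate_bits


def best_candidate(candidates, lam):
    """Return the name of the candidate with the smallest cost.

    `candidates` maps a hypothetical mode name to (distortion, rate_bits).
    """
    return min(candidates, key=lambda name: rd_cost(*candidates[name], lam))


# Illustrative (made-up) numbers for one macroblock at QP = 28.
lam = lambda_mode(28)
modes = {"inter16x16": (900.0, 40.0), "intra16x16": (700.0, 120.0)}
choice = best_candidate(modes, lam)
```

Because this cost contains no channel term, the cheap inter mode wins here regardless of how likely its reference frame is to be lost, which is the limitation described above.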
2.1. ER-RDO model
To take into account the packet lost effect, an error robust RDO (ER-RDO) method was developed in  by redefining the Lagrange parameter and error-prone RD model based on the practical wireless channel situation and potential decoded MB corrupted distortion. In the ER-RDO model, the expected overall distortion of
If we assume high-resolution quantization, the source distortion
where α and
where Δ is the quantization step size.
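For orientation, the usual high-resolution argument can be written out as follows. This is a sketch of the standard error free derivation under an assumed logarithmic rate model; the symbol β and the exact form of the rate model are assumptions, and the result is not the ER-RDO Lagrange parameter itself, which additionally depends on the channel.

```latex
\begin{align*}
R(D) &= \alpha \log_2\!\left(\frac{\beta}{D}\right), &
D &\approx \frac{\Delta^2}{12}, \\
\lambda &= -\frac{\mathrm{d}D}{\mathrm{d}R}
        = \frac{\ln 2}{\alpha}\, D
        \approx \frac{\ln 2}{12\,\alpha}\, \Delta^2 .
\end{align*}
```

Under these assumptions the multiplier grows with the square of the quantization step size, so a coarser quantizer places more weight on rate.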
Referring to (5) and (6), the selected Lagrange parameter in ER-RDO model is computed as
With, (3) and (4), the best reference frame
From (4) and (7), we can see that the selected Lagrange parameter is identical for every reference frame once the number of reference frames and the PLR are known. That is to say, the relationship between the coding rate and the expected overall distortion is the same for all reference frames. However, as the distance between the selected reference frame and the present encoding frame becomes longer, the probability of correct reconstruction of this frame at the receiver becomes higher, while the coding efficiency degrades. Therefore, the term in (8) is not accurate enough (a comprehensive interpretation will be given in the next subsection). In the sense of error resilience, the relationship between the coding rate and the expected overall distortion at the decoder should differ for each reference frame, varying not only with the PLR and the range of reference frames, but also with the distance between the selected reference frame and the present encoding frame.
2.2. The proposed channel based RDO model
To overcome the problems of the ER-RDO model for H.264 video, we propose a new channel based RDO model to more accurately trade-off the coding efficiency and error resilient performance. For
The estimated cost for
For computing as in (11), it includes two parts: one is the error propagation distortion caused by
When the frames after present reference frame
So another part is the multiplying results of
With (9), (10) and (11), the final estimate cost for
So with the derivatives of
where we assume that the buffer length of reference frame
2.3. Implementation of reference frame and mode selection algorithm
Using the results obtained above, we apply the proposed channel based RDO model to select the best reference frame and encoding mode in an H.264 encoder as follows. An MB in a P frame has two categories of encoding modes: intracoded and intercoded. Intracoded modes include direct coding,
where is best reference predicted frame in coding mode
And then, the best encoding mode
For the best intracoded mode
As the final results, the best encoding mode and its potential best reference predicted frame in the sense of optimized error resilience for
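The two-stage selection described in this subsection can be sketched as follows. This is a minimal illustration under stated assumptions: the per-reference Lagrange parameters and the expected distortions are taken as given inputs (their formulas come from the channel based RDO model and are not reproduced here), a separate multiplier for intra modes is assumed, and all names and numbers are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

# (expected_distortion, rate_bits) for one inter mode, indexed by reference frame.
InterStats = List[Tuple[float, float]]


def best_reference(stats: InterStats, lambdas: List[float]) -> Tuple[int, float]:
    """Stage 1: pick the reference frame with the smallest cost for one inter mode.

    lambdas[k] is the (assumed given) Lagrange parameter for reference frame k.
    """
    costs = [d + lambdas[k] * r for k, (d, r) in enumerate(stats)]
    k_best = min(range(len(costs)), key=costs.__getitem__)
    return k_best, costs[k_best]


def select_mode(
    inter: Dict[str, InterStats],
    intra: Dict[str, Tuple[float, float]],
    lambdas: List[float],
    lambda_intra: float,
) -> Tuple[str, Optional[int], float]:
    """Stage 2: compare the best inter mode against the best intra mode."""
    best: Tuple[str, Optional[int], float] = ("", None, float("inf"))
    for mode, stats in inter.items():
        k, cost = best_reference(stats, lambdas)
        if cost < best[2]:
            best = (mode, k, cost)
    for mode, (d, r) in intra.items():
        cost = d + lambda_intra * r
        if cost < best[2]:
            best = (mode, None, cost)  # intra modes use no reference frame
    return best


# Made-up numbers: two reference frames, one inter mode, one intra mode.
inter = {"inter16x16": [(1500.0, 40.0), (1100.0, 55.0)]}
intra = {"intra16x16": (700.0, 120.0)}
result = select_mode(inter, intra, lambdas=[30.0, 25.0], lambda_intra=34.0)
```

In this toy case the older reference frame wins even though it costs more bits, because its expected distortion and its assumed multiplier are both lower.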
3. Experimental results
In this section, we evaluate the performance of the proposed channel based RDO model in terms of video quality and coding efficiency in a wireless packet loss environment. In our experiments, we use the H.264 JM 8.2 codec as the test platform, with the video stream structure set to IPPP…. Three standard QCIF video sequences, namely Salesman, Susie and Foreman, are used in the simulations. The range of tested intracoded frames in these sequences is from the 10th to the 100th frame. The QP is set to 28, the frame rate in H.264 JM 8.2 is 30 fps, and the reference frame buffer holds the previous five frames. In order to make full use of the wireless channel bandwidth, each compressed video frame is transmitted in a single packet. A simple error concealment method is used to analyze the potential error propagation and error concealment effects at the video encoder: when an MB is assumed to be lost, it is replaced by the MB at the same position in the previous error free frame. For comparison, we use the original H.264 JM 8.2 codec, the periodic frame method, the PMB method and the ER-RDO method as reference algorithms. For the PMB method, we use PMB (11%), PMB (22%) and PMB (33%) to denote the performance when the proportion of periodic MBs in a video frame is 11%, 22% and 33%, respectively.
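The concealment rule used in the simulations can be sketched at frame level, since each frame travels in a single packet. The code below is a minimal sketch under that assumption; the function name and the toy frames are hypothetical, frames are represented as plain lists of pixel values, and the prediction drift that a lost frame causes in later frames is not modeled.

```python
from typing import List, Optional, Sequence


def conceal_by_copy(
    received: Sequence[Optional[List[int]]],
) -> List[Optional[List[int]]]:
    """Replace each lost frame (None) with the most recent error free frame.

    Concealed frames are not counted as error free, so a burst of losses
    repeats the same last good frame. Frames lost before any frame has
    arrived stay None.
    """
    output: List[Optional[List[int]]] = []
    last_good: Optional[List[int]] = None
    for frame in received:
        if frame is not None:
            last_good = frame
            output.append(frame)
        else:
            output.append(list(last_good) if last_good is not None else None)
    return output


# Toy example: frames 2 and 3 are lost in a burst of length 2.
frames = [[10, 10], [20, 20], None, None, [50, 50]]
concealed = conceal_by_copy(frames)
```

Both lost frames are replaced by the last frame received before the burst, which is why longer bursts are harder to hide.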
We first look at the error resilience performance of the proposed method by considering the PSNR of the reconstructed video in a packet loss environment. Fig.3 shows the reconstruction results for the three test sequences using different methods when PLR = 0.1 and PLBL < 5. Each point in Fig.3 is the average PSNR obtained when any one of the reference frames of the present encoding frame is lost.
Fig.3 shows that the proposed method always achieves the best reconstruction among the compared methods for the three test sequences. In Fig.3 (a), for the low-motion Salesman sequence, the proposed method outperforms H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 1.18 dB, 1.14 dB, 1.04 dB, 0.8 dB and 0.2 dB, respectively. In Fig.3 (b), for the moderate-motion Susie sequence, the proposed method performs better than H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 2.48 dB, 2.03 dB, 1.43 dB, 0.13 dB and 0.21 dB, respectively. In Fig.3 (c), for the high-motion Foreman sequence, the proposed method achieves better results than H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 3.61 dB, 3.04 dB, 2.45 dB, 1.72 dB and 0.53 dB, respectively. In conclusion, the proposed method achieves more robust error resilient performance across different video scenes.
To evaluate the coding efficiency of the different methods, we consider their impact on the overall coding rate and on the PSNR of the reconstructed video in an error free environment. The simulation results for the three test sequences are listed in Tables 1, 2 and 3, respectively. All of the error resilient methods have little effect on the original video quality. For a fair comparison, the PSNR of the reconstructed video is kept approximately constant across methods, and we then compare the coding rate required by each method.
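For readers who want to reproduce this kind of measurement, PSNR for 8-bit samples can be computed as sketched below. This is a generic illustration, not the measurement code of the JM software: the function names are hypothetical, and averaging per-frame PSNR values arithmetically is an assumption about how each plotted point is formed.

```python
import math
from typing import Sequence


def psnr(reference: Sequence[int], reconstructed: Sequence[int], peak: int = 255) -> float:
    """PSNR in dB between two equally sized arrays of 8-bit samples."""
    if len(reference) != len(reconstructed) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, reconstructed)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak ** 2 / mse)


def average_psnr(values: Sequence[float]) -> float:
    """Arithmetic mean of per-frame PSNR values (assumed averaging rule)."""
    return sum(values) / len(values)
```

A mean squared error of 1 corresponds to about 48.13 dB, so the sub-decibel differences reported between methods correspond to small changes in mean squared error.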
|H.264 JM 8.2||35.57||39.6||40.14||56.83||0%|
|Periodic Frame||35.54||39.61||40.19||60.08||5.72%|
|The proposed method||35.57||39.6||40.15||57.14||0.54%|
Table 1 shows the coding rate requirement of different methods for
|H.264 JM 8.2||37.26||43.54||43.28||95.56||0%|
|The proposed method||37.26||43.56||43.24||95.75||0.2%|
|H.264 JM 8.2||35.72||39.04||40.72||109.17||0%|
|The proposed method||35.72||39.05||40.76||111.28||1.93%|
In conclusion, considering both the error resilient performance and the coding efficiency, the proposed method obtains not only a more satisfying video reconstruction but also a smaller coding rate increase than the reference methods.
Fig.4 shows the coding efficiency of the proposed method for the Foreman sequence at different PLRs from 0.01% to 0.1%. In Fig.4, the increase in coding rate with the proposed method is small compared with the H.264 JM 8.2 codec. In some instances of low PLR in Fig.4, the proposed method even achieves a slightly smaller coding rate than the original H.264 JM 8.2 codec.
As a further analysis of the error resilient performance of the proposed method relative to the PMB and ER-RDO methods, Tables 4, 5 and 6 give a more detailed comparison of the reconstruction PSNR (dB) for the Salesman, Susie and Foreman sequences when each of the reference frames in the encoder buffer is lost. The tables show that the PMB method, especially PMB (33%), achieves better results when the lost reference frame is far from the present encoding frame. In contrast, ER-RDO obtains better reconstruction when the lost reference frame is near the present encoding frame. Our proposed method achieves a compromise between the two methods and obtains better average error reconstruction performance. In addition, it is always better than H.264 JM 8.2 when any reference frame in the encoder buffer is lost.
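The coding rate increase can be checked with a one-line calculation, assuming that the fourth column of each table is the coding rate and the last column is its increase relative to H.264 JM 8.2. The helper name below is hypothetical.

```python
def rate_increase_percent(baseline_rate: float, method_rate: float) -> float:
    """Relative coding rate increase over the baseline, in percent."""
    return 100.0 * (method_rate - baseline_rate) / baseline_rate


# Tabulated rates for the proposed method against H.264 JM 8.2.
proposed_first = rate_increase_percent(56.83, 57.14)
proposed_second = rate_increase_percent(95.56, 95.75)
proposed_third = rate_increase_percent(109.17, 111.28)
```

Under this reading the computed values agree with the tabulated percentages to within the rounding of the listed rates.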
4. Conclusion
In this chapter, an error resilient method based on feedback of the wireless channel condition is proposed for robust H.264 video streaming in wireless packet loss environments. The proposed method adjusts the Lagrange parameter for each reference frame in the encoder buffer according to the proposed channel based RDO model. The modified Lagrange parameter better reflects the relationship between the expected distortion and the coding efficiency of the video stream, in the sense of error resilience, in packet loss environments. Experimental results show that the proposed method combines the advantages of existing methods and achieves better error resilient performance with a minimal increase in coding overhead.
The work of J. Feng was supported by the Hong Kong Baptist University under Grant Number RG2/09-10/080. The work of K.-T. Lo was supported by the Hong Kong Polytechnic University under Grant Numbers G-YH58 and G-YJ29.