Open access peer-reviewed chapter

Error Resilient H.264 Video Encoder with Lagrange Multiplier Optimization Based on Channel Situation

By Jian Feng, Yu Chen, Kwok-Tung Lo and Xu-Dong Zhang

Submitted: April 9th 2012; Reviewed: September 7th 2012; Published: January 9th 2013

DOI: 10.5772/53149

1. Introduction

Robust delivery of compressed video in wireless packet-switched networks is still a challenging problem. Video packets transmitted in wireless environments are often corrupted by random and burst channel errors caused by multi-path fading, shadowing, noise, and congestion in the physical wireless channel.

To achieve optimum transmission over a noisy wireless channel, the source coding and the network should be jointly adapted. Acceptable video quality in a wireless environment can be obtained by adjusting parameters in both the video codec and the wireless network. For the former, many error resilient video encoding algorithms have been proposed to make the compressed video stream more robust in wireless networks. These algorithms can be divided into three categories: 1) error detection and error concealment algorithms used at the video decoder of the wireless receiver; 2) error resilient video encoding algorithms located at the video encoder of the wireless transmitter; 3) robust error control between the video encoder and decoder based on 1) and 2). Fig.1 summarizes the techniques used at different parts of a wireless video transmission system.

Since error concealment algorithms are used only at the video decoder in the wireless receiver, they do not require any modification of the video encoder or channel codec. Hence, they increase neither the coding complexity nor the transmission rate, and they can be easily deployed in existing wireless video transmission systems. However, error concealment relies on the spatial and temporal correlation in the video stream to estimate the corrupted regions of video frames. When the correlation between a corrupted region and the correctly received frames is weak, error concealment performs poorly and leaves visible distortion in the repaired frames. In addition, although error concealment can reduce the intensity of temporal error propagation, it cannot reduce its length. The human visual system (HVS) is not very sensitive to short-term error propagation, even when it is obvious, whereas long-term error propagation is annoying even when it is slight. Therefore, a desirable error repair scheme should minimize both the intensity and the length of error propagation.

Figure 1.

Error resilient methods used in packet-switched wireless networks

In a practical wireless video transmission system, one entire frame is normally encapsulated into one video packet in order to make full use of the limited wireless bandwidth. In this situation, the loss of a single video packet noticeably degrades the image quality of successive frames at the video decoder, since existing video standards rely on inter-frame prediction to achieve high compression efficiency. Hence, many error resilient methods have been developed in recent years to reduce the impact of errors and improve video quality in wireless video transmission [1-4]. However, most of these algorithms sacrifice coding efficiency by adding redundancy to the video stream to improve error resilience. As noted in [5], real-time wireless video applications are very sensitive to increases in coding overhead, which may not only introduce additional delay that makes correctly received video packets arrive too late to be useful, but also degrade the quality of service in wireless environments, especially in ad hoc networks [6]. Therefore, it is necessary to make the compressed video stream more resilient to errors at a minimum expense of coding overhead.

To overcome the error propagation caused by video packet losses, long-term memory motion-compensated prediction [7] is a reasonable way to suppress error propagation in the temporal domain, at the cost of reduced coding efficiency. In [8], the selection of the reference frame in long-term motion-compensated prediction is proposed for H.263 video with reference to the rate-distortion optimization (RDO) criterion. As an extension of [8], an error robust RDO (ER-RDO) method has been proposed in [9] for H.264 video in packet loss environments; it starts from the original RDO model for the error free condition and redefines the Lagrange parameter and the error-prone RD model. However, the ER-RDO method still requires very high computational complexity to accurately determine the expected decoder distortion. To reduce the computational burden, Zhang et al. [10] developed a simplified version of the ER-RDO method that uses a block-based distortion map to estimate the end-to-end distortion. Since the Lagrange parameters selected in these two methods are not precise enough for the corresponding rate-distortion optimization, their coding overhead is undesirably high for real-time wireless video communication systems.

In the periodic frame method [11], a periodic frame is predicted only from the l-th previous reference frame, which is the previous periodic frame, where l is the frame interval between neighboring periodic frames. When the frames between two periodic frames are lost, the second periodic frame can still be decoded correctly, so error propagation is suppressed efficiently. However, the coding overhead of a periodic frame increases considerably when the correlation between neighboring periodic frames is not high. To alleviate the resulting burden on the wireless channel, Zheng et al. also proposed the periodic macroblock (PMB) method [11], which limits the increase in coding overhead by selecting only a certain number of important macroblocks (MBs) to be predicted from the l-th previous reference frame. PMB effectively controls the coding overhead at the expense of error reconstruction quality.

Another effective way to constrain error propagation is to insert intracoded MBs. Compared with long-term reference frame prediction, intracoding introduces more redundancy. To obtain a better trade-off between coding efficiency and error resilience, methods based on accurate block-based distortion estimation models [12] [13] were developed for MPEG-4 and H.261/3. The end-to-end approach in [12] generalized RD optimized mode selection for point-to-point video communication by taking into account both the packet loss and the receiver's concealment method. In [13], the encoder computes an optimal estimate of the total distortion at the decoder for a given rate, packet loss condition, and concealment method. The distortion estimate is then incorporated within an RD framework to optimally select the coding mode for each macroblock. Both methods achieve better error resilient performance. However, their computational complexity and implementation cost are too high.

In this chapter, we develop a new channel based rate distortion (RD) model for an error resilient H.264 video codec, which aims to minimize the increase in coding overhead while maintaining good error resilience. In the new RD model, practical channel conditions such as the packet loss rate (PLR) and packet loss burst length (PLBL), together with the error propagation and error concealment effects in different reference frames, are taken into account when analyzing the expected MB-based distortion at the encoder. Moreover, the Lagrange parameter of each reference frame is adjusted according to the channel based RD model, which describes the relationship between the coding rate and the expected decoder distortion in a packet loss environment more accurately than existing methods. The proposed RD model also considers a proper intracoded mode for error resilience. Therefore, a more appropriate reference frame and encoding mode can be selected for each MB with the proposed method.

The rest of this chapter is organized as follows. Section 2 briefly reviews the error robust rate-distortion optimization (ER-RDO) method and then derives the proposed error resilient rate-distortion (RD) optimization. Section 3 evaluates the error resilient performance of the proposed method and some existing methods using computer simulations on the H.264 video codec. Finally, some concluding remarks are given in Section 4.


2. The proposed error resilience optimization method

As the latest video coding standard, H.264 achieves excellent coding performance by adopting many advanced techniques [14]. With the rate distortion optimization (RDO) operation, H.264 achieves very good coding efficiency and a high PSNR simultaneously in the error free condition. When encoding the m-th MB in the n-th frame, the RDO operation finds the most suitable coding mode and reference frame by minimizing the following cost:


where D_s(n,m,r,o) and R(n,m,r,o) are the source distortion and the coding rate when the MB is predicted from the r-th reference frame and encoded with mode o. In an error free environment, the Lagrange parameter can be determined from the quantization parameter Q as follows [15]


However, the cost in (1) does not account for the distortion caused by error propagation and error concealment. Therefore, it cannot be used directly to find the best reference frame and encoding mode in an error prone wireless packet-switched network, where the channel condition must be taken into consideration.
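
As a minimal sketch of the error free RDO step described above, the following Python assumes the standard Lagrangian cost J = D_s + λ·R and the rule λ = 0.85·2^((Q-12)/3) that is commonly used for mode decision in H.264 reference encoders. Both forms are assumptions here, since equations (1) and (2) are not reproduced in this text. The candidate list, its distortion and rate values, and all function names are hypothetical.

```python
def lagrange_multiplier(qp: int) -> float:
    """Assumed form of the error free Lagrange parameter: 0.85 * 2^((Q - 12) / 3)."""
    return 0.85 * 2.0 ** ((qp - 12) / 3.0)


def rd_cost(distortion: float, rate_bits: float, lam: float) -> float:
    """Assumed form of the Lagrangian cost: J = D_s + lambda * R."""
    return distortion + lam * rate_bits


def best_candidate(candidates, qp):
    """Return the (reference frame, mode) candidate with the smallest RD cost."""
    lam = lagrange_multiplier(qp)
    return min(candidates, key=lambda c: rd_cost(c["D"], c["R"], lam))


# Hypothetical candidates for one MB: D is a distortion value, R a rate in bits.
candidates = [
    {"ref": 1, "mode": "inter_16x16", "D": 400.0, "R": 30},
    {"ref": 2, "mode": "inter_8x8", "D": 250.0, "R": 60},
    {"ref": 0, "mode": "intra_16x16", "D": 200.0, "R": 120},
]
choice = best_candidate(candidates, qp=28)
print(choice["mode"], choice["ref"])
```

With Q = 28 the multiplier is large enough that the cheapest candidate in bits wins even though its distortion is the highest of the three, which illustrates how λ sets the exchange rate between rate and distortion.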

2.1. ER-RDO model

To take the effect of packet loss into account, an error robust RDO (ER-RDO) method was developed in [11] by redefining the Lagrange parameter and the error-prone RD model based on the practical wireless channel situation and the potential distortion of corrupted decoded MBs. In the ER-RDO model, the expected overall distortion of the m-th MB in the n-th frame is determined as


where D_ec is the error concealment distortion if this MB is lost, and D_ep represents the expected error propagation distortion in the case that this MB is received correctly but the reference frames are erroneous. p is the current wireless channel packet loss rate (PLR), and p_c is the probability that all reference frames are correct, which is computed by


where k is the number of reference frames in the encoder buffer.
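
Equation (4) is not reproduced in this text. As a hedged illustration only: if each frame travels in its own packet and packet losses are assumed to be independent with probability p, the probability that all k reference frames arrive intact is (1-p)^k. The function name is hypothetical, and the independence assumption ignores burst losses.

```python
def prob_all_refs_correct(p: float, k: int) -> float:
    """Probability that k independently transmitted reference frames all arrive.

    Assumes independent losses with rate p; this is an illustrative assumption,
    not necessarily the exact form of equation (4).
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if k < 0:
        raise ValueError("k must be non-negative")
    return (1.0 - p) ** k


# PLR = 0.1 with a five-frame reference buffer.
p_c = prob_all_refs_correct(p=0.1, k=5)
print(f"{p_c:.5f}")  # 0.59049
```

Under this assumption, at a PLR of 0.1 with five reference frames there is roughly a 41% chance that at least one reference frame is damaged, which is why the error propagation term cannot be neglected.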

If we assume high-resolution quantization, the source distortion D_s depends on the rate (R) as follows [9]:


where α and β parameterize the functional relationship between rate and distortion [13]. If uniform quantization is used, then we have


where Δ is the quantization step size.
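
The relation usually invoked for uniform quantization at high resolution is D ≈ Δ²/12. Assuming this is the relation intended by the elided equation (6), the short experiment below checks it numerically by quantizing uniformly distributed samples. The sample range, step size and function names are arbitrary choices for illustration.

```python
import random


def uniform_quantize(x: float, step: float) -> float:
    """Map x to the nearest multiple of the quantization step."""
    return step * round(x / step)


def empirical_mse(step: float, samples: int = 100_000, seed: int = 0) -> float:
    """Mean squared quantization error for uniformly distributed inputs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(samples):
        x = rng.uniform(-100.0, 100.0)
        err = x - uniform_quantize(x, step)
        total += err * err
    return total / samples


step = 4.0
measured = empirical_mse(step)
predicted = step ** 2 / 12.0  # high-resolution approximation
print(f"measured={measured:.3f} predicted={predicted:.3f}")
```

Because the distortion grows with the square of Δ, differentiating the cost with respect to Δ ties the Lagrange parameter directly to the quantizer setting.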

Referring to (5) and (6), the Lagrange parameter selected in the ER-RDO model is computed as


With λ_ERRDO, (3) and (4), the best reference frame r* and encoding mode o* for the m-th MB in the n-th frame, selected as in [11], are determined as follows.


From (4) and (7), we can see that the selected λ_ERRDO is identical for every reference frame once the number of reference frames and the PLR are known. That is to say, the relationship between the coding rate and the expected overall distortion is treated as the same for all reference frames. However, as the distance between the selected reference frame and the present encoding frame grows, the probability that the frame is reconstructed correctly at the receiver increases, while the coding efficiency decreases. Therefore, the term (1-p_c)D_ep in (8) is not accurate enough (a comprehensive interpretation is given in the next subsection). In the sense of error resilience, the relationship between the coding rate and the expected overall distortion at the decoder should differ for each reference frame, varying not only with the PLR and the range of reference frames, but also with the distance between the selected reference frame and the present encoding frame.

2.2. The proposed channel based RDO model

To overcome the problems of the ER-RDO model for H.264 video, we propose a new channel based RDO model that trades off coding efficiency and error resilient performance more accurately. For the n-th video frame to be encoded, there are k reference frames in the encoder buffer, namely n-1, n-2, …, n-k, as illustrated in Fig.2.

Figure 2.

Inter-coded prediction reference frame range

The estimated cost for the n-th frame predicted from reference frame n-r (1 ≤ r ≤ k) is


where R(n,n-r) is the coding overhead of the n-th frame predicted from reference frame n-r and, referring to (3), D_p(n,n-r) is the expected overall distortion of the n-th frame at the decoder in the proposed channel based RDO model with reference frame n-r. It is given by


where D_s(n,n-r) is the source distortion when predicting from reference frame n-r in the error free situation, and D_lep is the distortion caused by long-term error propagation when frames before the reference frames are lost. D_epr is the potential distortion caused by frame loss within the range of reference frames when frame n-r is the reference frame, which can be computed as follows.


D_epr in (11) consists of two parts. The first is the error propagation distortion caused by the reference frames n-k, n-k+1, …, n-r. The term Drj+1 in (11) is the error concealment reconstruction distortion when frame n-j is lost (r ≤ j ≤ k), and its corresponding occurrence probability is


When frames after the present reference frame n-r are lost, the present encoding frame n can still be decoded correctly; the probability of this event is computed as


The second part is therefore the product of q_s(r) and D_s(n,n-r), as in (11).

With (9), (10) and (11), the final estimated cost for the n-th frame predicted from reference frame n-r is


Finally, J_p(n,n-r) is computed as


Taking the derivative of J_p(n,n-r) with respect to Δ, as was done for (7), the optimized Lagrange parameter for the present encoding frame n predicted from reference frame n-r is obtained as


where we assume that the reference frame buffer length k is larger than the real-time PLBL obtained from feedback on the wireless channel condition.

2.3. Implementation of reference frame and mode selection algorithm

With the results obtained above, we apply the proposed channel based RDO model to select the best reference frame and encoding mode in an H.264 encoder as follows. An MB in a P frame has two categories of encoding modes: intracoded and intercoded. Intracoded modes include direct coding, intra_4×4 and intra_16×16; intercoded modes include inter_16×16, inter_16×8, inter_8×16 and inter_P8×8 (the last is composed of the inter_8×8, inter_8×4, inter_4×8 and inter_4×4 sub-8×8 block modes). For each intercoded mode o, the best reference frame r* for the m-th MB in the n-th frame is selected by minimizing the intercoded cost J_p(n,m,o,r).


where r* is the best reference frame in coding mode o. Since (1-p)^(k+1) D_lep is the same for any reference frame used to predict the n-th frame, and D_ec is independent of the encoding mode and reference frame [8], (17) can be simplified to



Then, the best encoding mode o* among the intercoded modes is


The best intracoded mode o** for this MB is determined as follows from the intracoded cost J_i(n,m,o,0).


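Finally, the best encoding mode ô and its corresponding best reference frame r̂, in the sense of optimized error resilience for the m-th MB in the n-th frame, are found by comparing the best intercoded candidate (o*, r*) with the best intracoded candidate o** and keeping the one with the lower cost.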
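
The selection steps of this subsection can be sketched in Python as follows. This is only an outline of the control flow: the cost functions stand in for J_p(n,m,o,r) and J_i(n,m,o,0), whose exact expressions are not reproduced in this text, so they are passed in as hypothetical callables. The mode name strings and the toy cost values are likewise illustrative.

```python
from typing import Callable, Sequence, Tuple

INTER_MODES = ("inter_16x16", "inter_16x8", "inter_8x16", "inter_P8x8")
INTRA_MODES = ("direct", "intra_4x4", "intra_16x16")


def select_mode_and_reference(
    inter_cost: Callable[[str, int], float],  # stand-in for J_p(n, m, o, r)
    intra_cost: Callable[[str], float],       # stand-in for J_i(n, m, o, 0)
    num_refs: int,
    inter_modes: Sequence[str] = INTER_MODES,
    intra_modes: Sequence[str] = INTRA_MODES,
) -> Tuple[str, int, float]:
    """Return (mode, reference index, cost); reference index 0 means intracoded."""
    if num_refs < 1:
        raise ValueError("at least one reference frame is required")

    # Step 1: for each intercoded mode, find its best reference frame r*.
    per_mode = []
    for mode in inter_modes:
        costs = {r: inter_cost(mode, r) for r in range(1, num_refs + 1)}
        r_star = min(costs, key=costs.get)
        per_mode.append((costs[r_star], mode, r_star))

    # Step 2: best intercoded mode o* across all intercoded modes.
    best_inter = min(per_mode, key=lambda c: c[0])

    # Step 3: best intracoded mode o**.
    best_intra = min(((intra_cost(m), m, 0) for m in intra_modes), key=lambda c: c[0])

    # Step 4: the overall winner is the cheaper of the two candidates.
    cost, mode, ref = min((best_inter, best_intra), key=lambda c: c[0])
    return mode, ref, cost


# Toy costs, purely illustrative.
mode_base = {"inter_16x16": 100, "inter_16x8": 90, "inter_8x16": 95, "inter_P8x8": 110}
ref_penalty = {1: 20, 2: 5, 3: 15}
intra_table = {"direct": 300, "intra_4x4": 150, "intra_16x16": 120}

result = select_mode_and_reference(
    inter_cost=lambda mode, r: mode_base[mode] + ref_penalty[r],
    intra_cost=lambda mode: intra_table[mode],
    num_refs=3,
)
print(result)
```

Passing the costs in as callables keeps the search logic separate from the distortion model, so the same loop could be run with either the ER-RDO cost or the channel based cost.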


3. Experimental results

In this section, we evaluate the performance of the proposed channel based RDO model in terms of video quality and coding efficiency in a wireless packet loss environment. In our experiments, we use the H.264 JM 8.2 codec as the test platform, with an IPPP… video stream structure. Three standard QCIF video sequences, namely Salesman, Susie and Foreman, are used in the simulations. The range of tested intracoded frames in these sequences is from the 10th to the 100th frame. The QP is set to 28, the frame rate in H.264 JM 8.2 is 30 fps, and the reference frame buffer holds the previous five frames. In order to make full use of the wireless channel bandwidth, each compressed video frame is transmitted in a single packet. A simple error concealment method is used at the video encoder to analyze the potential error propagation and error concealment effects: when an MB is assumed to be lost, it is replaced by the MB at the same position in the previous error free frame. For comparison, we use the original H.264 JM 8.2 codec, the periodic frame method, the PMB method [11] and the ER-RDO method [8] as reference algorithms. For the PMB method, PMB (11%), PMB (22%) and PMB (33%) denote the performance when the proportion of periodic MBs in a video frame is 11%, 22% and 33%, respectively.
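
The simple concealment rule described above, copying the co-located MB from the previous error free frame, can be sketched as follows. Frames are represented here as plain lists of pixel rows, and the function name and the (row, column) MB addressing are assumptions made for illustration rather than details of the JM 8.2 implementation.

```python
MB_SIZE = 16  # H.264 macroblocks cover 16x16 luma samples


def conceal_lost_macroblocks(damaged, reference, lost_mbs, mb_size=MB_SIZE):
    """Replace each lost MB with the co-located MB of an error free frame.

    damaged, reference: frames as lists of rows with identical dimensions.
    lost_mbs: iterable of (mb_row, mb_col) indices of the lost macroblocks.
    """
    repaired = [row[:] for row in damaged]  # leave the input frame untouched
    for mb_row, mb_col in lost_mbs:
        y0, x0 = mb_row * mb_size, mb_col * mb_size
        for y in range(y0, min(y0 + mb_size, len(repaired))):
            repaired[y][x0:x0 + mb_size] = reference[y][x0:x0 + mb_size]
    return repaired


# Toy 32x32 frames: the reference is uniformly 100, the damaged frame is 0.
reference = [[100] * 32 for _ in range(32)]
damaged = [[0] * 32 for _ in range(32)]
repaired = conceal_lost_macroblocks(damaged, reference, lost_mbs=[(0, 1)])
print(repaired[0][15], repaired[0][16])
```

Since each frame is carried in a single packet in these experiments, a lost packet corresponds to every MB of the frame being listed in `lost_mbs`.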

We first look at the error resilience of the proposed method by considering the PSNR of the reconstructed video in a packet loss environment. Fig.3 shows the error reconstruction effect for the three test sequences using different methods when PLR = 0.1 and PLBL < 5. Each point in Fig.3 is an average PSNR result obtained when any reference frame of the present encoding frame is lost.
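
For reference, PSNR for 8-bit video is conventionally computed as 10·log10(255²/MSE). The sketch below applies that standard definition to frames given as lists of pixel rows. How the chapter averages PSNR across loss cases is not specified in detail, so only the per-frame measure is shown; the function name is hypothetical.

```python
import math


def psnr(original, reconstructed, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized frames."""
    squared_error = 0.0
    count = 0
    for row_o, row_r in zip(original, reconstructed):
        for a, b in zip(row_o, row_r):
            squared_error += (a - b) ** 2
            count += 1
    if count == 0:
        raise ValueError("frames must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)


original = [[100, 100], [100, 100]]
decoded = [[110, 100], [100, 100]]  # one pixel off by 10, so MSE = 25
print(f"{psnr(original, decoded):.2f} dB")
```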

Fig.3 shows that the proposed method always achieves the best reconstruction effect for the three test sequences when compared with the other methods. In Fig.3 (a), for the Salesman sequence with a low motion scene, the proposed method outperforms H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 1.18dB, 1.14dB, 1.04dB, 0.8dB and 0.2dB, respectively. In Fig.3 (b), for the Susie sequence with a moderate motion scene, the proposed method performs better than H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 2.48dB, 2.03dB, 1.43dB, 0.13dB and 0.21dB, respectively. In Fig.3 (c), for the Foreman sequence with a high motion scene, the proposed method achieves better results than H.264 JM 8.2, PMB (11%), PMB (22%), PMB (33%) and ER-RDO with an average PSNR improvement of 3.61dB, 3.04dB, 2.45dB, 1.72dB and 0.53dB, respectively. In conclusion, the proposed method achieves more robust error resilient performance across different video scenes.

To evaluate the coding efficiency of the different methods, we consider their impact on the overall coding rate requirement and on the PSNR of the reconstructed video in an error free environment. The simulation results for the three test sequences are listed in Tables 1, 2 and 3, respectively. All of the error resilient methods have little effect on the original video quality. For a fair comparison, the PSNR of the reconstructed video is kept more or less constant across the methods. We then compare the coding rate required by each method.

| Method | PSNR-Y (dB) | PSNR-U (dB) | PSNR-V (dB) | Bit rate (kb/s) | Increase (%) |
|---|---|---|---|---|---|
| H.264 JM 8.2 | 35.57 | 39.6 | 40.14 | 56.83 | 0 |
| Periodic Frame | 35.54 | 39.61 | 40.19 | 60.08 | 5.72 |
| PMB (33%) | 35.54 | 39.59 | 40.15 | 57.48 | 1.14 |
| PMB (22%) | 35.59 | 39.61 | 40.17 | 57.05 | 0.39 |
| PMB (11%) | 35.59 | 39.59 | 40.17 | 57.03 | 0.35 |
| The proposed method | 35.57 | 39.6 | 40.15 | 57.14 | 0.54 |

Table 1.

Coding rate comparison of different methods in Salesman sequence

Figure 3.

Reconstruction effect comparison of different methods when PLR = 0.1 and PLBL < 5

Table 1 shows the coding rate requirement of the different methods for the Salesman sequence, in which there is high correlation between the reference frames and the encoding frame. The coding redundancy introduced by all methods is the smallest for this sequence among the three test sequences. The coding rate increase of the ER-RDO method is not desirable, as it needs more bits than the periodic frame method, while the PMB method obtains a smaller rate increase at each level of long-term predicted MBs. The coding overhead increase of the proposed method is small: it is only slightly larger than that of PMB (11%) and PMB (22%) and clearly smaller than that of PMB (33%).

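| Method | PSNR-Y (dB) | PSNR-U (dB) | PSNR-V (dB) | Bit rate (kb/s) | Increase (%) |
|---|---|---|---|---|---|
| H.264 JM 8.2 | 37.26 | 43.54 | 43.28 | 95.56 | 0 |
| Periodic Frame | 37.24 | 43.59 | 43.37 | 108.91 | 13.97 |
| PMB (33%) | 37.23 | 43.59 | 43.29 | 102.41 | 7.17 |
| PMB (22%) | 37.26 | 43.62 | 43.27 | 99.76 | 4.39 |
| PMB (11%) | 37.25 | 43.55 | 43.32 | 97.78 | 2.32 |
| The proposed method | 37.26 | 43.56 | 43.24 | 95.75 | 0.2 |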

Table 2.

Coding rate comparison of different methods in Susie sequence

For the Susie sequence, where the correlation between the reference frames and the encoding frame is moderate, the coding overhead is in general higher than that of the Salesman sequence, as shown in Table 2. The coding rate of the periodic frame method increases by about 14%, which is a heavy burden for a wireless channel. The coding rate increase of ER-RDO is smaller than that of PMB (33%), but still larger than that of PMB (11%) and PMB (22%). The coding rate of the proposed method is just 0.2% higher than that of H.264 JM 8.2 and smaller than that of all other methods.

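| Method | PSNR-Y (dB) | PSNR-U (dB) | PSNR-V (dB) | Bit rate (kb/s) | Increase (%) |
|---|---|---|---|---|---|
| H.264 JM 8.2 | 35.72 | 39.04 | 40.72 | 109.17 | 0 |
| Periodic Frame | 35.74 | 39.17 | 40.84 | 129.07 | 18.23 |
| PMB (33%) | 35.69 | 39.14 | 40.78 | 116.41 | 6.63 |
| PMB (22%) | 35.7 | 39.06 | 40.7 | 113.79 | 4.23 |
| PMB (11%) | 35.71 | 39.03 | 40.75 | 112.46 | 3.01 |
| The proposed method | 35.72 | 39.05 | 40.76 | 111.28 | 1.93 |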

Table 3.

Coding rate comparison of different methods in Foreman sequence

For the Foreman sequence, where there is low correlation between the reference frames and the encoding frame, the required coding rate of all methods is the largest among the three test sequences, as shown in Table 3. Again, our proposed method achieves the best coding efficiency. The coding rate increase of the proposed method is only 1.93%, while that of PMB (11%), PMB (22%), PMB (33%), ER-RDO and the periodic frame method is 3.01%, 4.23%, 6.63%, 6.82% and 18.23%, respectively.
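
The "Increase (%)" figures are relative to the H.264 JM 8.2 bit rate. As a quick check, the sketch below recomputes them from the Foreman bit rates listed in Table 3; the function and variable names are hypothetical.

```python
def rate_increase_percent(rate_kbps: float, baseline_kbps: float) -> float:
    """Relative bit rate increase over the baseline, in percent."""
    return 100.0 * (rate_kbps - baseline_kbps) / baseline_kbps


baseline = 109.17  # H.264 JM 8.2 bit rate for Foreman (kb/s), from Table 3
foreman_rates = {
    "Periodic Frame": 129.07,
    "PMB (33%)": 116.41,
    "PMB (22%)": 113.79,
    "PMB (11%)": 112.46,
    "The proposed method": 111.28,
}

increases = {
    name: round(rate_increase_percent(rate, baseline), 2)
    for name, rate in foreman_rates.items()
}
for name, value in increases.items():
    print(f"{name}: {value}%")
```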

In conclusion, considering both the error resilient performance and the coding efficiency, the proposed method obtains not only a better video reconstruction effect but also a smaller coding rate increase than the reference methods.

Figure 4.

The coding rate (kb/s) of the proposed method with respect to the original H.264 JM 8.2 codec at different PLRs from 0.01% to 0.1%

Fig.4 shows the coding efficiency of the proposed method at different PLRs from 0.01% to 0.1% for the Foreman sequence. The increase in coding rate with the proposed method is small when compared with the H.264 JM 8.2 codec. In some instances of low PLR in Fig.4, the proposed method even achieves a slightly smaller coding rate than the original H.264 JM 8.2 codec.

As a further analysis of the error resilient performance of the proposed method with respect to the PMB and ER-RDO methods, Tables 4, 5 and 6 give a more detailed comparison of reconstruction PSNR (dB) for the Salesman, Susie and Foreman sequences when each of the reference frames in the encoder buffer is lost. From the tables, we can see that the PMB method, especially PMB (33%), achieves better results when the lost reference frame is far away from the present encoding frame. In contrast, ER-RDO obtains a better reconstruction effect when the lost reference frame is near the present encoding frame. Our proposed method achieves a compromise between the two and obtains better average error reconstruction performance. In addition, it is always better than H.264 JM 8.2 when any reference frame in the encoder buffer is lost.

Lost reference frame | Proposed method | PMB (33%) | PMB (22%) | PMB (11%) | ER-RDO | H.264 JM 8.2

Table 4.

Reconstruction PSNR (dB) comparison of different methods in Salesman sequence

Lost reference frame | Proposed method | PMB (33%) | PMB (22%) | PMB (11%) | ER-RDO | H.264 JM 8.2

Table 5.

Reconstruction PSNR (dB) comparison of different methods in Susie sequence

Lost reference frame | Proposed method | PMB (33%) | PMB (22%) | PMB (11%) | ER-RDO | H.264 JM 8.2

Table 6.

Reconstruction PSNR (dB) comparison of different methods in Foreman sequence

4. Conclusions

In this chapter, an error resilient method based on feedback of the wireless channel condition is proposed for robust H.264 video streams transmitted in wireless packet loss environments. The proposed method adjusts the Lagrange parameter for each reference frame in the encoder buffer by adopting the proposed channel based RDO model. The modified Lagrange parameter better reflects the relationship between the expected distortion and the coding efficiency of video streaming in the sense of error resilience in packet loss environments. Comprehensive experimental results show that the proposed method combines the advantages of existing methods and achieves better error resilient performance with a minimal increase in coding overhead.


The work of J. Feng was supported by the Hong Kong Baptist University under Grant Number RG2/09-10/080. The work of K.-T. Lo was supported by the Hong Kong Polytechnic University under Grant Numbers G-YH58 and G-YJ29.

© 2013 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Jian Feng, Yu Chen, Kwok-Tung Lo and Xu-Dong Zhang (January 9th 2013). Error Resilient H.264 Video Encoder with Lagrange Multiplier Optimization Based on Channel Situation, Advanced Video Coding for Next-Generation Multimedia Services, Yo-Sung Ho, IntechOpen, DOI: 10.5772/53149. Available from:
