Open access peer-reviewed chapter

Multiple Descriptions Coinciding Lattice Vector Quantizer for H. 264/AVC and Motion JPEG2000

By Ehsan Akhtarkavan and M. F. M. Salleh

Submitted: April 25th 2012Reviewed: October 12th 2012Published: January 9th 2013

DOI: 10.5772/54296

Downloaded: 1341

1. Introduction

Recent advances in high-performance portable processing equipment, such as mobile processors, have enabled users to experience new-generation devices, including networked gaming consuls, smart televisions and smart phones. Video coding, video compression and video communication are essential parts of the aforementioned applications. However, networking infrastructures do not offer unlimited bandwidth, and storage devices do not offer unlimited capacities. Therefore, there is significant demand for reliable high-performance video communication/compression protocols. Video compression refers to the process of reducing the amount of video data used to represent digital videos; it is a combination of spatial image compression and temporal motion compensation (Hanzo et al., 2007).

Multiple description (MD) coding has appeared to be an attractive technique to decrease the impact of network failures and increase the robustness of multimedia communications (Goyal, 2001). The MD coding is especially useful for those applications in which retransmission is not possible or is too expensive. Lattice vector quantization (LVQ) is a well-known lossy compression technique for data compression. LVQ is used for spatial compression and is less computationally complex due to the regular structure of the lattice ( Conway & Sloane, 1988). MD image coding has been presented in several studied (Bai & Zhoa, 2007) (Akhtarkavan & Salleh, 2010) (Akhtarkavan & Salleh, 2012).

In (Reibman et al., 2002), an MD video coder is presented that uses motion-compensated predictions. This MD video coder utilizes MD transform coding and three separate prediction paths at the encoder. Another MD video coding technique is introduced in (Biswas et al., 2008). In this scheme, the 3D Set Partitioning in a hierarchical tree (3D-SPIHT) algorithm is used to modify the traditional tree structure. Multiple description video coding based on LVQ was presented in (Bai & Zhao, 2006). In that study, MDLVQ is combined with the wavelet transform to produce a robust video coder. An error-resilient video coding scheme using the MD technique is proposed in (Chen, 2008) that employs MDLVQ with channel optimization (MDLVQ-CO). In that study, the central codebook is locally trained according to known channel erasure statistics, and two translated lattices are used for side codebooks to reduce distortion in the case of erasure.

In (Yongdong & Deng, 2003), MD video coding has been used to form an authentication scheme for Motion JPEG2000 (Fukuhara & Singer, 2002) streaming in a lossy network. In this study, the video frames are first transcoded into media fragments that are protected with integrity tokens and a digital signature. Then, the integrity tokens and signature are encoded into codewords using the forward error correction (FEC) for data loss resilience. Because of the use of MD video coding and sequence numbers, the scheme provides content integrity and defeats collage attacks.

In (Franchi et al., 2005), two MD video coding schemes are proposed, i.e., drift-compensation multiple description video coder (DC-MDVC) and an independent flow multiple description video coder (IF-MDVC). An interesting feature of DC-MDVC is the ability to use the actual reconstructed frame as the new reference frame for the side prediction loops instead of the original frame.

In (Zandoná et al., 2005), a second-order predictor is inserted in the encoder prediction loop. In (Campana, et al., 2008), a new MD video coding scheme for the H.264/AVC standard (Wiegand et al., 2003) is proposed based on multiple description scalar quantization. In that study, a splitting block is inserted in the standard H.264/AVC scheme after quantization. This block generates two descriptions by duplicating the control structures, picture parameter set, slice header, and information about the motion vectors. The MD coding video scheme presented in (Radulovic et al., 2010) splits the video information into several encoding threads, and redundant pictures are inserted to reduce the error drift packet loss that occurs. In (Zandoná et al., 2005), a novel RD model for H.264 video encoding in a packet loss environment is proposed. In that study, the end-to-end distortion is estimated by inserting a block-based distortion map to store the potential errors of the current frame that may propagate to the future frames. This scheme is intended for low-delay applications over lossy networks.

The IF-MDVC is designed for transmission in environments in which the packet loss rate is directly proportional to the packet size and thus having more descriptions with a smaller packet size at the same bit-rate is advantageous (Franchi et al., 2005). The MD coding scheme in (Radulovic et al., 2010) requires the channel conditions to allocate the coding rate to primary and redundant pictures, to minimize the total distortion experienced at the receiver. The study described in (Zhang et al., 2004) requires that the potential channel distortions of its reference frames be known a priori. The descriptions generated by the MD scheme presented in (Tillo et al., 2008) based on redundant H.264 pictures are not independent; thus the decoder cannot take advantage of all of the available information.

In this chapter a generic MD video coding based on the coinciding similar sublattices of the hexagonal lattice is presented. The coinciding similar sublattices are special sublattices because they have the same index; even though they are generated by different generator matrices (Akhtarkavan & Salleh, 2012). The proposed multiple descriptions coinciding lattice vector quantization (MDCLVQ) video coding scheme forms a diversity system for MD video coding that can be exploited for any video coding standard such as H.264/AVC or Motion JPEG2000. Extensive simulations and the experimental results of applying the proposed MD coding scheme to several reference QCIF video sequences demonstrate that our MD coding algorithm outperforms state-of-the-art single- and two-description video coding schemes in terms of the compression ratio and transmission robustness. In addition, small differences between the peak signal to noise ratio (PSNR) of the side video sequences compared to the central decoder show significant capability to resist channel failures. Finally, the proposed MD video coding scheme provides acceptable fidelity criteria in terms of PSNR, which means that there is a negligible drop in the quality of the video and a significant increase in error resiliency and compression efficiency.

2. Background

2.1. Multiple-description coding

Multiple-description (MD) coding is a method to address network impairments when the re-transmission is expensive or impossible. According to Goyal (2001), MD coding can effectively address packet loss without the need for retransmission, thus it can meet the network requirements. In this scheme, a stream of input data is transformed into several different independent descriptions and sent over different channels of a diversity system. At the receiver if all the descriptions are received correctly, the original data will be reconstructed accurately. But, in case some of the descriptions fail to reach the destination, due to channel failure, the rest of the descriptions, which are fed via side decoders, are used to find an estimate of the original data. The performance of the MD system to reconstruct the original data can be of several levels of accuracy. A typical 2-channel MD coding scheme is shown in Fig. 1. Consider the simplest scheme in which two descriptions are the same. If either description is lost then the other would be useful. However, if both descriptions are available then one will be useless and hence the bandwidth has been wasted. In other word, receiving more descriptions must result in better reconstruction quality which can be offered by the MD coding (Goyal, 2001). According to information theoretic approach, the MD coding scheme may not require more bit rate (bandwidth) than the single description system. In the MD coding system, there is always a trade-off between the required bit rate and the distortion. Thus, in MD coding scheme, compression efficiency is sacrificed in order to gain error resiliency. Therefore, MD coding should be applied only if it does not require too extra bit rate or a wider bandwidth (Goyal, 2001).

Figure 1.

A general scheme of the MD coding scheme.

2.2. Elementary of lattices

In mathematics (algebra), a lattice is defined as a partially ordered set (poset) in which any two elements have a unique supremum (the element’s least upper bound or join) and an infimum (greatest lower bound or meet). In other words, a lattice is considered as a subset of points in the Euclidean space that share a common property. For example the lattice Anis a subset of points with n+1 coordinates, such that the sum of these coordinates is zero. Therefore, the lattice Ancan be defined as:

An=x0,x1,,xnZn+1:x0+x1++xn=0E1

An n-dimensional lattice Λ in   Rnis denoted by  Λ=b1,b2,,bn. It means that Λconsists of all integer linear combinations of a basis vectors b1,b2,,bnin Rn(Heuer, 2008). The fundamental parallelotope of a lattice  Λ is defined as (Conway & Sloane, 1988)

θ1b1+θ2b2++θnbn 0θi<1E2

The fundamental parallelotope is the building block of the lattice because if it is repeated many times, the whole space is filled in a way that there is only one lattice point in each parallelotope. The lattice points are generated using a generator matrix. The generator matrix of the lattice Λwith the basis vectors b1=b11,b12,,b1m, b2=b21,b22,,b2m ,, bn=bn1,bn2,,bnm is given as (Conway & Sloane, 1988):

G=b11b12b1mb21b22b2mbn1bn2bnmE3

The Gramm matrix of a lattice Λis defined as  A=GGt, where Gtis the transposed matrix of G. The Gramm matrix determines the linear independence of the basis vectors, that is, they are linearly independent if and only if the determinant of the Gram matrix is non-zero. Two lattices are called equivalent if they have the same Gramm matrix or if the Gramm matrices are proportionate. There are many ways of choosing a basis and a fundamental parallelotope for a lattice  Λ. But the volume of the fundamental region is uniquely determined by  Λ, and the square of this volume is called the determinant of the lattice (Conway & Sloane, 1988). The determinant of a lattice Λis also equal to the determinant of the Gramm matrix (Conway & Sloane, 1988):

detΛ=detA=detG2E4

Thus, the volume of the fundamental parallelotope of a lattice Λis calculated as (Conway & Sloane, 1988)

vol=detG=detΛ=detAE5

The hexagonal lattice is a subset of the complex space  C, and at unit scale it is generated by the basis vectors  {1, ω}C, where  ω=-1/2+i3/2(Vaishampayan et al., 2001). The hexagonal lattice is generated by

G2×2=Re(1)Im(1)Re(ω)Im(ω)=10-1232E6

and the Gramm matrix of G2×2is calculated as (Conway & Sloane, 1988)

A2x2=G2×2G2×2t=122-1-12E7

and  detΛ2×2=detA2×2=34. Thus, the volume (or area) of fundamental parallelotope of Λwill be calculated as  vol=detΛ2×2=32 .The hexagonal lattice is also generated by

G2×3=1-1001-1E8

and the Gramm matrix of G2×3can be calculated as

A2×3=G2×3G2×3t=2-1-12=2A2×2E9

According to  A2×3=2A2×2, the lattices generated by G2×2 and  G2×3 are equivalent lattices. However, the determinant of   Λ2×3is 3 and the volume of fundamental parallelotope of Λ2×3is  3. This is because G2×2and G2×3both describe the hexagonal lattice but in different coordinates and on different scales (Conway & Sloane, 1988). In an n-dimensional lattice  Λ, the Voronoi region of a lattice point is defined as the union of all non-lattice points within Rn that are closer to this particular lattice point than any other lattice point. Thus, the Voronoi region of   λΛ  is defined as (Vaishampayan et al., 2001)

VλxRn :x-λx-λ' ,λ'ΛE10

As a consequence, all the points within Vλmust be quantized to  λ. The Voronoi regions of the points in the A2 are hexagons; therefore, it is called the hexagonal lattice. The Voronoi region of a sublattice point λ'is the set of all lattice points that are closer to λ'than any other sublattice points. Thus, the Voronoi region of  λ'Λ'is defined as

Vλ' λΛ :λ-λ'λ-λ" ,λ"Λ'E11

Lattice vector quantization (LVQ) is a vector quantization technique that reduces the amount of computation for codebook generation since the lattices have regular structures. A finite set of points y1, , yMin an n-dimensional Euclidean space,  Rn , is called an Euclidean code (Conway & Sloane, 1982a). An n-dimensional quantizer is a mapping function Q :RnRnthat sends each point  x ϵ Rninto  Qxprovided that  Qxis the nearest code point. The code points may be selected according to any type of relationship. If the code points are selected from a lattice, then the quantizer would be called a lattice vector quantizer. Fast quantizing algorithms are a family of lattice vector quantization algorithms presented in (Conway & Sloane, 1982b) for different root lattices. The quantization using Anlattice points is a projection from n-dimensional space onto i=1n+1xi=0, xiZhyper plane. The fast quantizing algorithm first projects the n-dimensional input vector onto n+1 dimensional vectors on i=1n+1xi=0, xiRhyper plane using a matrix multiplication (Conway & Sloane, 1982b). Then, using a manipulation the projected point is mapped onto a lattice point. In case of A2lattice, the input stream of data is vectorized into 2-dimensional vectors. Then, each input vector (i1,i2)is projected onto the 3-dimensional space,  x0, x1,x2 Z3using the transformation matrix T given as (Conway & Sloane, 1982b)

 T=10-11/3-2/31/3E12

If the expression x0+x1+x2=0does not hold, all the coordinates need to be rounded to the nearest integer points, while keeping the original values in another variable. The projected 3-dimensional vector is easily quantized (mapped) to the nearest lattice point by a simple manipulation. The sum of the differences between each coordinate of the original projected point to the nearest integer is calculated. If the sum of the differences is positive, then 1 is subtracted from the coordinate farthest from the integer. On the other hand, if the sum is negative, then 1 is added to the coordinate with the most difference. Thus, performing the computation-intensive nearest neighboring search algorithm is avoided. The two-dimensional version of the result point is calculated by right multiplying the quantized point by 12Tt(Conway & Sloane, 1982b).

2.3. Coinciding similar sublattices of A2

Assume that   Λ  is an n-dimensional lattice with the generator matrix  G. A sublattice  Λ'Λ with generator matrix  G'is said to be geometrically similar to Λif and only if  G'=cUGB, for nonzero scalar  c, an integer matrix U with  detU=±1, and a real orthogonal matrix B (with BBt=I) (Conway & Sloane, 1988). The index N is defined as the ratio of the fundamental volume of the sublattice Λ'to the fundamental volume of the lattice  Λ. Therefore, the value of N controls the coarse degree of the sublattice as well as the amount of redundancy in the MD coder (Vaishampayan et al., 2001).Thus, N is calculated by

N=vol'vol=detΛ'detΛ=detG'detGE13

Sublattice  Λ'Λis considered as a clean sublattice if all the points of the  Λreside only inside the Voronoi region of the sublattice points rather than on the boundary of the Voronoi region (Conway et al., 1999). It has been shown in (Bernstein et al., 1997) and (Vaishampayan et al., 2001) that, for the hexagonal lattice, Λ'is similar to Λif N is of the form

  N=α2-αβ+β2 for α, βZE14

In addition, N must be in the form of  N=i=0Kni, where, ni denotes the number of points at squared distance ifrom the origin. The sublattices of A2 are clean, if and only if, α and β are relatively primes. It follows that A2 has a clean similar sublattice of index N if and only if N is a product of primes congruent to 1 (mod 6) (Conway et al., 1999). The sequence of integers that generate clean sublattices of the hexagonal lattice are named A038590 by Sloane (Sloane, 2000). If these conditions are met then the basis vectors of the sublattice Λ'will be u=α+βω and  v=α+βωω. In other words, αand β are selected such that the value of N satisfies these conditions and hence a clean similar sublattice of the hexagonal is generated. Thus, the basis vectors are calculated as u=α+βω =α-β2+β32i and  v=α+βωω=-12α+β+32(α-β)i. The corresponding generator matrix will be

Gi'=Re(u)Im(u)Re(v)Im(v)E15

For example, with α=-3and β=2, a clean similar sublattice of the hexagonal lattice with index  N=(-3)2--32+22=19is generated. The basis vectors are calculated as u=-3+2ω=-4+i3 and  v=-4+i3ω=0.5-2.5i3. Thus, the corresponding generator matrix will be calculated as

Gi'=Re(u)Im(u)Re(v)Im(v)=α-β2β32-α+β23 (α-β)2=-430.5-2.53E16

Figure 2.

The geometrically similar sublattice of A2 with index N=19 generated by Eq. (16).

It is also possible to calculate the index of the sublattice generated by Gi'using Eq. (13). The determinant of the generator of the hexagonal lattice at unit scale is  3/2. The determinant of G'is calculated as  detGi'=-4×-2.53-0.5×(3)=193/2. Thus, the index will be  (193/2)/(3/2)=19. The sublattice generated by G'is shown in Fig. 2 with blue squares and the hexagonal lattice points are shown with light blue triangles. The fundamental parallelotope of the hexagonal lattice and the similar sublattice generated by G'are shown with small and big parallelogram, respectively. The basis vectors, u and v, are also shown. The Voronoi region of the sublattice point λ'=(7.5, 0.53)is shown with a dashed hexagon.

It is seen in Fig. 2 that  G'has generated a clean similar sublattice because there are no lattice points on the boundary of the Voronoi region of the sublattice points. The coinciding similar sublattices are defined as geometrically similar sublattices of a root lattice with the same index N but generated by different generator matrices. The commonly used values of N are 7, 13, 19, and 37. Therefore, α and β must be selected such that clean sublattices of the hexagonal lattice with a desired index is generated. In order to find the suitable values of α and β, all twofold combinations of integers [-10, -10] … [+10, +10] have been examined and only the combinations that generate clean similar sublattices of the hexagonal lattice with index N = 7 are provided in Table 1.

The similar sublattices of the hexagonal with index N=7 corresponding to the generator matrices G1'to G12'provided in Table 1, are plotted in Fig. 3. The sublattice points corresponding to G1'are shown with blue squares and the sublattice points corresponding to G2'are shown with red circles. It can be observed that in the area that is shared between all the sublattices, only two distinct sublattices exist. In other words, the similar sublattices are coinciding with each other in a regular manner. Thus, these sublattices are called coinciding similar sublattices of the hexagonal lattice. In other words, the sublattices form two groups, the first group of sublattices coincide with G1'and the second group coincide with  G2'. Each group consists of 6 sublattices which are coinciding with each other. Another apparent property of the coinciding sublattices, G1'and  G2', is that they overlap with each other in a regular pattern, that is, they overlap with each other every N lattice points in every direction. These points are shown by big red circles with black border. The SuperVoronoi set of an overlapping point is the set of all lattice points that are closer to this point than any other overlapping point. The SuperVoronoi set of the overlapping points are shown with a blue hexagon. This symmetry is used in construction of the partitions. Partitions are used to define the new equivalence relation and the new shift-property that can be used to simplify the labeling function.

iαβciαβG i '
1-3-2G1'=-2.0-32.5-3/271-2G7'=2.0-30.533/2
2-3-1G2'=-2.5-3/22.0-3813G8'=-0.533/2-2.0-3
3-2-3G3'=-0.5-33/22.53/292-1G9'=2.5-3/2-0.533/2
4-21G4'=-2.53/20.5-33/21023G10'=0.533/2-2.5-3/2
5-1-3G5'=0.5-33/22.031131G11'=2.53/2-2.03
6-12G6'=-2.03-0.5-33/21232G12'=2.03-2.53/2

Table 1.

Different values of αand βfor different values of N=7.

As shown in Table 1 there are 12 generator matrices corresponding to every indices N = 7. The choice of the generator matrix changes the quantization process because the transformation matrix is calculated based on the generator matrix, that is, the choice of the generator matrix determines the transformation matrix used. The root lattice  An has two different definitions. One is in n-dimensional space, that is, with n×ndimensions. Another group of generator matrices are with  n×(n+1)dimensions. For example, A2 lattice has a 3-dimensional generator G2×3, in addition to the generators in 2-dimesional Gi'space. The transformation matrix is calculated using the relation

 Ti=Gi'-1×G2×3E17

By substituting  Gi',  G2×3, and  Tiin Eq. (3-7), the corresponding transformation matrix will be

 Ti=1(α23-αβ3+β23)3(α-β)-α3β3α+βα-2β-2α-βE18

For example, the transformation matrix corresponding to α = -3 and β = -2 is calculated as

 T1=13.53-0.531.53-3-2.50.52=-0.1430.429-0.286-0.4120.0820.330E19

Figure 3.

The coinciding similar sublattices of hexagonal lattice with index N=7 but in a limited range

3. MD video coding based on coinciding similar sublattices of A2

MD coding has appeared to be an attractive scheme to be used for representing fault tolerant communication schemes and lattice vector quantization has been known for representing high compression performance with low computational needs. A multiple-description lattice vector quantizer encodes the vectorized source for transmission over a two-channel communication system.

In this section, multiple-description lattice vector quantization based on the coinciding similar sublattice of A2 (hexagonal lattice) are presented. These schemes are called MDCLVQ-H.264/AVC and MDCLVQ-Motion JPEG2000. The experimental results will be presented in section 4.

3.1. System overview

A video is a sequence of two dimensional images coming with certain timing details. Therefore, it is possible to consider a given video as a three-dimensional array in which the first two dimensions serve as spatial directions of the moving pictures, and the third dimension represents the time domain. In this way, a frame is defined as a set of all pixels that correspond to a single moment.

Figure 4.

MDCLVQ video coding scheme applied to H.264/AVC and Motion JPEG2000.

In the proposed MD video coding schemes the input video is converted into two descriptions before being encoded by the standard video encoder. Then, the encoded descriptions are sent on the channels. If both descriptions reach the receiver, then a high quality video can be reconstructed by the central decoder. However, if only one description arrives, then a degraded video is reconstructed by the appropriate side decoder. The MDCLVQ is a generic scheme and it can be adopted for any video coding standard.

However, in this research it has been applied to H.264/AVC and Motion JPEG2000 only. Thus, two MD video coding schemes MDCLVQ-H.264/AVC and MDCLVQ-Motion JPEG2000 are defined, respectively. In other words, MDCLVQ-H.264/AVC uses the H.264/AVC video encoder while MDCLVQ-Motion JPEG2000 uses the Motion JPEG2000 video encoder. Therefore, both schemes are shown in a single schematic diagram in Fig. 4. The MDCLVQ video coding scheme is described in following subsections. In the proposed MDCLVQ scheme A2lattice points are used as the codebooks of the quantizer and the coinciding similar sublattice points are used as the labels. The schematic diagram of the MDCLVQ is illustrated in Fig. 6. The MDCLVQ scheme includes: the wavelet transformation module, the vectorization module, the LVQ module, the labeling function module, the arithmetic coder/decoder module, and the MD decoders. These modules are described in following subsections.

3.2. Wavelet decomposition module

It is possible to consider a video as a three dimensional array in which the first two dimensions serve as spatial directions of the moving pictures, and the third dimension represents the time domain. In this way, a frame is defined as a set of all pixels that correspond to a single moment. Every frame is decomposed into several wavelet coefficients (sub-bands), where the biorthogonal Cohen-Daubechies-Feauveau (CDF) 5/3 wavelet transforms (with lifting implementation) with 1 level of decomposition is used. Finally, the wavelet coefficients are streamed to the LVQ module.

3.3. Lattice vector quantizer module

In the LVQ module, the 2-D vectors, constructed in vectorization module are mapped to the nearest neighbouring lattice point using the fast quantizing algorithm. Lattice A2is used in the LVQ as the codebook. The details of the fast quantizing algorithm have been presented in section 2.2.

3.4. The coinciding labeling function

The proposed coinciding labeling function is composed of the hexagonal lattice  Λ, coinciding sublattice number 1,  Λ1'and coinciding sublattice number 2,  Λ2'. The first coinciding sublattice  is generated by  G1'; while the second coinciding sublattice is generated by  G2'. The labeling function maps each lattice point into two coinciding sublattice points that form a label. The first coinciding sublattice point belongs to  Λ1'; while the second coinciding sublattice point belongs to  Λ2'. The two produced descriptions are encoded using a basic zero-order arithmetic codec prior being transmitted over the channels.

There are a lot of similarities between the terminologies of MDCLVQ-A2 schemes and traditional MDLVQ schemes. The coinciding similar sublattices overlap with each other in a regular pattern, that is, they have overlapping points every N lattice point in every direction. These points comprise the set of overlapping point Sov. It is defined as

Sovλ: λΛ Λ1' Λ2' E20

where,  Λ is the lattice,  Λ1 'is the first coinciding similar sublattice, and  Λ2'is the second coinciding similar sublattice. In other words, the overlapping points are the lattice points on where the two coinciding similar sublattices coincide with each other. The Voronoi region/set of a coinciding sublattice is defined in Eq. (13). The SuperVoronoi region/set of an overlapping sublattice point λ'is the set of all lattice and sublattice points that are closer to λ'than any other overlapping point. In other words, the SuperVoronoi region Ssv of   λ' is defined as

Ssvλ'λΛ , λ'Sov:λ-λ'λ-λ" ,λ"SovE21

where Λis the lattice and Sovis the set of overlapping points. The coinciding similar sublattices partition the space into SuperVoronoi regions. The coinciding similar sublattices of the hexagonal lattice with index N = 7 and the SuperVoronoi regions of the overlapping sublattice points are plotted in Fig. 3. The coinciding similar sublattice points are within the partitions. Thus, the labels all reside within the same SuperVoronoi region as the labelled lattice points. Therefore, both vertices of the label will be as close as possible to the original lattice point and the side distortions will be low.

The labeling function maps every hexagonal lattice point λ to a label. In the MDCLVQ scheme, the label is considered as a directed edge  e=λ1',λ2'  which consists of two coinciding sublattice points  λ1 'and  λ2' selected from the coinciding similar sublattices Λ1' and  Λ2' respectively. The set of labels corresponding to every overlapping point  λovis defined as:

ελov=v1,v2:(Λ1'Ssvλov)×(Λ2'Ssvλov)} E22

In other words, ελovis the set of all possible labels within the SuperVoronoi set of the overlapping point  λov. The coinciding similar sublattices have inherent symmetries that can be used to define a new equivalence relation. In order to define the new equivalence relation, a modified version of the congruency relation is used. The traditional congruency relation  (mod k)accepts natural modulus; however, the modified version of the congruency accepts real modulus.

Let’s assume that λxand λydenote the x-coordinate and y-coordinate of the lattice point   λ, respectively. Although λxand λymay take on values from an unlimited range of real numbers, using (mod k), the remainders only take on values from a finite set of real numbers. It has been observed that according to the congruency relation the remainders are repeated in a regular pattern. For example, in case of the coinciding similar sublattice with index N the remainders of λx(mod N2)take on values 02, 12, 22, 32N-12and the remainders of λy(mod N32)take on values  02, 32, 232, 332(N-1)32. Thus, it is possible to define a new equivalence relation for the coinciding similar sublattice with index N. The equivalence relation enables the labeling function to find the SuperVoronoi sets of the sublattice points without using the computation intensive neighbourhood search. The new equivalence relation, denoted as  ', is defined as:

λ''λ" λx'λx"mod N2 λy'λy"mod N32E23

Eq. (23) shows that the two points of the lattice, λ'and  λ", are equivalent, if and only if their x-coordinates are congruent modulo  N/2 and their y-coordinates are congruent modulo  N3/2. Thus, the lattice points are divided into N2partitions. Each equivalence class is denoted by a twofold a, b as the following:

mx, my=mx=λxmod N2, my=λymodN32E24

The proposed labeling procedure has three properties. First, it utilizes different sublattices rather than a single sublattice. Second, it utilizes a new rule of equivalence as well as a new partitioning scheme. Third, the coinciding labeling scheme utilizes a new shift property in order to simplify the labeling procedure. The new shift property is defined as:

αλ~=αλ+λ~-λ if λ~ 'λ , λSsvOE25

This finding shows that in order to label an arbitrary lattice point, λ~,it is just enough to obtain its equivalent point within the SuperVoronoi set of the origin, and translate the label of the equivalent point by  λ~-λ. Thus, the problem of labeling all lattice points has been simplified to the problem of labeling only the lattice points within the SuperVoronoi set of the origin. These labels are obtained from the set of labels corresponding to the origin  εO.

The rule for assigning labels within εOto points within  SsvOis the parallelogram law. According to the parallelogram law if a lattice point λ is labelled using  e=(λ1' ,λ2'). Then, sum of the side distortions, λ-λ1'2+λ-λ2'2is equal to  12λ1'-λ2'2+2λ-λ1'+λ2'22, where λ1'+λ2'2is the midpoint of the label. Thus, in order to minimize the distortions, the algorithm chooses the shortest label that its mid point is as close as possible to the lattice point. The distance between the label and the lattice point is considered as the length of the line connecting the midpoint of the label to the lattice point. The labeling function is performed in the following nine steps;

  1. Generate the coinciding similar sublattices  Λ1' and  Λ2'

  2. Obtain the set of overlapping points  Sov

  3. Obtain the SuperVoronoi set of the origin , Ssv O

  4. Construct the partitions using the defined relation  '

  5. Obtain  εO, the set of labels corresponding to the origin

  6. Assign every lattice point within  Ssv O, to the shortest edges within  εO that its midpoint is as close as possible to  λ

  7. For any point outside the SuperVoronoi set of the origin,  λ~Ssv O, obtain its equivalent point within the SuperVoronoi set of the origin,  λSsv O

  8. Obtain the label of the equivalent point

  9. Translate the label of the equivalent point by  λ~-λ

As the two sublattices can be used interchangeably in the labeling algorithm, there is no preference between the two descriptions. Thus, there is no need to use the alternative transmission technique. As for example, consider the hexagonal lattice with unit fundamental area and the coinciding similar sublattices with index N=7shown in Fig. 5. The first sublattice points are shown with circles and the second sublattice points are shown with squares. The sublattice points within  Ssv O are labeled with numbers. In Fig. 5, the lattice points outside the SuperVoronoi set of the origin are shown with triangles and the overlapping points are shown with circled-squares.

Figure 5.

The labeling scheme for index N=7. The lattice, the first sublattice, and the second sublattice are shown with triangles, circles, and squares.

In general, for the sublattices of the hexagonal lattice with index N, there are N2+N-1points within the SuperVoronoi set of the origin. There are N2possible labels within  εOthat are obtained by the Cartesian product of Λ1'SsvO and  Λ2'SsvO. According to the new equivalence relation defined in Eq. (23), the partitions are constructed by dividing the x-coordinate of the lattice points modulo N/2and the y-coordinate of the lattice points modulo  N3/2. Thus, the lattice points are divided into N2 partitions. The partitions are divided into N2 single-point partitions and N-1double-point partitions. In case of index N=13, there are 181 lattice points inside the SuperVoronoi set of the origin. These points are also partitioned into N2=169single-point partitions and 12 double-point partitions. In case of the coinciding sublattice with index N=7 there are 55 lattice points within the SuperVoronoi set of the origin that are shown with crosses in Fig. 5. The points within the SuperVoronoi sets of the origin are divided into 49 partitions, including 43 single-point partitions and 6 double-point partitions. The double-point partitions are shown with capital letters in Fig. 5. The same partitioning scheme is repeated for every overlapping point due to the inherent symmetry of the coinciding similar sublattices. The constructed partitions are listed in Table 2.

λ( λ x , λ y )m x ,   m yλ( λ x , λ y )m x ,   m yλ( λ x , λ y )m x ,   m y
a(1,0)[1,0]n(1,3)[1, 3]AO-AE(2.5,1.53)[2.5,1.53]
b(0.5,0.53)[0.5,0.53]o(-1,3)[2.5, 3]AP-AF(-1,23)[2.5,23]
c(-0.5,0.53)[3.0,0.53]p(-2,0)[1.5,0]AK-AG(-3.5,0.53)[0,0.53]
d(-1,0)[2.5,0]q(-1,-3)[2.5,2.53]AL-AH(-2.5,-1.53)[1,23]
e(-0.5,-0.53)[3.0,33]r(1,-3)[1.0,2.53]AM-AI(1,-23)[1,1.53]
f(0.5,-0.53)[0.5,33]s(3,0)[3.0,0]AN-AJ(3.5,-0.53)[0,33]
g(1.5,0.53)[1.5,0.53]t(1.5,1.53)[1.5,1.53]aa(-3,-3)[0.5,2.53]
h(0,3)[0, 3]u(-1.5,1.53)[2.0,1.53]ab(0,-23)[0.0,1.53]
i(-1.5,0.53)[2,0.53]v(-3,0)[0.5,0]ac(3,-3)[3.0,2.53]
j(-1.5,-0.53)[2,33]w(-1.5,-1.53)[2.0,23]ad(3,3)[3.0,3]
k(0,-3)[0.0,2.53]x(1.5,-1.53)[1.5,23]6(2,-3)[2.0,2.53]
l(1.5,-0.53)[1.5,33]y(0,23)[0.0,23]7(2,3)[2.0,3]
m(2,0)[2.0,0]z(-3,3)[0.5, 3]8(-0.5,1.53)[3.0,1.53]
0(0,0)[0,0]3(-2,3)[1.5, 3]9(-2.5,0.53)[1,0.53]
1(2.5,0.53)[2.5,0.53]4(-2.5,-0.53)[1.0,33]11(0.5,-1.53)[0.5,23]
2(0.5,1.53)[0.5,1.53]5(-0.5,-1.53)[3.0,23]12(2.5,-0.53)[2.5,33]

Table 2.

The SuperVoronoi set of the origin and their partitions for N=7.

The points inside the double-point partitions are located on the boundaries of the SuperVoronoi regions of the overlapping sublattice points. The points residing on the borders are always equidistant from the two adjacent overlapping points. Thus, it is possible to label them according to two overlapping points. However the labeling function must be injective. Therefore, a simple rule is invented to the labeling function in order to avoid confusion of having double-point partitions. The invented rule must be to label the equidistant points using either the first overlapping point or the second overlapping point. In the current labeling function these points are labelled using the overlapping point with smaller x-coordinate.

Consider the points AG and AK in Fig. 7 that form a double-point partition; since xAGxAK mod N/2and  yAGyAK (mod N3/2). According to the new rule, AK is labelled due  λov=(0,0)and the label is  10, 0,90, 0. In contrast, AG is labelled due  λov=-7,0 and its label is  1-7, 0,9-7, 0. This rule is repeated for other double-point partitions. However, every single-point partition within the SuperVoronoi sets of the origin is labelled using the shortest edge within εOwhich its midpoint is as close as possible to the lattice point. The complete list of the labels is given in Table 3.

λeλeλeλeλeλeλe
O<0,0"/>f<1,10"/>l<1,11"/>r<5,12"/>y<2,8"/>6<6,0"/>AN-AJ<4,12"/>
a<2,11"/>g<2,12"/>m<6,7"/>s<1,12"/>z<3,9"/>7<0,7"/>AO-AE<5,7"/>
b<3,12"/>h<3,7"/>n<1,8"/>t<2,7"/>1<1,0"/>8<0,8"/>AL-AH<2,10"/>
c<4,7"/>i<4,8"/>o<2,9"/>u<3,8"/>2<2,0"/>9<0,9"/>AM-AI<3,11"/>
d<5,8"/>j<5,9"/>p<3,10"/>v<4,9"/>3<3,0"/>10<0,10"/>AP-AF<6,8"/>
e<6,9"/>k<6,10"/>q<4,11"/>x<6,11"/>4<4,0"/>11<0,11"/>AK-AG<1,9"/>
aa<4,10"/>ab<5,11"/>ac<6,12"/>ad<1,7"/>5<5,0"/>12<0,12"/>

Table 3.

Points inside SuperVoronoi set of the origin and their labels.

3.5. Inverse wavelet transforms

In this step the side descriptions are inverse wavelet transformed and the two different video streams are produced. The inverse wavelet module uses the biorthogonal inverse lifting (CDF) 5/3 wavelet transform. The experimental results related to the application of MDCLVQ-A2 to MD video coding are presented in section 4.

3.6. Video encoders/decoders

In the encoder modules, the input side video streams are encoded by H.264/AVC or Motion JPEG2000 encoders. In case the H.264/AVC is selected as the video encoding standard, then the JM reference encoder is used. On the other hand, the Motion JPEG 2000 is another choice of the video encoding module that is done using the software provided by OpenJPEG library. The bit streams are sent over the channels. In the receiver side, the bit streams are decoded using the corresponding decoders. The decoding in this step is almost lossless.

3.7. The decoders in the receiver side

There are two different decoding processes in this scheme. One is the lossy decoding that is done in the side decoders and central decoder. Another is the lossless decoding that is done in H.264/AVC (Motion JPEG2000) decoder. The latter is the reverse of the encoding process that is done in H.264/AVC (Motion JPEG2000) encoder and the former is the reverse of the labeling function.

The received video data streams are decoded by H.264/AVC (Motion JPEG2000) then sent to the central decoder and the side decoders. There are an 2m-1 of decoders in an m-channel MD coding schemes, that is, a central decoder and 2mside decoders (Ostergaard, 2007). In the MDCLVQ, a two channel MD coding scheme is used. Thus, there are three decoders: the central decoder, side decoder 1, and side decoder 2. At the receiver, if both descriptions are received correctly, then the central decoder is used to map the labels back to the original lattice points. The side decoders are used to find an approximation of the original lattice point if only one single description is received.

The central decoder is the reverse of the labeling process described above. When the two descriptions are received correctly, they are sent to the central decoder to be mapped back to the corresponding lattice point. Assume that the received label is  e'=λ1',λ2'. In order to decode  e'the following three steps are performed:

  1. Obtain  the equivalent label of  e'within  εO. Denote it as  e''=λ1'',λ2''. Two labels are said to be equivalent if their vertices are equivalent points; i.e.  λ1''λ1' and  λ2''λ2'.

  2. Obtain the lattice point   λSsv Ocorresponding to the label  e''.

  3. Translate λby  e''- e'to find the original lattice point.

In this scheme, if only the description from channel iis received while the other channel is corrupted, then λi' is used in the side decoder ito find the approximate of the original lattice point using the following steps:

  1. Obtain the equivalent sublattice point of λi'within SsvO and denote it as  λe~.

  2. Obtain the subset of labels that use λe~ as one of their coordinates by:

 εpλe~=e1',e2'ε0,e1'=λe~  e2'=λe~E26
  1. Obtain the midpoint of the lattice points within SsvO that their labels are members of  εpλe~. It is denoted as  λ~m.

  2. Obtain the approximate point  λ~ by translating λ~mby  λi'-λe~.

In other words, the equivalent label for the corrupted data is one of the labels within  εpλe~. There is no way to determine the original label as well as the original lattice point. Thus, it is required to obtain the approximate point by finding the weighted midpoint of the lattice points corresponding to the labels residing within  εpλe~. In order to find the approximate point, the weighted midpoint is translated b y λi'-λe~, using the shift property.

As for example, if the received descriptions are given as λ1'=9.5 , 0.866 and  λ2'=(9 , 1.7321). Since 9.5 is congruent to 2.5 (mod 7/2) and 0.866 is congruent to 0.53 (mod 73/2) . Therefore, λ1'belongs to the partition  [2.5, 0.53]. Similarly, it is possible to show that λ2'belongs to the partition  [2, 3]. Thus, the label e'=λ1',λ2' is equivalent to the label  e''=1,7according to Table 4. Both labels are shown with black arrows in Fig. 6. The necessary translation to shift e'to e''would be  t= e'- e''=(-7,0), according to the shift property. It is shown with a red arrow.

In the next step, the lattice point corresponding to  e'' is obtained from Table ‎3, that is  ad=(3,3) . The necessary shift to translate adto the original lattice point is also  t=(7,0). Therefore, the original lattice point is (10, 1.734). The decoding procedure is depicted in Fig. 6. However, assume that one of the descriptions is lost and therefore only one description is received correctly. As the labeling function is balanced, it does not matter which label is received. In case λ1'is received, it is processed by the side decoder 1 to find the approximate point. As showed before λ1'belongs to the partition  [2.5, 0.53]and its equivalent sublattice point that is residing within SsvOis given by  λe~=1.

Figure 6.

The decoding procedure performed by central decoder.

According to Table ‎4-2, the lost description may be within   εp1={0, 10,11,8,12,7,9}. Thus, the approximate equivalent point  λ~mwill be the midpoint of the lattice points within  {1, f,l,n,s,ad,AK-AG}. In order to find the approximate of the original point  λ~m is translated by  λ1'-λe~which obtains as result the point (9.8235, 0.7121). On the other hand, if λ2'is received then the side decoder 2 is used to process it. The coinciding sublattice points are used either as the first or the second vertex of the label. Therefore, at the receiver it is possible to determine the channel to which the received description belongs.

4. Experimental results for MDCLVQ-A2 applied to video coding

In this section, the experimental results of the MDCLVQ-A2 video coding schemes are presented. The proposed MDCLVQ-A2 is applied to H.264/AVC and Motion JPEG2000 schemes. Section 4.1 presents the experimental results for MDCLVQ-H.264/AVC and the results for MDCLVQ-Motion JPEG2000 are presented in section 4.2.

4.1. MDCLVQ-H.264/AVC

The standard videos of “Akiyo”, “Coastguard”, “Carphone”, and “Foreman” with the QCIF format (144×176 pixels) and 7.5 fps are chosen to evaluate the benefits of the proposed MDCLVQ-H.264/AVC scheme over the original single description scheme in terms of bit rate-efficiency and error-resiliency. The default settings of the original JM9.4 reference encoder are used for encoding the original videos and the side videos.

The MDCLVQ scheme is a multivariable system because it is controlled by three factors: the wavelet threshold, the fundamental area of the hexagonal lattice, and the index of the coinciding sub-lattices. The performance of the MD coding scheme is evaluated based on the average required bit rate and reconstruction fidelity in the decoders. The fundamental area of the lattice is changed by multiplication of a factor called sigma to transformation matrix. Increasing the fundamental area increases the required bandwidth (bit rate) for encoding the video and the reconstruction fidelity. On the other hand, increasing the index decreases the required bit rate. In these experiments the labeling function with index N=7 has been used constantly and the wavelet threshold is constantly zero. Then, the value for the bit rate is calculated at the end of every encoding runs. Therefore, the analyses provided below are based on values of sigma.

The fundamental area of the hexagonal lattice is changed by sigma = 0.1, 0.2… 1, while keeping the shape of the hexagonal lattice fixed. Thus, the quantization error is adapted and central distortion, side distortions and the associated bit rates are affected. The required bit rate for encoding both descriptions must be compared with the required bit rate for encoding the single description source. Suppose that a given video source requires R b/s to be encoded by a single encoder. On the other hand, if for the same video source, the first description of the MD coding scheme requires R1 b/s and the second description requires R2 b/s. The bit rate efficiency is defined as

Reff=R-(R1+R2)RE27

The bit rate efficiency Reffis defined so that the performance of the proposed MD coding scheme is compared with the original video encoder in terms of the required bit rate. Thus, if the MD video coding scheme can encode both side videos with total bit rate (R1+R2) ≤ R then the bit rate efficiency Reffwill be positive and the MD coding is outperforming the original encoder. However, if (R1+R2) > R the the bit rate efficiency Reff will be negative and it indicates that the compression efficiency has been sacrificed for gaining error resiliency ( (Goyal, 2001). Therefore, the bit rates required by the proposed MDCLVQ-H.264/AVC scheme for encoding the two descriptions are provided in Table 4. The bit rates required to encode the standard videos by the JM9.4 reference encoder and the bit rates efficiencies calculated using the Eq. (27) are also provided in Table 4.

According to Table 4, the bit rate efficiencies of the proposed MDCLVQ-H.264/AVC scheme corresponding to sigma= 0.1 to 0.7 for all the videos are positive; therefore, error resiliency and compression efficiency are both obtained and with the sigma lower than 0.7. However, for sigma 0.8 to 1 the compression efficiency is not achieved. As an example, the MDCLVQ-H.264/AVC with sigma = 0.1, in average shows 0.89 bit rate efficiencies for all the videos, which is an improvement as compared with the original encoder. However, for the values of sigma from 0.8 to 1, the proposed scheme requires more bit rates than the original H.264/AVC encoder, therefore, the compression efficiency has been traded off for error resiliency. The error resiliency of the proposed MCLVQ-H.264/AVC scheme is due to its MD coding nature, that is, it can provide a degraded version of the source if only one of the channels works, while a better reconstruction quality is achieved when both channels work and both descriptions arrive at the destination.

VideoSigma0.10.20.30.40.50.60.70.80.91
AkiyoR10.550.570.630.881.772.132.343.084.005.17
R20.550.570.640.891.281.772.423.174.165.45
Total1.101.141.271.773.053.904.766.258.1610.62
R5.005.005.005.005.005.005.005.005.005.00
Reff0.780.770.750.650.390.220.05-0.25-0.63-1.12
CoastguardR10.560.671.402.925.0210.4813.5420.5129.4140.01
R20.550.681.402.905.148.7314.3321.8631.4242.49
Total1.111.352.805.8210.1619.2127.8742.3760.8382.50
R38.5538.5538.5538.5538.5538.5538.5538.5538.5538.55
Reff0.970.960.930.850.740.500.28-0.10-0.58-1.14
CarphoneR10.560.751.623.255.419.3112.7417.9424.2131.39
R20.550.761.613.305.498.6513.0818.4625.0032.27
Total1.111.513.236.5510.9017.9625.8236.4049.2163.66
R30.6630.6630.6630.6630.6630.6630.6630.6630.6630.66
Reff0.960.950.890.790.640.410.16-0.19-0.61-1.08
ForemanR10.561.032.223.876.0410.6512.0415.9020.6726.09
R20.561.052.243.956.128.8812.2916.5021.3727.19
Total1.122.084.467.8212.1619.5324.3332.4042.0453.28
R25.3925.3925.3925.3925.3925.3925.3925.3925.3925.39
Reff0.960.920.820.690.520.230.04-0.28-0.66-1.10

Table 4.

Average bit rates (kb/s) for MDCLVQ-H.264/AVC. The corresponding bit rate efficiencies are also provided.

Although the bit rate efficiency is important for MD video coding, the quality of the reconstructed videos in the central and side decoders are also important. In other words, compression efficiency is not acceptable if it comes with significant drop in the quality of the video. A common measure to evaluate the quality of the video encoding is to compare the reconstructed video with the original video through the average peak signal to noise ratio (PSNR) of the luminance (Y) components. In the following results, the PSNR is measured for the luminance component.

The performance of the MDCLVQ-H.264/AVC scheme in terms of the required bit rate and the reconstruction fidelity is controlled by the fundamental area of the hexagonal lattice. The fundamental area of the hexagonal lattice is changed by sigma. Therefore, in Table 5 the average PSNR values of the reconstructed videos in the central decoder as well as the side decoders for the “Akiyo”, “Coastguard”, “Carphone”, and “Foreman” video sequences are provided in Table ‎5 for different values of the sigma.

Video\sigma0.10.20.30.40.50.60.70.80.91
AkiyoSide 116.3221.9225.1228.1230.6932.8434.7936.3737.7337.48
Side 213.6222.4225.2428.1230.7032.8834.8936.3537.6138.28
Central17.4222.6225.6528.6031.3833.8135.6937.2738.7038.19
CoastguardSide 115.0319.0622.3224.8627.3629.4031.5333.1834.4735.22
Side 213.6419.2322.3224.9227.4029.6331.5933.1734.3435.00
Central16.2919.4722.2225.3327.9730.4032.5334.3235.6536.26
CarphoneSide 115.3919.4023.4726.3629.0931.4833.6835.3536.7637.68
Side 213.5119.8323.4626.4229.1931.5833.7235.3536.6837.50
Central15.4720.1123.9826.9029.8032.4134.7036.5037.9838.78
ForemanSide 114.4619.0923.4125.9728.7030.9433.0834.8236.1536.96
Side 212.6719.3123.3026.0228.7131.0433.0834.8036.0536.82
Central14.5819.6723.8226.4629.3831.9734.0735.9637.3538.05
Miss AmericaSide 116.5323.5828.0731.5734.3936.7538.7239.9840.7241.15
Side 215.4223.6528.0031.6834.1936.7438.6039.9740.6340.97
Central17.1823.9729.0232.1835.2137.6939.6040.8841.5641.74

Table 5.

The average PSNR (dB) of the central and the side decoders for the selected video sequences encoded by the MDCLVQ-H.264/AVC.

As shown in Table 5, the qualities of the reconstructed videos are directly proportional to the value of sigma. In other words, the PSNR of the central decoder and the PSNR of the side decoders improve as the sigma is increased. On the other hand, according to Table ‎4, the bit rate efficiencies decrease as the sigma is increased. Then, the quality of the reconstructed video is reversely related to the bit rate efficiency. Thus, as the value of sigma increases the compression efficiency is traded off for better reconstruction quality. As for example, the MDCLVQ-H.264/AVC shows the best reconstruction quality for the “Foreman” video when the sigma is 1; PSNR = 36.96 in the side 1 decoder, PSNR = 36.82 in the side 2 decoder, and PSNR = 38.05 in the central decoder. However the MDCLVQ-H.264/AVC shows the best bit rate efficiency (Reff=0.96)for the “Foreman” video when the sigma is 0.1.

In addition, according to Table 5 the PSNR values for the central decoder and the side decoders are not significantly improved as sigma is increased from 0.7 to 1. This implies that the best performance of the MDCLVQ-H.264/AVC in terms of the bit rate efficiency and reconstruction quality for the “Foreman” video is seen with sigma = 0.7. As an example, for the “Foreman” video with sigma = 0.7 and with almost the same bit rate as the original encoder (Reff=0.04)the MDCLVQ-H.264/AVC provides PSNR = 34.07 dB in the central decoder, PSNR = 33.08 dB in the side 1 decoder, and PSNR = 33.08 dB in the side 2 decoder.

According to Table 5, the reconstruction performance of the proposed MDCLVQ-H.264/AVC scheme in terms of the central PSNR and side PSNR do not significantly increase while the sigma is increased from 0.7 to 1. In order to demonstrate the visual quality of the reconstructed videos, the first frames of the central videos of “Foreman” sequence that have been encoded by the MDCLVQ-H.264/AVC with sigma = 0.1 to 1 are provided in Fig. 7.

According to Fig. 7, the quality of the reconstructed videos of the “Foreman” increase as the value of sigma is increased. The reconstruction qualities for the “Foreman” videos corresponding to sigma values lower than 0.3 are very low (PSNR = 14.58, PSNR = 19.67dB, and PSNR = 23.82dB) while the reconstruction quality of the videos corresponding to sigma higher than 0.4 are acceptable. For example the reconstructed video corresponding to sigma = 0.7 has central PSNR = 34.07dB.

However the difference between the PSNR corresponding to sigma = 0.7 and sigma = 1 is negligible. In other words, the extra bit rate that is required for higher values of sigma is not worth of the gained reconstruction fidelity, thus the best performance of the MDCLVQ-H.264/AVC for the “Foreman” video is achieved with sigma = 0.7. In other words, using sigma = 0.7 the proposed scheme can offer acceptable side and central reconstruction qualities as well as error resiliency without requiring extra bit rate.

Figure 7.

The 1st frames of the QCIF “Foreman” sequence reconstructed by the central decoder corresponding to sigma = 0.1 to 1.

4.2. MDCLVQ-motion JPEG2000

The proposed MDCLVQ-Motion JPEG2000 scheme generates two descriptions of the input video that are encoded by the OpenJPEG software to motion jpeg files. In this subsection, the volume of the motion jpeg files generated by encoding the MD sequences are compared with size of the motion jpeg file of the original single description scheme and the encoding efficiencies are calculated similar to bit rate efficiency in Eq. (27). Suppose that volume of the motion jpeg file of a given video source is V (kB). On the other hand, if for the same video source the first description of the MD coding scheme generates a motion jpeg file with V1 (kB) and the second description generates a motion jpeg V2 (kB). Then, the compresssion efficiency is defined as

Ceff=V-(V1+V2)VE28

The compresssion efficiency Ceffis defined so that the performance of the proposed MD coding scheme is compared with the original video encoder in terms of the encoding efficiency. Thus, if the MD video coding scheme can encode both side videos with total volume of (V1+V2) ≤ V then the compression efficiency Ceffwill be positive and the MD coding is outperforming the original encoder. However, if (V1+V2) > V, then the compresssion efficiency Ceff will be negative and it indicates that the compression efficiency has been traded off for gaining error resiliency

In these experiments, standard video sequences of “Akiyo”, “Coastguard”, “Carphone”, and “Foreman” with the QCIF format (144×176 pixels) are used to evaluate the performance of the proposed MDCLVQ-Motion JPEG2000 encoding scheme. As stated in section 4.1, the performance of the MDCLVQ-A2 and consequently the performance of the MDCLVQ-Motion JPEG2000 scheme in terms of the compression efficiency and reconstruction fidelity are controlled by the fundamental area of the hexagonal lattice. The fundamental area of the lattice is changed by multiplying the sigma to the generator matrix (transformation matrix). The compression efficiencies for the MDCLVQ-Motion JPEG2000 are provided in Table 6. According to Table 6, the volume of the motion jpeg files generated by the MDCLVQ-Motion JPEG2000 scheme increase as the sigma is increased. Thus, as the sigma increases the encoding efficiency is decreased.

This is similar to the results for the MDCLVQ-H.264/AVC because the encoding (compression) efficiency is due to the coarse degree of the quantization which is controlled by the sigma. In general, the compression efficiencies in Table 4 are lower than corresponding values in Table 5 because H.264/AVC standard has higher encoding (compression) efficiency than Motion JPEG2000.

Although the compression efficiency is important for MD video coding, the quality of the reconstructed videos in the central and side decoders are also important. In other words, compression efficiency is not acceptable if it comes with significant drop in the quality of the video. The performance of the MDCLVQ-Motion JPEG2000 scheme in terms of the compression efficiency and the reconstruction quality is affected by the value of the sigma. The average PSNR values of the reconstructed videos in the central decoder as well as the side decoders for the “Akiyo”, “Coastguard”, “Carphone”, and “Foreman” video sequences are provided in Table 7.

VideoSigma0.10.20.30.40.50.60.70.80.91
AkiyoV1932185525443024359440804340473850615378
V2895187625463134371741144463484051365446
Total182737315090615873118194880395781019710824
V4536453645364536453645364536453645364536
Veff0.600.18-0.12-0.36-0.61-0.81-0.94-1.11-1.25-1.39
CoastguardV11367235231613667423448515138554159126230
V21245236431593786436848385298568860496358
Total26124716632074538602968910436112291196112588
V5797579757975797579757975797579757975797
Veff0.550.19-0.09-0.29-0.48-0.67-0.80-0.94-1.06-1.17
CarphoneV11279254136374269499956606079661370557460
V21193253836304381513656956206670771437518
Total2472507972678650101351135512285133201419814978
V6699669966996699669966996699669966996699
Veff0.630.24-0.08-0.29-0.51-0.70-0.83-0.99-1.12-1.24
ForemanV11080224030443513406746454922534656996012
V21030224330173630419346395055544157956094
Total2110448360617143826092849977107871149412106
V5475547554755475547554755475547554755475
Veff0.610.18-0.11-0.30-0.51-0.70-0.82-0.97-1.10-1.21
Miss AmericaV136885111911403168319102058226124222570
V236085211791463175519392106228824562594
Total728170323702866343838494164454948785164
V1308130813081308130813081308130813081308
V e f f0.44-0.30-0.81-1.19-1.63-1.94-2.18-2.48-2.73-2.95

Table 6.

Volume of the motion jpeg files (kB) and the encoding efficiency of the MDCLVQ-Motion JPEG2000.

The PSNR values of the central decoder and the side decoders of the MDCLVQ-Motion JPEG2000 provided in Table 7 are higher than the corresponding values of the MDCLVQ-H.264/AVC provided in Table 5. This is because of better reconstruction quality of the Motion JPEG2000 as compared with the H.264/AVC. However, the general pattern is the same and all the PSNR values increase as the sigma is increased because reconstruction fidelity is affected by the performance of the MDCLVQ-A2. On the other hand, the compression efficiencies of MDCLVQ-H.264/AVC are in general better than MDCLVQ-Motion JPEG2000 because of higher compression efficiency of the H.264/AVC.

The proposed MD coding schemes require low computational capabilities, due to the inherent symmetries of the coinciding similar sublattices of the hexagonal lattice. The labeling function and the partitioning scheme presented in section 3.4 are not computation intensive processes. The partitions are generated using the simple proposed two-fold real congruency relation. The proposed shift property simplifies the problem of labeling the entire lattice space to the problem of labeling the SuperVoronoi region of the origin. The SuperVoronoi region of the origin with index N includes N2+N-1 lattice points and N2 sublattice points. Thus, the labeling function is simplified to a table look up with N2+N-1 entries. In addition, the proposed labeling function, the partitioning scheme, and the shift property are scalable and as a consequence using the sigma the performance of the entire scheme is adjustable, as shown in section 4.

Video\sigma0.10.20.30.40.50.60.70.80.91
AkiyoSide 115.9525.5129.4532.7534.0033.9935.5636.0736.3639.94
Side 213.6825.3829.2232.3433.2534.4435.0835.6836.1036.29
Central18.8929.3333.1335.5536.3636.5537.1137.3237.4542.38
Coast-guardSide 114.9022.8527.5331.2133.2237.4738.4538.4539.4140.38
Side 212.9923.2527.6230.5032.6434.5435.5236.7337.8038.85
Central18.7027.8032.6035.6038.2537.8139.9340.7241.3642.03
CarphoneSide 116.2724.7629.8933.4435.6835.6238.9639.9041.0441.94
Side 212.8325.0029.6932.7534.6436.4437.6638.8139.9340.98
Central18.2728.6534.5537.9241.0541.1243.3644.1145.0646.15
ForemanSide 116.8124.4028.7532.5434.8634.1638.3239.4040.5041.50
Side 211.2724.6928.8531.8133.6335.5736.7238.0139.1740.19
Central16.1928.8633.6936.9939.7539.4541.9642.8543.7444.64
Miss AmericaSide 116.3426.0731.3633.8235.5235.9537.6038.0838.5938.95
Side 215.8426.0730.9833.5334.8336.1636.9237.6438.1438.68
Central19.8429.4235.1337.6139.0939.4439.8740.1040.2840.57

Table 7.

The average PSNR (dB) of the central and the side decoders for the selected video sequences encoded by the MDCLVQ-Motion JPEG2000.

5. Conclusion

The main objective of this research is to design and implement a generic MD video coding scheme that can be adopted by different kinds of video coding standards. In other words, the proposed scheme does not rely on specific characteristics of the video coding standard being used and does not make any assumption about the state of the channel or probability distribution of errors. In this way, the proposed scheme is different from the MD video coding schemes presented in (Yongdong & Deng, 2003), (Franchi et al., 2005) (Zandoná et al., 2005). Thus, it can effectively increase the performance of the encoding system in terms of compression efficiency and reliability, regardless of the encoding scheme being used. In addition, the proposed scheme does not require significant hardware or software changes; therefore, it can be adopted for available hardware and software designs. Although running parallel H.264/AVC or Motion JPEG2000 encoders requires more computational resources than the original encoder, the total computation time of the MD encoders is less than that of the original because the entropies of the side videos are significantly decreased by the MDCLVQ, and thus, the time required by of the arithmetic coding is decreased.

6. Future research

As for future direction, it would be interesting to develop the MD client-server video encoding and decoding that are able to stream the real time videos on networks. The developed encoder module can be run on the server side and the decoder module is on the client side. The future MD client-server video codec must be adaptive or able to make the best use of the available bandwidth to meet its requirements. In other words, the user must be able to increase or decrease the quality of reconstruction with regard to the available bandwidth. Finally, it is suggested that the MDCLVQ schemes to be integrated with the H.264/AVC or Motion JPEG 2000 video encoding standards in a way that the built-in quantization and encoding procedures are replaced by the MD scheme in order to increase the performance of the MD coding. Thus the performances of the MD video coding schemes are optimized.

© 2013 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Ehsan Akhtarkavan and M. F. M. Salleh (January 9th 2013). Multiple Descriptions Coinciding Lattice Vector Quantizer for H. 264/AVC and Motion JPEG2000, Advanced Video Coding for Next-Generation Multimedia Services, Yo-Sung Ho, IntechOpen, DOI: 10.5772/54296. Available from:

chapter statistics

1341total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Advanced Video Coding for Next-Generation Multimedia Services

Edited by Yo-Sung Ho

Next chapter

Region of Interest Coding for Aerial Video Sequences Using Landscape Models

By Holger Meuel, Julia Schmidt, Marco Munderloh and Jörn Ostermann

Related Book

First chapter

A Survey of Image Segmentation by the Classical Method and Resonance Algorithm

By Fengzhi Dai, Masanori Sugisaka and Baolong Zhang

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More about us