Comparison of “Spread-Quantization” Video Watermarking Techniques for Copyright Protection in the Spatial and Transform Domain

In spite of the existence of watermarking techniques for all kinds of digital data, most of the literature addresses the watermarking of still images for copyright protection, and only some of this work has been extended to video watermarking. Video watermarking is distinct from image watermarking because more data is available to both the attacker and the watermarker. This additional data volume allows the payload to be embedded more redundantly and reliably.


Introduction
In spite of the existence of watermarking techniques for all kinds of digital data, most of the literature addresses the watermarking of still images for copyright protection, and only some of this work has been extended to video watermarking. Video watermarking is distinct from image watermarking because more data is available to both the attacker and the watermarker. This additional data volume allows the payload to be embedded more redundantly and reliably.
Video watermarking schemes are characterized by the domain in which the watermark is embedded or detected, their capacity, the perceptual quality of the watermarked videos and their robustness to particular types of attacks. They can be divided into three main groups according to the domain in which the watermark is embedded: spatial domain, frequency domain and compressed domain watermarking. An overview of video watermarking techniques can be found in (Gwenael & Dugelay, 2003).
The spatial domain algorithms embed the watermark directly into the pixel values, and no transforms are applied to the host signal during the embedding process. The most common techniques for inserting the watermark into the host data in the spatial domain are Least Significant Bit (LSB) modification, Spread Spectrum Modulation and Quantization Index Modulation.
The easiest way to embed a watermark in the spatial domain is the LSB method. If each pixel in an image is represented by an 8-bit value, the image/frame can be sliced up into 8 bit planes. The least significant bit plane does not contain visually significant information and can easily be replaced by the watermark bits. There are also more sophisticated algorithms that make use of LSB modification (Kinoshita, 1996). These techniques are not very robust to attacks, because the LSB plane can easily be replaced by random bits, which removes the watermark.
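The LSB idea can be illustrated with a minimal sketch. It is not taken from the cited work; the function names and the sample pixel values are hypothetical, and a frame is simplified to a flat list of 8-bit values.

```python
def embed_lsb(pixels, bits):
    """Replace the least significant bit of each 8-bit pixel with a watermark bit."""
    return [(p & 0xFE) | b for p, b in zip(pixels, bits)]


def extract_lsb(pixels):
    """Read the least significant bit plane back."""
    return [p & 1 for p in pixels]


def clear_lsb(pixels):
    """Trivial attack: overwrite the LSB plane (here with zeros)."""
    return [p & 0xFE for p in pixels]


# Hypothetical sample data: eight pixels and eight watermark bits.
pixels = [12, 200, 77, 255, 0, 34, 91, 128]
bits = [1, 0, 1, 1, 0, 0, 1, 0]

marked = embed_lsb(pixels, bits)
recovered = extract_lsb(marked)
after_attack = extract_lsb(clear_lsb(marked))
```

Each pixel changes by at most one gray level, which is why the mark is invisible, and overwriting the same bit plane erases it completely, which is the weakness described above.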
Spread spectrum watermarking views watermarking as a problem of communication through a noisy channel. As a means of combating this noise or interference, spread-spectrum techniques are employed to allow reliable communication in such noisy environments. In this case, the watermark data is coded with a pseudorandom code.
Image and video watermarking in the Discrete Cosine Transform (DCT) domain is very popular, because the DCT is still the most popular domain for digital image processing. The DCT allows an image to be broken up into different frequency sub-bands, making it much easier to embed watermarking information into the middle frequency sub-bands of an image or video frame. One of the first DCT-based algorithms, upon which many variations have been based, is presented in (Cox et al., 1997). The watermark is a normally distributed sequence of real numbers added to the full-frame DCT of each video frame. More advanced techniques were also proposed in (Suhail & Obaidat, 2003), (Liu, L. et al., 2005), (Yang et al., 2008). The choice of the DCT coefficients for watermark embedding is a compromise between the quality degradation of the image/frame (the frequency of the coefficients should be high) and the resilience of the watermarking scheme to attacks (the frequency of the coefficients should be low).
The rest of this chapter is organized as follows: Section 2 describes the three proposed video watermarking techniques, providing detailed diagrams and descriptions of the watermark embedding and extraction strategies. Section 3 contains the experimental results and a detailed comparison of the proposed methods in terms of perceptual quality and robustness to different attacks. Finally, Section 4 presents the conclusions of our work and possible future research.

Proposed "Spread-Quantization" video watermarking techniques
This section presents our watermarking schemes in the spatial, Discrete Cosine Transform (DCT) and wavelet domains. First we summarize some common properties of the proposed algorithms; then, in Subsections 2.1 to 2.3, the detailed embedding and extraction schemes are presented for every method.
The proposed watermarking techniques are a combination of spread-spectrum and quantization based watermarking. That is why we call them "spread-quantization" techniques.
Our methods embed the watermark into the luminance values of the pixels or into some selected coefficients in a transform domain. All algorithms therefore first convert the RGB (Red, Green, Blue) color space into the YCbCr (ITU-R BT.601) color space, as shown in Equation (1).

To improve the resilience of the proposed algorithms to attacks, two protection mechanisms are used:
1. The watermark is coded using a low complexity error correction code (m,n), where n is the data-word length and m is the codeword length. Using the error correction code, the useful size of the watermark is m/n times smaller than when no error correction code is used.
2. The same watermark is redundantly embedded in k frames. Thus, the useful size of the watermark is k times smaller, but the resilience to attacks is improved. At the watermark decoder, after the watermark sequence $w'_i$ of size P' bits has been extracted from each of the k frames, a bit $w'(j)$ of the useful watermark is computed using Equation (3).
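Equation (3) is not reproduced here, so the following is only a sketch of one plausible way to combine the k extracted copies: a bitwise majority vote. The voting rule and the name `majority_vote` are assumptions, not the authors' stated formula.

```python
def majority_vote(copies):
    """Combine k extracted copies of the same bit sequence into one sequence.

    copies: list of k equal-length lists of 0/1 values, one list per frame.
    ASSUMPTION: a bit is decoded as 1 when more than half of the copies say 1
    (for even k, a tie is decoded as 0).
    """
    k = len(copies)
    return [1 if 2 * sum(column) > k else 0 for column in zip(*copies)]


# Hypothetical example with k = 3 frames carrying the same 5 bits.
copies = [
    [1, 0, 1, 1, 0],  # frame 1, error free
    [1, 1, 1, 0, 0],  # frame 2, two bit errors
    [0, 0, 1, 1, 0],  # frame 3, one bit error
]
decoded = majority_vote(copies)
```

In this example every error is outvoted because no bit position is wrong in more than one of the three copies, which illustrates why redundancy trades payload size for resilience.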
 

Video watermarking scheme in the spatial domain
The watermark embedding process, illustrated in Fig. 1, is described in the following steps:
1. The original video is partitioned into groups of k frames.
2. Every frame of the group is converted to the YCbCr format as in Equation (1).
3. The binary image matrix is transformed into a binary row vector w of size $P = h \times v$.
4. To protect the watermark against bit errors, a Hamming error correction code (m,n) with a codeword length of m bits and a data-word length of n bits is applied to the vector w. The size of the resulting watermark vector $w_c$ is $P' = \frac{m}{n} \cdot P$. The binary sequence $w_c$ is partitioned into $F/k$ sequences $w_c(j)$ of size $k \cdot P'/F$, where F is the number of frames of the video and k is the number of redundant frames. The dimensions h and v of the watermark are chosen so that $k \cdot P'/F$ is an integer. The same sequence $w_c(j)$ is inserted into every frame of group j of k frames.
5. The size l of a square block of $l \times l$ luminance values is calculated to embed a bit of the watermark, using the integer part operator [.].
6. A spread-spectrum technique is used to spread the power spectrum of the watermark data, thus increasing its robustness against attacks. First, a binary pseudo-random sequence $S = \{s_r \mid s_r \in \{0,1\},\ r = 1, \dots, l^2\}$ of size $l^2$ with an equal number of zeros and ones is generated using the Mersenne-Twister algorithm proposed in (Matsumoto & Nishimura, 1998), with the last 64 bits of the secret key K used as the seed for the generator. This method generates numbers with a period of $(2^{19937} - 1)/2$.
7. For every bit of the watermark $w_c(j)$, the corresponding spread-spectrum sequence is computed from S.
8. A sequence S (representing one bit of the original watermark) is embedded in every block of $l \times l$ luminance values.
9. A bit of S is embedded into the luminance value of the pixel of the same index by rounding its value to an even or odd quantization level. Rounding to an even quantization level embeds a "0", while rounding to an odd quantization level embeds a "1", as shown in Equation (6), where $L(i,j)$ is the original luminance value, $L_w(i,j)$ is the watermarked luminance value, q is the quantization step size and sign() is the sign function.
10. The video is converted back to the RGB format using Equation (2), obtaining the watermarked video.

Table 1. Example of embedding a watermark bit into a block of 4x4 luminance pixels
The choice of the quantization step q is a tradeoff between the perceptual quality of the watermarked video (q must have a small value) and the resilience of the watermarking scheme to attacks (q must have a large value). An example of embedding a watermark bit into a block of 4x4 pixels is given in Table 1.
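The role of q can be made concrete with a small sketch of even/odd quantization. The exact form of the embedding equation is not reproduced here, so the rule below (round to the nearest level, then step to the neighbouring level on the side of the original value if the parity is wrong) is an assumption, as are the names `embed_bit` and `extract_bit`. Clipping to the 8-bit range is omitted.

```python
def embed_bit(value, bit, q):
    """Move value to a nearby multiple of q whose level index has parity bit."""
    level = round(value / q)
    if level % 2 != bit:
        # Wrong parity: step to the neighbouring level on the side of value.
        level += 1 if value >= level * q else -1
    return level * q


def extract_bit(value, q):
    """The embedded bit is the parity of the nearest quantization level."""
    return round(value / q) % 2


# Worst-case change of an 8-bit luminance value for each step size.
worst = {}
for q in (2, 4, 8):
    worst[q] = max(
        abs(embed_bit(v, b, q) - v) for v in range(256) for b in (0, 1)
    )

# With q = 8, an additive disturbance of up to 3 gray levels is tolerated.
marked = embed_bit(130, 1, 8)
survives = all(extract_bit(marked + n, 8) == 1 for n in range(-3, 4))
```

Under this rule the distortion per pixel is bounded by q, while a disturbance smaller than q/2 leaves the extracted bit unchanged, which is the tradeoff stated above.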
The watermark extraction process, shown in Fig. 2, consists of the following steps:
1. The watermarked video is partitioned into groups of k frames.
2. Every frame of the group is converted to the YCbCr format using Equation (1).
3. Every luminance frame is partitioned into square blocks of $l \times l$ luminance values.
4. A bit of the spread-spectrum sequence $w'_{ss}$ of size $l^2$ is extracted from every luminance value of a block of size $l \times l$ using Equation (9), where $w'$ is the extracted watermark bit, $L_w(i,j)$ is the luminance value of the pixel at position (i,j), q is the quantization step size and mod 2 is the modulo-2 function.
5. Using the 64-bit seed from the secret key K, the binary sequence S is generated locally.
6. The extracted watermark bit for the corresponding block is computed from the extracted sequence $w'_{ss}$ and S.
7. A binary sequence $w'_{c,i}(j)$ is extracted from every frame of a group of k frames, where $i = 1, \dots, k$. The sequence $w'_c(j)$ is computed from $w'_{c,i}(j)$ using Equation (11).
8. The resulting watermark bitstream $w'_c$ of size P' is error corrected and the watermark $w'$ of size P is obtained.
9. The extracted binary image is obtained by reshaping the vector $w'$ into a matrix of size $h \times v$.
The choice of the quantization step size q is a tradeoff between the perceptual quality of the watermarked video (q should have a small value) and the resilience of the watermarking scheme to attacks (q should have a big value).
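The spatial embedding and extraction steps can be sketched end to end for a single block. Several details are assumptions because the corresponding equations are not reproduced here: the spreading rule (bit 0 is sent as S, bit 1 as the complement of S), the majority decision at the decoder, the seed value and the sample block. Python's `random` module is used because its generator is a Mersenne Twister.

```python
import random


def balanced_sequence(length, seed):
    """Pseudo-random 0/1 sequence with equal numbers of zeros and ones."""
    seq = [0] * (length // 2) + [1] * (length - length // 2)
    random.Random(seed).shuffle(seq)
    return seq


def embed_chip(value, chip, q):
    """Quantize value to a nearby level whose index parity equals chip."""
    level = round(value / q)
    if level % 2 != chip:
        level += 1 if value >= level * q else -1
    return level * q


def embed_block(block, bit, seq, q):
    # ASSUMPTION: bit 0 is spread as S, bit 1 as the complement of S.
    return [embed_chip(v, s ^ bit, q) for v, s in zip(block, seq)]


def extract_block(block, seq, q):
    chips = [round(v / q) % 2 for v in block]
    mismatches = sum(c ^ s for c, s in zip(chips, seq))
    # ASSUMPTION: majority decision over the block.
    return 1 if 2 * mismatches > len(seq) else 0


q, seed = 4, 0x1234  # hypothetical step size and key-derived seed
block = [52, 55, 61, 66, 70, 61, 64, 73, 63, 59, 55, 90, 67, 61, 68, 104]
seq = balanced_sequence(len(block), seed)

marked = embed_block(block, 1, seq, q)
# Simulated attack: three pixels shifted by a full step, the rest by 1.
attacked = [v + q if i < 3 else v + 1 for i, v in enumerate(marked)]
recovered = extract_block(attacked, seq, q)
```

Because one watermark bit is carried by all 16 pixels of the block, the three flipped pixels are outvoted and the bit is still recovered.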

Video watermarking scheme in the Discrete Cosine Transform domain (DCT)
For this method, the watermark is redundantly inserted in the DCT domain. In contrast to the previous method in the spatial domain, this technique works with blocks of 8x8 luminance pixels. Every Y block is transformed into an 8x8 DCT coefficient block. To insert the watermark, only 22 DCT coefficients from every block are used, as shown in Fig. 3, where the white coefficients are ignored and only the gray coefficients are used for redundant watermark embedding. Instead of step 6 of the spatial domain embedding strategy, this algorithm calculates the number b of 8x8 DCT coefficient blocks in which the same watermark bit can be redundantly embedded, as shown in Equation (12):

$$b = \left[ \frac{1}{64} \cdot \frac{M \cdot N \cdot F}{k \cdot P'} \right] \qquad (12)$$

where M x N is the resolution of the video, F is the number of frames of the video, k is the number of redundant frames, P' is the watermark size after applying the error correction code and [.] is the integer part operator. The same watermark bit is inserted into 22 DCT coefficients of every block using the spread-spectrum sequence $w_{ss}(i)$ obtained using Equation (14). Every coefficient of index i is quantized to an even or odd number of quantization step sizes according to the value of the bit $w_{ss}(i)$, which yields the watermarked DCT coefficient $c_w(i)$.

At the decoder side (Fig. 5), first the number b of DCT coefficient blocks is calculated. From every coefficient selected according to Fig. 3, a bit is extracted using Equation (16), which takes the quantization level of the coefficient modulo 2, resulting in a sequence $w'_{ss}(j)$ of 22 bits from every block. The spread-spectrum sequence $w'_{ss}$ corresponding to an inserted watermark bit is obtained from b blocks of coefficients as in Equation (17).

Then the pseudo-random bit sequence S is locally generated using the secret key K. The extracted watermark bit $w'_b$ corresponding to a group of b coefficient blocks is computed in Equation (18). Every bit of the sequence $w'_c$ corresponding to a group of k frames is determined using Equation (19). The bit sequence $w'(i)$ is then error corrected, yielding the extracted watermark sequence $w'$.
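As a rough illustration of the block budget in the DCT scheme, the sketch below computes the number b of 8x8 blocks available per watermark bit. The formula is an assumption derived from the quantities named in the text (resolution M x N, F frames, k redundant frames, coded watermark size P'), and the watermark size used in the example is hypothetical.

```python
def blocks_per_bit(width, height, frames, k, coded_bits):
    """Number of 8x8 blocks that can carry the same watermark bit.

    ASSUMPTION: b = floor(M * N * F / (64 * k * P')), that is, the 8x8 blocks
    of one frame divided by the number of watermark bits stored per frame.
    """
    return (width * height * frames) // (64 * k * coded_bits)


# Hypothetical setting: CIF video, 27 frames, k = 3, and an 18 x 18 binary
# image coded with Hamming (7,4), giving P' = 324 * 7 / 4 = 567 bits.
coded_bits = 18 * 18 * 7 // 4
b = blocks_per_bit(352, 288, 27, 3, coded_bits)
coefficients_per_bit = 22 * b  # 22 selected coefficients in every block
```

In this setting each frame stores 63 watermark bits and has 1584 blocks, so each bit is spread over 25 blocks.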

Video watermarking scheme in the wavelet domain
For this method, the watermark is embedded in selected wavelet coefficients of the luminance Y of every frame of the video. The wavelet decomposition of the luminance is done using the 2D Discrete Wavelet Transform. We have chosen a wavelet decomposition on L=3 resolution levels. The watermark is embedded in the wavelet coefficients of the LH, HL and HH sub-bands of the second wavelet decomposition level. The choice of the second decomposition level is a tradeoff between the invisibility of the watermark and the resilience to attacks. A watermark embedded in the wavelet coefficients of the LH1, HL1 and HH1 sub-bands is very sensitive to attacks, because these sub-bands contain the finest details of the frame. On the other hand, if we embed the watermark in the LH3, HL3 and HH3 sub-bands, the perceptual quality of the video is significantly altered. For these reasons, the best choice for watermark embedding is the second wavelet decomposition level. Fig. 6 shows the sub-bands (gray color) selected for watermark embedding.

For videos of resolution $M \times N$, the number of selected wavelet coefficients for a frame is:

$$C = \frac{3 \cdot M \cdot N}{16}$$

The maximum capacity of the watermarking scheme is $F \cdot C$, where F is the number of video frames, and can be achieved by embedding a watermark bit in every selected wavelet coefficient. For example, for CIF videos of resolution 352x288 and 30 frames/s, the maximum capacity is 556 kb/s. This maximum capacity is not needed in most applications, so we reduce it to improve the robustness of the scheme.

Fig. 7 shows the block diagram of our wavelet-based watermark embedding scheme, which is described in the following steps:
1. The binary image matrix is transformed into a binary row vector w of size $P = h \times v$.
2. To protect the watermark against bit errors, a Hamming error correction code with a codeword length of m bits and a data-word length of n bits is applied to the vector w. The size of the resulting watermark vector w' is $P' = \frac{m}{n} \cdot P$.
3. The same spread-spectrum technique is used to spread the power spectrum of the watermark data, thus increasing its robustness against attacks. First, the binary pseudo-random code sequence $S = \{s_j \mid s_j \in \{0,1\}\}$ of size G, with an equal number of zeros and ones, is generated using the Mersenne-Twister algorithm, with 64 bits of the secret key K used as the seed for the generator. For every bit of the watermark w', the corresponding spread-spectrum sequence $w_{ss}(i)$ is computed from S.
4. Every sequence $w_{ss}(i)$ (representing one bit of the original watermark) is embedded into G wavelet coefficients, one bit of $w_{ss}(i)$ per wavelet coefficient. The number G depends on the number C of selected wavelet coefficients, the number of frames F of the original video and the size P' of the watermark: $G = \left[ \frac{C \cdot F}{P'} \right]$, where [.] is the integer part operator.
5. A bit of the binary sequence S is embedded in the selected wavelet coefficient by rounding its value to an even or odd quantization level. Rounding to an even quantization level embeds a "0", while rounding to an odd quantization level embeds a "1", as shown in Equation (5), where d is the original wavelet coefficient, $d_w$ is the watermarked wavelet coefficient and q is the quantization step size.
6. After the entire watermark has been embedded, the 2D Inverse Discrete Wavelet Transform is computed for every frame to obtain the watermarked video.
The watermark extraction process, shown in Fig. 8, consists of the following steps:
1. Wavelet decomposition of the watermarked, possibly attacked video.
2. Selection of the wavelet coefficients used for embedding.
3. Computation of the parameter G using the information about the size of the watermark provided by the secret key K.
4. From every coefficient selected according to Fig. 6, a bit is extracted according to Equation (25), resulting in a sequence $w'_{ss}(j)$ of G bits from every group.
5. Using the 64-bit seed from the secret key K, the binary sequence S of size G is generated.
6. The extracted watermark bit $w'(i)$ corresponding to a group of G wavelet coefficients is computed in Equation (26).
7. The resulting watermark bitstream of size P' is error corrected and the watermark $w'$ of size P is obtained.
8. The extracted binary image is obtained by reshaping the vector $w'$ into a matrix of size $h \times v$.
To improve the resilience of the algorithm against temporal attacks, we embed the same watermark redundantly in k frames. Thus, the number of wavelet coefficients used for embedding a watermark bit is decreased from G to G/k.
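The capacity figures of the wavelet scheme can be checked with a few lines of arithmetic. The sketch assumes that each second-level detail sub-band holds (M/4) x (N/4) coefficients, which reproduces the 556 kb/s quoted for CIF video when 1 kb is taken as 1024 bits. The relation G = floor(C * F / P') and the watermark size are likewise assumptions used for illustration.

```python
def selected_coefficients(width, height):
    """Coefficients in LH2, HL2 and HH2 of one frame (assumed M/4 x N/4 each)."""
    return 3 * (width // 4) * (height // 4)


def coefficients_per_bit(c, frames, coded_bits, k=1):
    """ASSUMPTION: G = floor(C * F / P'), reduced to G // k with redundancy."""
    return (c * frames) // coded_bits // k


C = selected_coefficients(352, 288)   # coefficients per CIF frame
max_rate_kb = C * 30 / 1024           # maximum capacity at 30 frames/s

# Hypothetical payload: P' = 567 coded bits over F = 27 frames.
G = coefficients_per_bit(C, 27, 567)
G_redundant = coefficients_per_bit(C, 27, 567, k=3)
```

Each frame contributes 19008 selected coefficients, so even a payload of a few hundred bits leaves hundreds of coefficients per watermark bit for spreading.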

Comparison of the proposed "Spread-Quantization" video watermarking techniques
The simulations were conducted on the first 27 frames of the videos "stefan", "foreman" and "bus" in uncompressed RGB AVI format, with a resolution of 352x288 (Common Intermediate Format), 24 bits/pixel and a frame rate of 30 frames/s. The binary image used as the watermark is shown in Fig. 9. The resolution of the image depends on the error correction code used, the number of redundant frames and the resolution of the initial video. We conducted the experiments for every proposed method using the quantization step sizes q = 2, q = 4 and q = 8; with no redundant frame embedding and with embedding of the same watermark in k = 3 and k = 9 frames; and both without an error correction code and with a Hamming (7,4) error correction code.
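For readers unfamiliar with the Hamming (7,4) code, a minimal sketch follows. The bit ordering (parity bits at positions 1, 2 and 4 of the codeword) is the textbook layout and is an assumption here, since the chapter does not state which layout was used.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4
    p2 = d1 ^ d3 ^ d4
    p3 = d2 ^ d3 ^ d4
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(codeword):
    """Correct at most one bit error and return the 4 data bits."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    position = s1 + 2 * s2 + 4 * s3  # 1-based index of the erroneous bit
    if position:
        c[position - 1] ^= 1
    return [c[2], c[4], c[5], c[6]]


data = [1, 0, 1, 1]
codeword = hamming74_encode(data)
corrupted = list(codeword)
corrupted[4] ^= 1  # flip one bit
decoded = hamming74_decode(corrupted)
```

Every 4 watermark bits occupy 7 embedded bits, which is the m/n = 7/4 reduction in useful payload, and two or more errors within one codeword are not corrected.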
First we wanted to test the perceptual quality of the watermarked videos. To compare the watermarked video with the original one, we computed the mean Peak Signal to Noise Ratio (PSNR) of all frames of the video:

$$\overline{PSNR} = \frac{1}{F} \sum_{i=1}^{F} PSNR_i$$

where $PSNR_i$ is the PSNR of frame i and F is the number of frames of the video.
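A minimal sketch of the mean PSNR computation is given below. It assumes 8-bit samples (peak value 255) and treats each frame as a flat list of values; whether the chapter's PSNR was computed on the luminance or on the RGB channels is not stated, so that choice is left to the caller.

```python
import math


def psnr(original, marked, peak=255):
    """PSNR in dB between two equally sized frames given as flat lists."""
    mse = sum((a - b) ** 2 for a, b in zip(original, marked)) / len(original)
    return math.inf if mse == 0 else 10 * math.log10(peak ** 2 / mse)


def mean_psnr(original_frames, marked_frames):
    """Average of the per-frame PSNR values over all F frames."""
    values = [psnr(o, m) for o, m in zip(original_frames, marked_frames)]
    return sum(values) / len(values)


# Hypothetical example: two tiny frames in which every sample changes by 1.
originals = [[100, 120, 140, 160], [90, 110, 130, 150]]
watermarked = [[101, 119, 141, 159], [91, 111, 129, 151]]
result = mean_psnr(originals, watermarked)
```

A uniform change of one gray level gives a mean squared error of 1 and therefore a PSNR of about 48.13 dB, comfortably above the 40 dB threshold used in this chapter.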
The PSNR results are shown in Fig. 10. We can see that the best quality for every quantization step size chosen is obtained using the wavelet approach, followed by the DCT and the spatial method. The PSNR results for the spatial watermarking scheme are quite low for larger quantization step sizes (for q = 4 and q = 8 they are below the accepted value of 40 dB). For q = 8, only the wavelet-based technique achieves a PSNR value higher than 40 dB.

Fig. 10. PSNR values for the three proposed methods for different quantization step sizes
For a visual comparison, Fig. 11 shows the fifth frame of the original "stefan" video and the corresponding watermarked frames for the three proposed methods using the quantization step sizes q = 2, q = 4 and q = 8. Next, we wanted to test the robustness of the proposed watermarking schemes. For this purpose we carried out a range of eight attacks on the watermarked videos (see Table 2). The parameters of the attacks were chosen so that the visual degradation of the attacked videos remains acceptable, because an attacker wants to destroy the watermark, but not the video quality.
To evaluate the robustness objectively, we calculated the mean values of the decoding bit error rate (BER) for the watermarks extracted from all test videos after they were attacked:

$$BER = \frac{100}{P} \sum_{i=1}^{P} \left| w_{out}(i) - w_{in}(i) \right| \; [\%]$$

where $w_{out}$ is the extracted watermark, $w_{in}$ is the original watermark and P is the size of the watermark. We plotted 9 different graphs, in which we represented the mean decoding BER for every method and every attack. The variables are the quantization step size q (chosen as 2, 4 and 8) and the number of frames k used for embedding the same watermark (chosen as 1, 3 and 9). For q = 2 no error correction code was used, because the corresponding BER values are quite high and the Hamming (7,4) error correction would not work for such high bit error rates. For q = 4 and q = 8, where the BER values are lower, we used the Hamming (7,4) error correction code, which can correct single bit errors.

Table 2. Attacks against the watermarking schemes

Fig. 12. Comparison of the decoding BER (%) for the proposed methods using q = 2, no redundant frame embedding and no error correction code

The method working in the spatial domain is very vulnerable to the brightening attack. For example, by adding 6 to every luminance value Y, the decoding BER is 100% for every combination of parameters. We did not represent this value on the graphs, because we did not want to scale all BER values to 100%. On the other hand, the spatial embedding method has the best resilience to median filtering attacks. The DCT-based technique is more vulnerable to the median filtering attack than the other two methods.

Table 4. Watermarks extracted from the watermarked "stefan" video after various attacks for q = 8, k = 3 and Hamming (7,4) error correction

The best overall resilience is achieved by the method working in the wavelet domain, which is the only technique with perfect decoding of the watermark for q = 8, k = 9 and Hamming (7,4) error correction. The second most resilient method is the DCT technique, followed by the spatial technique.
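The decoding BER used in the graphs can be computed as in the following sketch, which counts differing bits between the original and the extracted watermark and expresses the result as a percentage. The function name and the sample bit sequences are hypothetical.

```python
def bit_error_rate(w_in, w_out):
    """Percentage of positions where the extracted watermark differs."""
    errors = sum(a ^ b for a, b in zip(w_in, w_out))
    return 100.0 * errors / len(w_in)


# Hypothetical 8-bit watermark with two bits flipped by an attack.
w_in = [1, 0, 1, 1, 0, 0, 1, 0]
w_out = [1, 1, 1, 1, 0, 0, 0, 0]
ber = bit_error_rate(w_in, w_out)
```

A BER of 100% means that every bit is inverted, which is what the brightening attack produces for the spatial method, while values around 50% correspond to a watermark that carries no usable information.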
Tables 3 and 4 contain the watermarks extracted after each attack from the video sequence "stefan", using the three different approaches, k = 3 redundant frames, the Hamming (7,4) error correction code, and q = 4 and q = 8, respectively. These tables show the advantage of using a binary image as the watermark. We can see that the extracted watermarks can be identified easily for bit error rates below approximately 15%.

Conclusion
In this chapter we have compared three blind "spread-quantization" video watermarking techniques in the spatial, DCT and wavelet domains. The original watermark and the original, unwatermarked videos are not required for the watermark extraction process. The methods are combinations of spread-spectrum and quantization based techniques. All three schemes embed the watermark in the luminance channel or in the transform coefficients of the luminance. The watermarks used are binary images containing the copyright information. The watermark is protected against single bit errors using a Hamming error correction code.
The spatial domain technique embeds a watermark bit by spreading it in a luminance block. The actual embedding into a luminance value is done using a quantization based approach.
The DCT domain technique spreads the same watermark bit over a number of 8x8 DCT blocks. In every DCT block, only 22 middle frequency DCT coefficients are used for embedding. The wavelet-based technique embeds the same watermark bit into a number of detail wavelet coefficients of the middle wavelet sub-bands.
The resilience of the schemes is improved by redundantly embedding the same watermark in a number of k video frames.
We have tested the perceptual quality of the watermarked videos and the resilience of the schemes to eight different attacks in the spatial, temporal and compressed domains, for different quantization step sizes and different numbers of redundant frames.
The experimental results show that the wavelet domain technique achieves the highest video quality and the best robustness to most attacks, followed by the DCT and spatial domain techniques. The spatial domain method is most vulnerable to the brightening attack and the DCT method to the median filtering attack. The wavelet-based technique achieves very good overall scores, making it the best candidate for robust video watermarking.
Future research directions include the improvement of our wavelet-based watermarking technique in terms of robustness to the proposed attacks, but also to other temporal and geometric attacks. The quality of the watermarked videos could also be improved by using a Human Visual System (HVS) approach. These techniques are usually time consuming, and a tradeoff has to be made between the perceptual quality of the watermarked videos and the arithmetic complexity of the scheme.

Fig. 1. Block diagram of the spatial watermark encoder
Spatial embedding, step 5: The size l of a square block of $l \times l$ luminance values is calculated to embed a bit of the watermark.

Fig. 2. Block diagram of the spatial watermark decoder
Spatial extraction, step 7: A binary sequence $w'_{c,i}(j)$ is extracted from every frame of a group of k frames, where $i = 1, \dots, k$. The sequence $w'_c(j)$ is computed from $w'_{c,i}(j)$ using Equation (11).

Fig. 6. Wavelet sub-bands selected for watermark embedding
Wavelet extraction, steps 5 and 6: Using the 64-bit seed from the secret key K, the binary sequence S of size G is generated. The extracted watermark bit $w'(i)$ corresponding to a group of G wavelet coefficients is computed in Equation (26).

Fig. 9. Binary image used as watermark

Fig. 11. Visual comparison of the proposed methods. The fifth frame of a) the original "stefan" video, b) the watermarked video using the spatial approach, c) the watermarked video using the DCT approach and d) the watermarked video using the wavelet approach

Fig. 13. Comparison of the decoding BER (%) for the proposed methods using q = 2, k = 3 and no error correction code
DCT embedding: Every coefficient of index i is quantized to an even or odd number of quantization step sizes according to the value of the bit $w_{ss}(i)$, where $c_w(i)$ is the watermarked DCT coefficient.

References
Transactions on Information Forensics and Security, vol. 3, no. 3,