Recent Advances in Watermarking for Scalable Video Coding

The H.264/AVC (ISO/IEC MPEG-4 Part 10) video coding standard (Wiegand & Sullivan, 2003), which was officially issued in 2003, has become a challenge for real-time video applications. Compared to the MPEG-2 standard, it saves about 50% in bit rate while providing the same visual quality. In addition to having all the advantages of MPEG-2 (ITU-T & ISO/IEC JTC 1, 1994), H.263 (ITU-T, 2000), and MPEG-4 (ISO/IEC JTC 1, 2004), the H.264 video coding standard possesses a number of improvements, such as context-adaptive binary arithmetic coding (CABAC), enhanced transform and quantization, prediction of "Intra" macroblocks, and others. H.264 is designed for both constant bit rate (CBR) and variable bit rate (VBR) video coding, which is useful for transmitting video sequences over statistically multiplexed networks (e.g., the Ethernet or other Internet networks). This video coding standard can also be used at any bit rate range for various applications, varying from wireless video phones to high definition television (HDTV) and digital video broadcasting (DVB). In addition, H.264 provides significantly improved coding efficiency and greater functionality, such as rate scalability, "Intra" prediction and error resilience, in comparison with its predecessors, MPEG-2 and H.263. However, H.264/AVC is much more complex than other coding standards, and achieving maximum-quality encoding requires high computational resources (Grois et al., 2010a; Kaminsky et al., 2008).


Earlier video coding standards already include several tools by which the most important scalability modes can be supported. However, the scalable profiles of those standards have rarely been used. Reasons for that include the characteristics of traditional video transmission systems, as well as the fact that the spatial and quality scalability features came along with a significant loss in coding efficiency and a large increase in decoder complexity compared to the corresponding non-scalable profiles (Schwarz et al., 2007). To address varying terminal capabilities and network conditions, it is beneficial to simultaneously transmit or store video in a variety of spatial/temporal resolutions and qualities, which leads to video bitstream scalability. The major requirement for Scalable Video Coding is to enable the encoding of a high-quality video bitstream that contains one or more subset bitstreams, each of which can be transmitted and decoded to provide video services with lower temporal or spatial resolution, or with reduced fidelity, while retaining a reconstruction quality that is high relative to the rate of the subset bitstream. Therefore, Scalable Video Coding provides important functionalities, such as spatial, temporal and SNR (quality) scalability, thereby enabling power adaptation. In turn, these functionalities lead to enhancements of video transmission and storage applications (Grois et al., 2010b; Grois et al., 2010c; Grois & Hadar, 2011).
A Scalable Video Coding bitstream contains a Base Layer (Layer 0) and one or more Enhancement Layers (Layers 1, 2, etc.), where the Base Layer provides the lowest bitstream resolution with regard to spatial, temporal and SNR/quality scalability, as schematically presented in Figure 1 (Schierl et al., 2007).
Fig. 1. Schematic representation of the SVC bitstream: the resolution increases with the layer index, while the Base Layer (Layer 0) has the lowest bitstream resolution (Schierl et al., 2007).

The term "scalability" refers to the removal of parts of the video bit stream in order to adapt it to the various needs or preferences of end users, as well as to varying terminal capabilities or network conditions. According to (Schwarz et al., 2007), the objective of the SVC standardization has been to enable the encoding of a high-quality video bit stream that contains one or more subset bit streams that can themselves be decoded with a complexity and reconstruction quality similar to that achieved using the existing H.264/AVC design with the same quantity of data as in the subset bit stream. Figure 2 below presents a block diagram of an SVC encoder which, for simplicity, has two spatial layers: Layer 0, which is the Base Layer, and Layer 1, which is the first Enhancement Layer. It should be noted that, in order to improve the coding efficiency of Scalable Video Coding in comparison to simulcasting different spatial resolutions, additional "inter-layer prediction mechanisms" are incorporated (Schwarz et al., 2007).
Fig. 2. Block diagram of the spatial SVC encoding scheme (for simplicity, only two layers are presented: Layer 0, which is the Base Layer, and Layer 1, which is the first Enhancement Layer).
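The idea that a lower-rate stream is obtained simply by removing parts of the full stream can be illustrated with a minimal sketch. The `Unit` class and the `extract_substream` function below are hypothetical names introduced only for illustration; a real SVC bitstream carries the layer identifiers in its NAL unit headers, and the payloads here are placeholder strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    """Hypothetical stand-in for one coded unit of a scalable bitstream."""
    spatial: int    # spatial layer index (0 = Base Layer)
    temporal: int   # temporal layer index
    quality: int    # SNR/quality layer index
    payload: bytes


def extract_substream(units, max_spatial, max_temporal, max_quality):
    """Keep only the units at or below the requested operating point."""
    return [
        u for u in units
        if u.spatial <= max_spatial
        and u.temporal <= max_temporal
        and u.quality <= max_quality
    ]


full_stream = [
    Unit(0, 0, 0, b"base"),
    Unit(0, 1, 0, b"base, higher frame rate"),
    Unit(1, 0, 0, b"enhancement, higher resolution"),
    Unit(1, 1, 0, b"enhancement, higher resolution and frame rate"),
    Unit(1, 1, 1, b"enhancement, higher quality"),
]

# A low-end terminal receives only the Base Layer at the lowest frame rate.
base_only = extract_substream(full_stream, 0, 0, 0)

# A mid-range terminal receives everything except the quality refinement.
mid_range = extract_substream(full_stream, 1, 1, 0)
```

No re-encoding takes place in this sketch: adaptation is a pure filtering step, which is why a watermark has to survive in whatever subset of units a given receiver ends up with.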
Scalable Video Coding has achieved significant improvements in coding efficiency compared to the scalable profiles of prior video coding standards. As a result, Scalable Video Coding is currently a highly attractive solution to the problems posed by the characteristics of modern video transmission systems (Schwarz et al., 2007).

Scalable Video Coding poses new challenges for watermarking that need to be addressed to achieve full protection of the scalable content (Meerwald, 2011; Lin et al., 2004), while maintaining a low bit-rate overhead due to watermarking. Challenges that complicate watermark detection include the very different statistics of the transform domain coefficients of the scalable base and enhancement layers, the combination of multi-channel detection results for incremental detection performance (Piper et al., 2005), as well as the prediction of data between scalability layers, which complicates the modeling of the embedding domain. Despite intense research in the area of image and video watermarking (Meerwald, 2011; Lin et al., 2004), the peculiarities of watermarked scalable multimedia content have received limited attention, and a number of challenges remain.
One of the main challenges for watermarking rate-scalable compressed video is that not all receivers will have access to the entire (watermarked) video stream (Lin et al., 2001). The embedded watermark must be detectable when only the base layer is decoded (for layered and hybrid layered/embedded methods) or for a low-rate version of the video stream (for embedded methods). However, the enhancement information adds value to the video stream and should not be left unprotected by a watermark. Ideally, there should be a uniform improvement in the detectability of an embedded watermark as the decoded rate increases.
According to one method for watermarking rate-scalable video streams, a watermark is embedded in the base layer and a separate watermark is embedded in the enhancement layer(s) (Lin et al., 2001). For temporal scalability, this is an effective method, as the enhancement information does not alter the frames encoded in the base layer. However, for other forms of scalability, care must be taken so that the multiple watermarks do not interfere with each other once the decoder merges the base and enhancement information. The watermarks could interfere in visibility, where the distortion introduced by adding all watermarks is unacceptable, or in detectability, where the presence of all the watermarks impairs the ability to detect each watermark individually. The ability to detect each embedded watermark individually (before the enhancement and base information are merged) is not sufficient for a robust watermark, as such a system would be vulnerable to a collusion attack between the non-enhanced and enhanced versions of the video.
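The interference concern can be made concrete with a toy example. The sketch below is not the method of (Lin et al., 2001); it assumes a one-dimensional signal, a two-layer split into pair averages (base) and pair differences (enhancement), additive ±1 patterns, and a plain linear correlation detector. All function names, keys and parameters are illustrative assumptions. Because this particular split is exactly invertible, a full-rate receiver can separate the two layers again and the two watermarks do not disturb each other; with a lossy or predictive layer structure this would no longer hold automatically.

```python
import random


def pn_sequence(key, length):
    """Pseudo-random +/-1 sequence generated from an integer key."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def split_layers(signal):
    """Toy two-layer split: pair averages (base), pair differences (enhancement)."""
    base = [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    enh = [(signal[i] - signal[i + 1]) / 2 for i in range(0, len(signal), 2)]
    return base, enh


def merge_layers(base, enh):
    """Inverse of split_layers: what a full-rate decoder would reconstruct."""
    out = []
    for b, e in zip(base, enh):
        out.extend((b + e, b - e))
    return out


def embed(layer, key, strength):
    """Add a keyed +/-1 pattern, scaled by strength, to one layer."""
    w = pn_sequence(key, len(layer))
    return [x + strength * wi for x, wi in zip(layer, w)]


def correlate(layer, key):
    """Linear correlation between a layer and the pattern of the given key."""
    w = pn_sequence(key, len(layer))
    return sum(x * wi for x, wi in zip(layer, w)) / len(layer)


rng = random.Random(0)
host = [rng.gauss(0.0, 10.0) for _ in range(8192)]  # synthetic host signal
BASE_KEY, ENH_KEY, WRONG_KEY, STRENGTH = 11, 22, 33, 2.0

base, enh = split_layers(host)
marked_base = embed(base, BASE_KEY, STRENGTH)
marked_enh = embed(enh, ENH_KEY, STRENGTH)

# Receiver A decodes only the base layer.
base_score = correlate(marked_base, BASE_KEY)

# Receiver B decodes both layers and re-splits the merged signal.
rec_base, rec_enh = split_layers(merge_layers(marked_base, marked_enh))
merged_scores = (correlate(rec_base, BASE_KEY), correlate(rec_enh, ENH_KEY))
wrong_score = correlate(rec_base, WRONG_KEY)
```

With these parameters the correct keys yield correlations close to the embedding strength, while an unrelated key yields a value close to zero, both for the base-only receiver and for the receiver that merges the layers.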
For embedded scalability modes, one could design a watermark analogous to an embedded coding scheme, where the most significant structures of the watermark are placed near the beginning of the video stream, followed by structures of lesser significance (Lin et al., 2001).
In this regard, Figure 3 below presents different watermark embedding schemes that use the SVC spatial scalability (Meerwald & Uhl, 2010a).
Watermarking systems are often characterized by a set of common features, and the importance of each feature depends on the application requirements. Watermarks are generally divided into three main groups (Piper, 2010):
a. Robust: robust watermarks are designed to be resistant to manipulations of the content. Therefore, a robust watermark can still be detected after the content has undergone processing, such as resampling, cropping, lossy compression, and the like.
b. Fragile: fragile watermarks are very sensitive to any manipulation of the content. This does not make the fragile watermark inferior to the robust watermark, since different applications demand different amounts of robustness or fragility.
c. Semi-fragile: semi-fragile watermarks are designed to be fragile with respect to some changes but to tolerate others. For example, they may be robust to compression while still being able to detect malicious tampering. This can be achieved by carefully designing the watermark to be robust to particular image/video manipulations.
Further, Table 1 below presents common watermarking applications, which are used with regard to different watermark features (Bhowmik, 2010):

Broadcast Monitoring: Passive monitoring by automatic watermark detection of the broadcast watermarked media.
Copyright Identification: Resolving copyright issues of digital media by using the watermark information as the copyright data.
Content Authentication: Authentication of original art work and performance, and protection against digital forgery.
Access Control: Access control applications, such as Pay-TV.
Copy Control: Disabling the copying of CD/DVD by the watermarked permission.
Packaging and Tracking: Transaction tracking and protection against forged consumable items (including pharmaceutical products, and the like) by embedding a watermark on the packaging.
Medical Record Authentication: Authentication of a digitally preserved patient's medical record, including a blood sample, X-ray, etc.
Insurance/Banking Document Authentication: Digital authentication of insurance claims and of banking, financial, mortgage and corporate documents.
Media Piracy Control: Tracking of the source of the media piracy.
Ownership Identification: Supporting a legitimate claim, such as royalty, by the media owner.
Transaction Tracking: Tracking of the media ownership in a buyer-seller scenario.
Meta-data Hiding: Hiding meta-data within the media instead of a big header.
Video Summary Creation: Instant retrieval of a video summary by embedding the summary within the host video.
Video Hosting Authentication: Piracy control by video authentication at video hosting servers, including YouTube™, etc.

Since robust watermarking algorithms, which are designed specifically for robustness, are preferred in the majority of watermarking applications, this chapter mainly focuses on this type of watermarking. We also place special emphasis on combined schemes of watermarking and encryption using H.264/SVC, owing to the increasing interest in this issue. This chapter is organized as follows: Section 2 presents recent advances in robust watermarking using Scalable Video Coding; Section 3 discusses recent advances in scalable fragile watermarking; Section 4 presents recent compressed-domain watermarking techniques using Scalable Video Coding; and Section 5 discusses combined schemes of watermarking and encryption using Scalable Video Coding. Future research directions are outlined in Section 6, and the chapter is concluded in Section 7.

In today's society, with the progress of 3G/4G wireless networks and the plurality of heterogeneous mobile devices, multimedia resources must be accessible from many different terminals, which requires a single source multimedia stream to meet the varying terminal capabilities. Scalable Video Coding can be efficiently employed to achieve these goals. However, due to the SVC scalability, the source video stream can be decoded into a plurality of streams, each having a different resolution, frame rate and video presentation quality, according to each end-user terminal. Therefore, there are many challenges for watermarking when using the Scalable Video Coding approach (Shi et al., 2010).

Robust watermarking by using scalable video coding
It should be noted that prior knowledge of the Scalable Video Coding system and of the transmission channel is beneficial for the watermarking system (Meerwald & Uhl, 2008), for example knowledge of the number of supported spatial and temporal layers, of the denoising and deblocking filters, and the like (as schematically shown in Figure 4). As is well known, by exploiting the host video as side information at the encoder, in message coding and watermark embedding, the negative impact of host signal noise on the watermark decoder performance can be cancelled (Cox et al., 2002).
With regard to this issue, (Meerwald & Uhl, 2008) present a frame-by-frame scalable watermarking scheme that is robust to spatial, temporal and quality scalability, in which the luminance component of each frame is decomposed using a two-level wavelet transform with a 7/9 bi-orthogonal filter. Separate watermarks are embedded in the approximation and in each detail subband layer. According to (Meerwald & Uhl, 2008), an additive spread-spectrum watermark $w_l(n, m)$ is added to the detail subband coefficients of each hierarchical layer $l$, and detection uses the normalized correlation coefficient. By applying a high-pass 3×3 Gaussian filter to the detail subbands prior to the correlation, some of the host interference is suppressed, which improves the detection statistics. Also, a different key is used for each frame to generate the watermark pattern (Meerwald & Uhl, 2008).
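A minimal sketch of additive spread-spectrum embedding with normalized-correlation detection and a per-frame pattern is given below. It is not the implementation of (Meerwald & Uhl, 2008): the wavelet decomposition and the high-pass prefilter are omitted, the detail coefficients are replaced by synthetic Gaussian samples, and the key derivation in `frame_watermark`, the embedding strength and all names are assumptions made only for illustration.

```python
import math
import random


def frame_watermark(secret_key, frame_index, length):
    """Derive a distinct +/-1 pattern per frame (hypothetical key derivation)."""
    rng = random.Random(secret_key * 1_000_003 + frame_index)
    return [rng.choice((-1.0, 1.0)) for _ in range(length)]


def embed_additive(coeffs, watermark, strength):
    """Additive embedding: each coefficient is shifted by strength * pattern."""
    return [c + strength * w for c, w in zip(coeffs, watermark)]


def normalized_correlation(coeffs, watermark):
    """Inner product divided by the product of the two vector norms."""
    dot = sum(c * w for c, w in zip(coeffs, watermark))
    norm_c = math.sqrt(sum(c * c for c in coeffs))
    norm_w = math.sqrt(sum(w * w for w in watermark))
    return dot / (norm_c * norm_w)


rng = random.Random(2008)
host = [rng.gauss(0.0, 10.0) for _ in range(16384)]  # stand-in for detail coefficients
SECRET_KEY, STRENGTH = 4242, 2.0

# Embed the pattern of frame 0, then test with the patterns of frames 0 and 1.
marked = embed_additive(host, frame_watermark(SECRET_KEY, 0, len(host)), STRENGTH)
rho_match = normalized_correlation(marked, frame_watermark(SECRET_KEY, 0, len(marked)))
rho_other = normalized_correlation(marked, frame_watermark(SECRET_KEY, 1, len(marked)))
```

Because the detector normalizes by the energy of the received coefficients, the score is insensitive to a global rescaling of the frame. Using a different pattern per frame means that the pattern of one frame does not respond to the watermark of another, which is what `rho_other` illustrates.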
Further, (Meerwald & Uhl, 2010b) focus on watermark embedding in the intra-coded macroblocks of an H.264-coded base layer. Each macroblock of the input frame is coded by using either intra- or inter-frame prediction, and the difference between the input pixels and the prediction signal is the residual. The watermarked SVC base layer representation is used for predicting the SVC enhancement layer.
As already mentioned, the key property of a scalable watermarking system is that the detection process is scalable (Shi et al., 2010). In other words, the system should be able to detect a watermark in all of the different scalable bitstreams. As the quality of the multimedia content decreases, the correlation between the watermark and the watermarked signal may decrease as well, so detection will not work effectively if the same threshold is used for each SVC layer. However, if different detection thresholds are used for different layers, the watermarking system is required to transmit some extra side information. One potential measure is to adjust the detection threshold adaptively according to the multimedia content.
In this regard, (Shi et al., 2010) propose a scalable and credible watermarking algorithm for Scalable Video Coding (SVC), which aims to build a Copyright Protection System (CPS). The authors first investigate where to embed the watermark to ensure that it can be detected in the SVC Base Layer as well as in the Enhancement Layers, and then propose a model that combines frequency masking, contrast masking, luminance adaptation and temporal masking. Finally, whether the watermark exists is judged by adaptive detection, which gives the proposed method good legal credibility, since its False Alarm Rate (FAR) is close to zero.
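One way to obtain a content-dependent threshold without transmitting side information is to derive it from the received coefficients themselves. The sketch below is a generic textbook-style construction and not the adaptive detector of (Shi et al., 2010). It assumes a linear correlation detector with a ±1 pattern that is independent of the content, in which case the correlation of unmarked content has zero mean and variance equal to the mean squared coefficient value divided by the number of coefficients; a Gaussian approximation then gives a threshold for a target false alarm rate. The function name and the toy layers are illustrative.

```python
import math
from statistics import NormalDist


def adaptive_threshold(coeffs, false_alarm_rate):
    """Threshold on the linear correlation (1/N) * sum(y_i * w_i).

    Assumes a +/-1 pattern independent of the content and a Gaussian
    approximation of the detector output for unmarked content.
    """
    n = len(coeffs)
    mean_square = sum(c * c for c in coeffs) / n
    z = NormalDist().inv_cdf(1.0 - false_alarm_rate)
    return z * math.sqrt(mean_square / n)


base_layer = [4.0, -4.0, 4.0, -4.0] * 256         # lower-energy toy layer
enhancement_layer = [8.0, -8.0, 8.0, -8.0] * 256  # higher-energy toy layer

t_base = adaptive_threshold(base_layer, 1e-6)
t_enh = adaptive_threshold(enhancement_layer, 1e-6)
```

Each layer thus receives its own threshold, computed at the detector from the data at hand, so that the target false alarm rate is approximately the same for every decoded layer.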
In Section 3, we discuss recent advances in scalable fragile watermarking.

Recent advances in scalable fragile watermarking
A good authentication watermark can detect and localize any change to the video, including changes in frame rate, video size or a related video object (Wang et al., 2006). If frames are removed from the watermarked video and the watermark extraction procedure is then applied to the attacked video, the procedure returns a false alarm to indicate that the video content has become incomplete. Also, if the watermarked video is resized and the watermark extraction procedure is applied to the resized video, the procedure returns an output that resembles random noise, meaning a false alarm. Similarly, if a related video object is modified, the procedure outputs a false alarm (Wang et al., 2006).
In this regard, (Wang et al., 2006) propose to embed the watermark information into the Enhancement Layer of MPEG-4 Fine Granularity Scalability (FGS), as schematically shown in Figure 6, in order to verify the integrity of the video stream. According to (Wang et al., 2006), let $w_i$ denote the i-th watermark bit, and let $T_j$ denote the total number of "1" bits in the j-th 8×8 bit-plane. The watermark bit $w_i$ is embedded into the k-th specified bit $B_k$ of the j-th bit-plane as follows. First, the specified bit (the k-th bit) in the j-th bit-plane is selected by a run-length-selection algorithm for embedding the i-th watermark bit. The run-length-selection algorithm determines a bit for embedding the watermark in the 8×8 residue bit-plane while obtaining an optimal coding efficiency in run-length coding. If $w_i$ is "1", then $T_j$ is forced to be odd. Similarly, if $w_i$ is "0", then $T_j$ is forced to be even. That is, the specified bit $B_k$ is modified to $B'_k$; consistent with this parity rule, the expression can be written as

$$B'_k = B_k \oplus E(T_j) \oplus w_i,$$

where $E(T_j) = T_j \bmod 2$ is "1" when $T_j$ is odd and "0" when $T_j$ is even, and "$\oplus$" denotes the exclusive "OR" operation.
Since fragile watermarking has extremely low resistance to various attacks, the extracted watermark signal fairly easily loses its completeness when the multimedia content is modified or changed by a pirate or hacker. Thus, according to the completeness of the extracted watermark, it can be determined where the multimedia content has been illegally changed or modified. (Wang et al., 2006) propose a BCW (Bitplane-Coding Watermarking) algorithm to add watermark information to the residual bit-planes of the Enhancement Layer. In the embedding procedure, the watermark information is embedded into every 8×8 block of the residual bit-planes in the Enhancement Layer while encoding to the MPEG-4 FGS video stream. The watermark bit is modulated by modifying a specified bit, selected from each 8×8 bit-plane, such that the even/odd value of the total number of "1" bits matches the corresponding watermark information. The main reason for hiding the watermark in the enhancement layers is that the resulting degradation of the host data is minimal and can be imperceptible.
Fig. 6. Embedding a watermark in an Enhancement Layer of the MPEG-4 FGS video stream (Wang et al., 2006).
In turn, Figure 7 presents a block diagram of the watermark extraction from the Enhancement Layer of the MPEG-4 FGS video stream (Wang et al., 2006). If $E(T_j)$ is "1", the extracted watermark data is equal to "1". Otherwise, if $E(T_j)$ is "0", the extracted watermark data is also "0". The equation for extracting the watermark can be expressed as follows:

$$w_i = E(T_j),$$

where $w_i$ ($i = 0, 1, 2, 3, 4, \ldots$) is the i-th data of the watermark. Also, in the watermark extraction of (Wang et al., 2006), the received Enhancement Layer (EL) stream with the watermark data can be decoded to bit-planes through Variable-Length Decoding (VLD) at the receiver end.
Fig. 7. Extracting a watermark from an Enhancement Layer of the MPEG-4 Fine Granularity Scalability (FGS) video stream (Wang et al., 2006).
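The parity rule described above (an odd number of "1" bits encodes a watermark bit "1", an even number encodes "0") can be sketched in a few lines. This is a simplified illustration under stated assumptions: the bit-plane is a flat list of 64 bits, and the position k is simply given, whereas the source selects it with a run-length-selection algorithm that is not reproduced here. The function names are hypothetical.

```python
def parity(bits):
    """Return 1 if the number of '1' bits is odd, else 0."""
    return sum(bits) % 2


def embed_bit(bit_plane, k, w):
    """Return a copy of the bit-plane whose parity equals the watermark bit w.

    Only bit k is ever touched: it is flipped exactly when the current
    parity differs from w.
    """
    marked = list(bit_plane)
    marked[k] = bit_plane[k] ^ parity(bit_plane) ^ w
    return marked


def extract_bit(bit_plane):
    """The extracted watermark bit is the parity of the received bit-plane."""
    return parity(bit_plane)


plane = [0] * 64                       # one 8x8 residual bit-plane, flattened
plane[3] = plane[10] = plane[17] = 1   # three '1' bits, so the parity is odd
k = 5                                  # hypothetical position chosen by the selector

marked_one = embed_bit(plane, k, 1)    # parity already odd: nothing changes
marked_zero = embed_bit(plane, k, 0)   # bit k is flipped: four '1' bits
```

At most one bit per 8×8 bit-plane is changed, which keeps the distortion small, and any later modification that flips an odd number of bits in a bit-plane changes the extracted bit, which is what makes the scheme fragile.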
In the following Section 4, we discuss compressed-domain watermarking by using Scalable Video Coding techniques.

Compressed-domain watermarking by using scalable video coding
The concept of scalable watermarking combines progressive coding with a watermarking system (Seo & Park, 2005). Progressive watermarking techniques make it possible to transmit images with a built-in watermark progressively and then to extract the watermark from the decoded images. Scalable digital watermarking is mostly related to scalable video coding techniques. Therefore, scalable digital watermarking makes it possible to protect content regardless of which specific domain is transmitted, and to extract the watermark from any domain of the scalable content. Also, increasing the scalable domain can reduce the watermark extraction error (Piper et al., 2004). In Figure 8, compression is performed on the original image after the wavelet transform, and the selected coefficients and the watermark key are combined, followed by spectrum quantization and encoding (Seo & Park, 2005). Therefore, by progressively transmitting the image from the low frequency band to the high frequency band, the receiver can extract the watermark from the corresponding image portion that contains the built-in watermark; the bit error rate decreases as the amount of transmitted image data with the built-in watermark increases (Seo & Park, 2005).
In the following Section 5, we discuss combined schemes of watermarking and encryption by using the H.264/SVC.

Combined schemes of watermarking and encryption by using Scalable Video Coding
Intellectual Property (IP) protection is a critical element in a multimedia transmission system (Chang et al., 2004; Chang et al., 2005). Conventional IP protection schemes can be categorized into two major branches: encryption and watermarking. The content protection can be increased by combining encryption and robust watermarking, as proposed and implemented by (Chang et al., 2004; Chang et al., 2005). By taking advantage of the nature of cryptographic schemes and digital watermarking, the copyright of multimedia content can be well protected. Encryption can be applied at different stages of the SVC processing chain:
• Encryption before compression: there are no dedicated encryption proposals that take SVC specifics into account (Stutz & Uhl, 2011).
• Compression-integrated encryption: the base layer is encoded similarly to AVC, thus all encryption schemes for AVC can basically be employed in the base layer. The enhancement layers can employ inter-layer prediction, but do not necessarily have to, e.g., if inter-layer prediction does not result in better compression. The compression-integrated encryption approaches for AVC can be applied to SVC as well, e.g., the approaches targeting the coefficient data.
• Bitstream-oriented encryption: the approach of (Stutz & Uhl, 2008) encrypts almost the entire NALU payload. As the NALU structure is preserved, scalability is preserved in the encrypted domain.
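The point that scalability survives when only the NALU payload is encrypted can be illustrated as follows. This sketch is not the scheme of (Stutz & Uhl, 2008): the keystream is a toy construction built from SHA-256 purely for illustration and is not a vetted cipher, and start-code emulation prevention, which a real bitstream-oriented scheme has to respect, is ignored. The header length assumed here is one byte, plus three extension bytes for NAL unit types 14 and 20; all function names and the sample bytes are hypothetical.

```python
import hashlib


def keystream(key, nonce, length):
    """Toy counter-mode keystream from SHA-256 (illustrative, not a vetted cipher)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out.extend(hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest())
        counter += 1
    return bytes(out[:length])


def header_length(nalu):
    """Assumed layout: 1 header byte, plus 3 extension bytes for types 14 and 20."""
    nal_unit_type = nalu[0] & 0x1F
    return 4 if nal_unit_type in (14, 20) else 1


def crypt_nalu(nalu, key, nonce):
    """XOR the payload with the keystream; applying it twice restores the input."""
    h = header_length(nalu)
    ks = keystream(key, nonce, len(nalu) - h)
    return nalu[:h] + bytes(p ^ s for p, s in zip(nalu[h:], ks))


KEY, NONCE = b"demo key", b"unit-0"
enhancement_nalu = bytes([0x74, 0x80, 0x10, 0x07]) + b"enhancement slice data"

encrypted = crypt_nalu(enhancement_nalu, KEY, NONCE)
decrypted = crypt_nalu(encrypted, KEY, NONCE)
```

Since the header bytes stay in the clear, a network element can still read the layer identifiers and drop enhancement units without holding the key.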
A scalable transmission method over the broadcasting environment is adopted by (Chang et al., 2004; Chang et al., 2005) for layered content protection. As a result, the embedded watermark can be extracted with high confidence, and the next-layer keys/secrets can be perfectly decrypted and reconstructed. The watermarking is added in order to aid the encryption process, since the watermarked data content can withstand different types of attacks, such as distortions, image/video processing, and the like.
Further, (Park & Shin, 2008) present a combined scheme of encryption and watermarking that provides the access right and the authentication of the video simultaneously, as schematically presented in Figure 9. The proposed scheme protects the data content in a more secure way, since the encrypted content is decrypted only when the watermark is exactly detected. The encryption is performed for the access right, and the watermarking is implemented for the authentication. Particularly, the encryption is performed by encrypting the intra-prediction modes of the 4×4 luma blocks, the sign bits of the texture, and the sign bits of the MV difference values in the intra frames and the inter frames. In turn, a reversible watermarking scheme is implemented by using the intra-prediction modes. The watermarking scheme proposed by (Park & Shin, 2008) introduces a small bit overhead but causes no degradation of the visual quality.
Fig. 9. Combined scheme of encryption and watermarking (Park & Shin, 2008).
The method of (Park & Shin, 2008) is applied in Scalable Video Coding at the macroblock (MB) level in the Base Layer. The encryption and watermarking are implemented in the encoding process almost simultaneously. In turn, in the decoding process, the receiver's device extracts the watermark from the received bitstream. The extracted watermark is compared to the original one. If they match, then the received video is trusted and the encrypted bitstream is decrypted. In other words, according to (Park & Shin, 2008), only authenticated content can be decoded in the decoding process.
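The receiver-side logic described above (extract the watermark, compare it with the original, and decrypt only on a match) can be sketched as follows. This is a strongly simplified illustration and not the codec-level method of (Park & Shin, 2008): only sign-bit encryption of a short list of coefficients is modeled, the keyed bit source is a hypothetical construction built from SHA-256, and watermark extraction itself is assumed to have been done elsewhere.

```python
import hashlib
import hmac


def sign_mask(key, count):
    """Hypothetical keyed bit source: one pseudo-random bit per coefficient."""
    bits = []
    counter = 0
    while len(bits) < count:
        digest = hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        for byte in digest:
            bits.extend((byte >> i) & 1 for i in range(8))
        counter += 1
    return bits[:count]


def toggle_signs(coeffs, key):
    """Flip the sign of each coefficient whose mask bit is 1 (self-inverse)."""
    return [-c if b else c for c, b in zip(coeffs, sign_mask(key, len(coeffs)))]


def receive(encrypted_coeffs, extracted_watermark, expected_watermark, key):
    """Decrypt only if the extracted watermark matches the expected one."""
    if not hmac.compare_digest(extracted_watermark, expected_watermark):
        return None  # authentication failed: the content stays encrypted
    return toggle_signs(encrypted_coeffs, key)


KEY = b"demo key"
coeffs = [12, -3, 0, 7, -1, 5, 0, -9]  # toy residual coefficients
watermark = b"\x01\x00\x01"            # toy watermark payload

encrypted = toggle_signs(coeffs, KEY)
accepted = receive(encrypted, watermark, watermark, KEY)
rejected = receive(encrypted, b"\x01\x01\x01", watermark, KEY)
```

Sign-bit encryption leaves the coefficient magnitudes untouched, so the length of the data does not change in this sketch, and the decryption step is gated entirely by the outcome of the watermark comparison.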
In the following Section 6, we present possible future research directions for optimizing the existing watermarking techniques for use with the Scalable Video Coding.

Future research directions
The existing watermarking techniques for Scalable Video Coding still have many issues to be solved in order to provide a complete solution; possible future research directions are outlined in (Bhowmik, 2010).

Conclusions
In this chapter we have presented a comprehensive overview of recent developments in the area of watermarking by using the Scalable Video Coding. As discussed, the Scalable Video Coding poses new challenges for watermarking, which have to be addressed to achieve full protection of the scalable content, while maintaining low bit-rate overhead due to watermarking. Particularly, we presented recent advances in robust watermarking and discussed recent advances in the scalable fragile watermarking; also, we presented recent compressed-domain watermarking techniques by using the Scalable Video Coding, and presented combined schemes of the SVC watermarking and encryption.
As clearly seen from this overview, there are still many challenges to be solved, and therefore further research in this field should be carried out.