Hardware Implementation of Audio Watermarking Based on DWT Transform

At present, duplicate copies of an audio file can be generated with great ease on smart devices and transmitted over the internet, which raises concerns over copyright and privacy. Digital audio watermarking is a procedure that inserts data bits, known as a watermark, into an audio signal. The watermarked audio is then transmitted to the end user or made public. The proposed algorithm inserts a binary watermark image into the detailed coefficients of a Daubechies 9/7-based DWT. The watermark is dispersed consistently in low frequencies, which builds the robustness and inaudibility of the watermark data. Embedding the watermark in this way makes the system robust against audio attacks while keeping the watermark inaudible. The algorithm is verified in MATLAB and subsequently implemented on FPGA hardware to verify its real-time performance. The hardware implementation embeds the watermark at the same instant the audio is being captured. The results show promise for real-time audio applications.


Introduction
In the present digital era, a digital file such as audio can easily be copied to a computer or other smart device and distributed on an open network. This has raised issues such as maintaining copyright and ownership, authenticating a particular person, preserving privacy, and preventing the loss of sensitive information [1]. A possible solution is to insert ownership data bits into the audio, which can later be extracted for authentication. Digital audio watermarking is a technique in which a watermark is embedded in the original audio media file. The watermarked audio may then be transmitted over the internet to any other person. Inaudibility and robustness are the two primary characteristics of a digital audio watermark. Robustness is the ability of the watermark to resist channel attacks such as echo addition, filtering, and Gaussian noise [2]. Inaudibility means that inserting the watermark should not have any perceptible impact on the final watermarked audio. Ownership protection helps to identify the originator of the content in order to protect their copyright. Illegal use of audio without consent, leaking of sensitive information, and similar misuse can be deterred by embedding the owner's signature into the original audio in real time [3].
The main objective is to design an algorithm that is robust, blind, and inaudible, and that is useful for audio applications.

Digital watermarking overview
This section gives a brief overview of digital watermarking. Some basic terms, watermark classifications, watermark properties, and applications are covered in this chapter. The following list gives the meaning of some standard terms used in this chapter.
• Host audio is the source audio signal.
• Watermark is a signal consisting of data embedded into a host/carrier audio signal.
• Watermark embedding is the process of inserting the ownership data into the host audio.
• Blind watermarking is a technique in which the source audio is not needed for watermark extraction.
• Watermark extraction is the procedure that retrieves the embedded watermark.
• Payload is the size of the message encoded in the host object [4].

Problem statement
Many audio watermarking algorithms have been implemented in previous years. Most of them are implemented only in MATLAB, where their robustness and inaudibility are checked. In MATLAB, a built-in transform function is generally used, and the audio watermarking algorithm is then applied in the frequency domain. With this approach, the power consumption and the execution time of the algorithm remain unknown. These are fundamental requirements for designing any algorithm in hardware, and MATLAB does not provide this kind of hardware support. Hardware implementations of such algorithms have been achieved on DSP processors and on GPUs. DSP processors and GPUs can provide a hardware implementation, but their hardware complexity is very high and they are not well suited to real-time applications [5]. A VLSI architecture is therefore the most suitable platform for reducing hardware complexity and designing for real-time applications.

Objectives
The proposed audio watermarking algorithm is first implemented in MATLAB. Subsequently, a VLSI architecture of the audio watermarking algorithm is developed. A forward DWT algorithm is developed in Xilinx ISE, followed by the inverse DWT algorithm, and the VLSI architecture of the blind audio watermarking algorithm is then built on top of them. The main objectives of the proposed work are to design a VLSI architecture of the blind audio watermarking algorithm and to evaluate its area and timing. The proposed work is also designed to be compatible with real-time applications.

Previous work and my contribution
Digital audio watermarking is used for correct owner identification, for protection against tampering and copying, and for authenticating a particular person's ownership of their digital property. Many digital audio watermarking algorithms have been designed and simulated on the MATLAB platform, and many types of audio watermarking methods have been presented in previous years [6][7][8]. A DWT-SVD-based audio watermarking algorithm has also been implemented in previous work [9]. That work is a semi-blind audio watermarking algorithm in which the digital watermark is applied in the DWT-SVD domain to obtain robustness and imperceptibility. The proposed algorithm is a blind digital audio watermarking scheme using the DWT. There are several hardware implementations of the DWT algorithm [10][11][12]. In the proposed algorithm, a reduced-complexity DWT is designed along with its inverse DWT. Real-time applications require a high-speed algorithm. Our algorithm achieves low delay with complete synchronization and does not require the control segment suggested by many researchers, which increases delay. The hardware implementation uses only adders, subtractors, and shifters, so the multiplier-less design helps to make the algorithm hardware efficient and very fast.

Hardware solution
A watermarking scheme can be implemented in either software or hardware. In a software implementation, the watermarking algorithm is executed on a processor. The software implementation is flexible, but it embeds the watermark offline: the algorithm runs on a PC on audio that has already been captured by the device. A hardware implementation, in contrast, can insert the watermark online while the audio is being recorded, because the watermark computation is performed entirely in custom-designed hardware. A hardware implementation consumes less area and less power compared with a software implementation [5]. It may also benefit from parallel processing and incur less delay than software. This chapter targets a real-time application, so a hardware solution is recommended. Initially, the proposed audio watermarking algorithm is implemented in MATLAB; however, MATLAB provides only a simulation platform to validate the performance [13]. The real-time implementation of the proposed audio watermarking is achieved in Xilinx ISE, and the simulated results of the audio watermarking are discussed. The DWT is implemented using only adders/subtractors and shifters. The steps of both the embedding and the extraction process of the digital audio watermarking are then implemented. Subsequently, the proposed watermarking design is synthesized using Xilinx ISE 14.7.

Digital watermarking
Watermarking is a method through which protected data is conveyed without much observable change in the watermarked content. The watermarking process comprises two main steps: (i) embedding and (ii) extraction. A secret key can be used for an additional level of security. There are fundamentally three kinds of watermarking methods: (1) non-blind, (2) semi-blind, and (3) blind watermarking. A watermarking process that uses the original audio signal during extraction is termed "non-blind watermarking". A watermarking system that uses a segment or some part of the input audio signal during extraction is termed "semi-blind watermarking". A watermarking system that retrieves the watermark without using the original audio signal, or any part of it, during extraction is termed "blind watermarking" [14]. This chapter presents a novel blind audio watermarking scheme whose hardware implementation is carried out in Xilinx ISE. The steps of the algorithm are covered in Section 3. The watermark consists of a sequence of binary bits that is inserted into the host signal. An audio watermarking scheme should have the following basic characteristics: inaudibility, payload, and robustness. Figure 1 gives a visual representation of the requirements of data watermarking in digital audio; these three requirements form the corners of the magic triangle.

Inaudibility:
The inserted data has to be "inaudible" in the watermarked digital music. Inaudibility is quantified using the signal-to-noise ratio (SNR).

Security:
The algorithm should be secure, so that only an authorized person is able to retrieve the watermark. Any attempt by an unauthorized user to extract the watermark should fail.

Robustness:
The watermark should not be eliminated or removed by common processing techniques such as cropping, linear and/or nonlinear filtering, lossy compression, etc.

Payload (capacity):
It is defined as the total information that can be embedded in the host without introducing any distortion. It is usually expressed as a bit rate, that is, the actual number of bits inserted in the original audio per unit of time, given in bits per second (bps).

Real-time processing:
The watermark should be inserted into the original signal without much delay. It should be possible to insert the watermark at the same instant the audio is being recorded.

Proposed audio watermarking
The proposed audio watermarking scheme is blind and robust and is based on the DWT. In the proposed scheme, only eight audio samples of a single frame from each of the two channels are considered for the watermarking process. The details of the embedding and extraction processes are given in the following sections.

Embedding process of DWT-based audio watermarking
Input: original audio, watermark; output: watermarked audio. The steps of the embedding process are as follows:
Step 1: A total of 16 samples of the original audio signal, taken from the two channels, is considered for the watermark embedding process.
Step 2: The DWT is applied to obtain the approximate and detailed coefficients of both channels. The approximate and detailed coefficients are the low-pass and high-pass filtered components of the original input signal, respectively.
Step 3: The binary watermark bits are embedded in the detailed component of the input audio signal. If the watermark bit is 1, the detailed component of the first channel is modified according to Eq. (1), where P_1' is the detailed component after watermarking, P_1 is the detailed component before watermarking, and I is the intensity factor. If the watermark bit is 0, the detailed component of the second channel is replaced with the detailed component of the first channel. The flowchart of the embedding process is shown in Figure 2.
Step 4: After the embedding process is complete for both channels, the inverse DWT is applied to both channels to obtain the watermarked audio signal.
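The embedding steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a one-level Haar transform stands in for the 9/7 DWT of the chapter, the function names are hypothetical, and because Eq. (1) is not reproduced here, the additive form P_1' = P_1 + I is assumed purely for demonstration.

```python
def haar_dwt(x):
    """One-level Haar DWT of an even-length list: (approximation, detail).
    Stand-in for the 9/7 DWT used in the chapter."""
    approx = [(x[i] + x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleave a + d and a - d."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([a + d, a - d])
    return out


def embed_bit(ch1, ch2, bit, intensity=0.25):
    """Embed one watermark bit into one frame (eight samples per channel)."""
    a1, d1 = haar_dwt(ch1)          # Step 2: DWT of both channels
    a2, d2 = haar_dwt(ch2)
    if bit == 1:
        # ASSUMPTION: additive form of Eq. (1), P_1' = P_1 + I.
        d1 = [p + intensity for p in d1]
    else:
        # Bit 0: channel-2 detail is replaced by channel-1 detail.
        d2 = list(d1)
    # Step 4: inverse DWT of both channels
    return haar_idwt(a1, d1), haar_idwt(a2, d2)
```

After embedding a 0, the detail coefficients of the two channels are identical; after embedding a 1, they differ as long as the shifted channel-1 detail does not happen to coincide with the channel-2 detail.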

Extraction process of DWT-based audio watermarking
Input: watermarked audio signal; output: watermark. The steps of the extraction process are as follows:
Step 1: A total of 16 samples from both channels of the watermarked audio signal is collected as input, in the same way as in the embedding process.
Step 2: The DWT is applied to obtain the approximate and detailed components of both channels.
Step 3: The detailed components of the two channels are compared. If they are the same, the watermark bit is 0; otherwise it is 1. The flowchart of the extraction process is shown in Figure 3.
Step 4: All the extracted bits are combined into a single output to obtain the watermark that was embedded into the audio signal.
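The extraction rule can be sketched as follows. As before, this is an illustrative sketch rather than the authors' implementation: the Haar detail coefficients stand in for the 9/7 DWT, the function names are hypothetical, and the comparison tolerance `tol` is an assumption added to absorb rounding error.

```python
def haar_detail(x):
    """Detail coefficients of a one-level Haar DWT (stand-in for the 9/7 DWT)."""
    return [(x[i] - x[i + 1]) / 2.0 for i in range(0, len(x), 2)]


def extract_bit(ch1, ch2, tol=1e-9):
    """Return 0 if both channels have the same detail coefficients, else 1."""
    d1, d2 = haar_detail(ch1), haar_detail(ch2)
    same = all(abs(p - q) <= tol for p, q in zip(d1, d2))
    return 0 if same else 1


def extract_watermark(frames):
    """Collect one bit per (channel-1 frame, channel-2 frame) pair."""
    return [extract_bit(c1, c2) for c1, c2 in frames]
```

No reference to the original audio is needed: the decision depends only on the two channels of the watermarked signal, which is what makes the scheme blind.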

Hardware implementation of proposed audio watermarking
The architecture of the watermark embedding process is shown in Figure 4. The process comprises three main modules: the DWT module, the watermark embedding module, and the inverse DWT module. Initially, the audio samples are stored in Block RAM1 for processing. The original watermark is also applied to the watermark embedding unit. The DWT module reads the values from the RAM and converts them to frequency-domain coefficients.
The DWT module has a coefficient calculation unit that computes the various coefficients for the Daubechies filter. These coefficients are then processed by the watermarking module to insert the watermark. The watermarking module consists of a comparator and an adder/subtractor to embed the watermark into the coefficients. The comparator takes one watermark bit from Block RAM2, and according to that bit the module decides how to embed the watermark in the detailed coefficients. After the watermark is inserted, the values are fed to the inverse DWT module, where the watermarked audio samples are generated.

Hardware implementation of DWT
Architectures for computing the DWT fall mainly into two classes: (1) convolution filter based [15] and (2) lifting based [15]. The discrete wavelet transform (DWT) is frequently realized with a convolution-based implementation that uses FIR filters to perform the transform [16]. FIR filters are applicable for improving the execution of the DWT hardware design [17]. Because lifting structures have advantages over convolution-based ones in computation, memory usage, and complexity, more attention is paid to the lifting-based approach. Daubechies and Sweldens [15] proposed the new wavelet scheme based on second-generation wavelets. The lifting scheme performs better than the convolution filter-based DWT. It operates entirely in the spatial domain and has several advantages over the filter bank structure, for example, lower complexity and power consumption with a relatively reduced area. In the lifting-based DWT, the high-pass and low-pass filters are factored into a sequence of upper and lower triangular matrices [18]. The lifting scheme consists of three main stages: split, predict (P), and update (U), as shown in Figure 5. The split step separates the original values into odd and even samples. In the predict step, the odd samples are replaced by their prediction error, which yields the detail coefficients g_{j+1}. The even samples form a coarser version of the input. To preserve the average value of the signal, the detail coefficients are used to revise the even samples. This is done in the update step, which produces the approximation coefficients f_{j+1}. For the inverse transform, the signs of the predict and update stages are exchanged and all operations are applied in reverse order, as shown in Figure 6.
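The split, predict, and update stages can be illustrated with a minimal Python sketch. To keep it short, it uses a simple linear predictor (the unrounded 5/3-style lifting step) rather than the 9/7 filter of this chapter, and it assumes periodic extension at the frame boundaries; both choices and the function names are assumptions made for illustration only.

```python
def lifting_forward(x):
    """One lifting stage: split, predict, update (periodic boundaries)."""
    even, odd = x[0::2], x[1::2]                      # split
    n = len(even)
    # predict: detail = odd sample minus the mean of its even neighbours
    detail = [odd[i] - 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    # update: revise the even samples so the signal average is preserved
    approx = [even[i] + 0.25 * (detail[i - 1] + detail[i]) for i in range(n)]
    return approx, detail


def lifting_inverse(approx, detail):
    """Undo the stage: same steps, reversed order, opposite signs."""
    n = len(approx)
    even = [approx[i] - 0.25 * (detail[i - 1] + detail[i]) for i in range(n)]
    odd = [detail[i] + 0.5 * (even[i] + even[(i + 1) % n]) for i in range(n)]
    out = [0.0] * (2 * n)
    out[0::2] = even                                  # merge even samples
    out[1::2] = odd                                   # merge odd samples
    return out
```

Because each step only adds a function of the other half of the samples, the inverse is obtained by subtracting the same quantity, which is why perfect reconstruction holds regardless of the predictor chosen.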
The main objective is to obtain the lower and upper triangular matrices and a normalized diagonal matrix by factoring the polyphase matrix of the wavelet filters [19]. Following this principle, the lifting factorization of the polyphase matrix of the 9/7 filter is defined in Eq. (2).

(2)
where g(z) and h(z) are the high-pass and low-pass filters, and the notations e and o denote the even and odd parts, respectively. The values are defined in Eq. (3).

Lifting scheme
The Daubechies 9/7-based lifting scheme is shown in Figure 7. Each lifting stage comprises one predict step and one update step, and this is repeated a second time, giving the steps P1, U1 and P2, U2, respectively.
Pipelined shift-and-add logic replaces the multipliers used in the proposed DWT algorithm. This approach shortens the critical path considerably with only a small increase in latency and area. Multiplication is carried out with shifters, signed adders, and signed subtractors. The multiplication modules for alpha, beta, gamma, and delta, together with the multiply-by-K and divide-by-K modules, are shown in Figure 8. The values are defined in Eq. (5), where "S ≫ n" denotes a right shift of S by n bits; for α, |α| × S = S + (S ≫ 1) + (S ≫ 4) + (S ≫ 5). The predict step of the first lifting stage generates the odd and even inputs after a delay of one clock cycle. The even value is added to the past even input sample (s_i^0, s_{i+1}^0). The multiplication by the first filter coefficient is then performed during the second and third clock cycles using only shift and add operations. After the fourth clock cycle, the multiplier result is combined with the odd input sample (d_i^0) to obtain the updated coefficient (d_i^1). At the end of the fifth clock cycle, the present predict value (d_i^1) and the past predict value (d_{i-1}^1), together with the past even input (s_i^0), provide the first update value (d_i^2). An addition is the only operation required in each clock cycle, so the critical path is a single adder delay. Both the predict and update phases of both stages are fully pipelined to increase speed. The overall lifting implementation comprises four shifters and seven adders/subtractors. Moreover, the second lifting stage has eight shifters and ten adders in total. The detailed process is defined in Eq. (6).
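The shift-and-add expression for |α| can be checked numerically. The sketch below is only an arithmetic illustration on integer (fixed-point) samples; the function name is hypothetical, and the reference value |α| ≈ 1.586134342 is the commonly quoted 9/7 lifting coefficient, which is an assumption here because the text above does not state it.

```python
ALPHA_ABS = 1.586134342  # commonly quoted |alpha| for 9/7 lifting (assumed)


def mul_alpha_abs(s):
    """Approximate |alpha| * s using shifts and adds only.

    s + (s >> 1) + (s >> 4) + (s >> 5) equals s * (1 + 1/2 + 1/16 + 1/32)
    = s * 1.59375 when s is a multiple of 32; otherwise the shifts truncate.
    """
    return s + (s >> 1) + (s >> 4) + (s >> 5)


# Effective multiplier realised by the shift-and-add network
effective = mul_alpha_abs(1024) / 1024
print(effective)                                   # 1.59375
print(abs(effective - ALPHA_ABS) / ALPHA_ABS)      # relative error, about 0.5 %
```

The design trade-off is visible in the last line: three shifters and three adders replace a full multiplier at the cost of a relative coefficient error of roughly half a percent. Adding further shifted terms would reduce the error but lengthen the adder chain.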
For the inverse DWT, the same alpha, beta, gamma, and delta modules are used as discussed earlier, but the multiply module is applied to the detailed coefficients and the divide module to the approximate coefficients. The equations are then applied in reverse order to recover the original audio samples. Eight samples are considered for the DWT, so eight samples are obtained after the inverse DWT. All the steps of the inverse DWT equations are given below.
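A floating-point reference model is a convenient way to check a forward/inverse 9/7 lifting pair before moving to shift-and-add hardware. The following sketch is not the architecture of this chapter: the coefficient values are the commonly quoted 9/7 lifting constants (assumed, since they are not listed in the text), periodic boundary extension is assumed, and the function names are hypothetical. The scaling follows the description above: the forward transform multiplies the approximation by K and divides the detail by K, and the inverse does the opposite.

```python
# Commonly quoted 9/7 lifting constants (assumed, not taken from the text)
ALPHA, BETA = -1.586134342, -0.05298011854
GAMMA, DELTA = 0.8829110762, 0.4435068522
K = 1.149604398


def _lift_d(s, d, c):
    """d[i] += c * (s[i] + s[i+1]) with periodic wrap-around."""
    n = len(s)
    return [d[i] + c * (s[i] + s[(i + 1) % n]) for i in range(n)]


def _lift_s(s, d, c):
    """s[i] += c * (d[i-1] + d[i]) with periodic wrap-around."""
    return [s[i] + c * (d[i - 1] + d[i]) for i in range(len(s))]


def dwt97(x):
    """Forward 9/7 lifting on an even-length frame: (approx, detail)."""
    s, d = list(x[0::2]), list(x[1::2])   # split
    d = _lift_d(s, d, ALPHA)              # P1
    s = _lift_s(s, d, BETA)               # U1
    d = _lift_d(s, d, GAMMA)              # P2
    s = _lift_s(s, d, DELTA)              # U2
    return [K * v for v in s], [v / K for v in d]


def idwt97(approx, detail):
    """Inverse: undo scaling, then reverse the steps with opposite signs."""
    s = [v / K for v in approx]
    d = [v * K for v in detail]
    s = _lift_s(s, d, -DELTA)
    d = _lift_d(s, d, -GAMMA)
    s = _lift_s(s, d, -BETA)
    d = _lift_d(s, d, -ALPHA)
    out = [0.0] * (2 * len(s))
    out[0::2], out[1::2] = s, d
    return out
```

With an eight-sample frame, `idwt97(*dwt97(frame))` returns the frame up to floating-point rounding, which gives a baseline against which a fixed-point or shift-and-add version can be compared.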
The proposed design of the DWT and inverse DWT helps to obtain an efficient audio watermarking algorithm (Figure 9).

Simulation and result
The results are first obtained in MATLAB, and the hardware implementation is then carried out to verify real-time operation.

MATLAB
Experiments are performed in MATLAB 2010a. The proposed algorithm is evaluated on classical music, pop music, and speech audio clips [20]. These three types of audio clips are considered because they have different characteristics, perceptual properties, and energy distributions. The audio signals also differ in selected features such as low energy, pulse clarity, pitch (in Hz), inharmonicity, sampling rate (in Hz), zero crossing rate (per second), spectral irregularity, temporal length (seconds/sample), tempo (in bpm), rms energy, etc. Each audio sample is a 16-bit mono file in WAVE format with a sampling rate of 44.1 kHz. The watermark is a binary image of 30 × 30 bits, as shown in Figure 10. The synchronization code is a 16-bit Barker code of value "1111100110101110". The wavelet is applied with two decomposition levels. The array size m is 50, and the quantization step size Δ ranges from 0.15 for the speech audio up to 0.6 for the pop audio signal. The performance of audio watermarking algorithms is quantified by robustness, payload, and inaudibility [21].
Inaudibility: inaudibility is measured using the signal-to-noise ratio (SNR), which quantifies the similarity between the distorted watermarked audio signal and the undistorted original audio signal. SNR is calculated as in Eq. (10), where f_i is the original audio signal and f_i' is the watermarked audio signal. It measures the noise introduced by the watermark and therefore defines the inaudibility.
Robustness: the normalized correlation (NC) measures the similarity between the original and the extracted watermark, where w is the original watermark, w' is the extracted watermark, and i and j are the indices of the watermark image. Ideally, NC should be equal to 1. The robustness is also measured using the bit error rate (BER), as in Eq. (9).

(9)
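The three quality measures can be sketched in Python as follows. The defining equations are not reproduced in this text, so the sketch assumes the usual textbook forms: SNR as ten times the base-10 logarithm of signal energy over error energy, NC as the inner product of the two watermarks divided by the product of their norms, and BER as the fraction of bits that differ. The function names are hypothetical.

```python
import math


def snr_db(original, watermarked):
    """SNR in dB between original samples f_i and watermarked samples f_i'."""
    signal = sum(f * f for f in original)
    noise = sum((f - g) ** 2 for f, g in zip(original, watermarked))
    if noise == 0:
        return float("inf")           # identical signals: no distortion
    return 10.0 * math.log10(signal / noise)


def normalized_correlation(w, w_ext):
    """NC between original watermark w and extracted watermark w' (2-D lists)."""
    a = [v for row in w for v in row]
    b = [v for row in w_ext for v in row]
    num = sum(p * q for p, q in zip(a, b))
    den = math.sqrt(sum(p * p for p in a) * sum(q * q for q in b))
    return num / den if den else 0.0


def bit_error_rate(bits, bits_ext):
    """Fraction of watermark bits that were extracted incorrectly."""
    errors = sum(1 for p, q in zip(bits, bits_ext) if p != q)
    return errors / len(bits)
```

For the 30 × 30 watermark used here, `bit_error_rate` would be called on the 900 flattened bits, and multiplying its result by 100 gives the BER as a percentage.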
Different attacks are considered to measure the robustness of the proposed algorithm. The signal processing attacks are described below, and the results for all three audio signals are given in Table 1.
i. Echo addition: an echo signal (with a decay of 41% and a delay of 98 ms) is added to the watermarked audio signal.
j. Denoising: the watermarked audio signal is denoised with the "automatic click remover" function available in Adobe Audition 3.0.
k. Pitch shifting: this is the most difficult attack for audio watermarking algorithms, because it causes frequency fluctuation. In the results, the pitch is shifted by one degree higher and one degree lower.
The data payload of the proposed algorithm is 220 bps.

Comparison with related work
A general comparison between the proposed method and two similar methods [23] is given in Table 2. According to the results reported in Table 2, the proposed algorithm has a higher embedding capacity and a lower BER. The proposed algorithm could achieve still higher robustness by reducing the payload, which can be done by increasing the decomposition level of the wavelet transform or the length of the array. The embedding strength would increase with that approach.

Hardware results
The architecture is designed and implemented in Verilog HDL, targeting a Virtex-5 XC5VLX20T-2FF323 FPGA, and is synthesized with Xilinx ISE 14.7. Each input and output is represented in IEEE 754 single-precision (SP) format because of the complex calculations during the DWT and inverse DWT. Tables 3 and 4 provide the hardware utilization of the targeted FPGA, and Table 5 gives the total computation delay. After synthesis, the proposed audio watermarking scheme is validated for real-time performance by FPGA prototyping.
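When preparing test vectors for a design whose ports use IEEE 754 single-precision words, it can help to convert audio samples to their 32-bit patterns in software. The sketch below shows one way to do this with Python's standard `struct` module; the helper names are hypothetical, and the idea of feeding the hex words to a simulation testbench is an assumption, not a description of the authors' flow.

```python
import struct


def float_to_sp_hex(value):
    """IEEE 754 single-precision bit pattern of value as 8 hex digits."""
    return struct.pack(">f", value).hex().upper()


def sp_hex_to_float(word):
    """Decode an 8-hex-digit single-precision word back to a float."""
    return struct.unpack(">f", bytes.fromhex(word))[0]


# Normalised 16-bit audio samples converted to 32-bit words
samples = [0.5, -1.0, 0.0]
print([float_to_sp_hex(s) for s in samples])
# ['3F000000', 'BF800000', '00000000']
```

Values that are not exactly representable in single precision are rounded by `struct.pack`, so decoding a word gives the rounded value rather than the original double-precision number.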

Conclusion
An audio watermarking algorithm is proposed for different audio applications. The algorithm uses the DWT during the watermarking process. It supports blind detection and performs well against the attacks considered. The discrete wavelet transform (DWT) represents the data points in terms of wavelets at various frequencies. In this chapter, a hardware architecture for DWT-based digital audio watermarking is also developed for real-time applications. Digital audio watermarking is used in many applications, such as ownership protection, tamper detection and localization, and media forensics. The analysis above shows that the proposed algorithm gives a good SNR, and hence high inaudibility, and an NC that is almost equal to 1. Various attacks are considered to check the robustness of the algorithm, and it shows strong resistance to many of them. FPGA prototyping is carried out to measure the hardware performance for real-time applications.