Synchronization is an unavoidable issue in every wireless receiver and plays a key role in system performance. It comprises timing synchronization and frequency synchronization. Timing synchronization detects a valid packet in noise and finds the accurate start position of the fast Fourier transform (FFT) window. Frequency synchronization corrects the phase error caused by the mismatch between the local oscillators (LO) of the transmitter and the receiver.
Synchronization techniques have been studied extensively for years. Although an ultra-wideband (UWB) system can build on the successful experience of orthogonal frequency division multiplexing (OFDM), it cannot reuse traditional synchronization techniques directly because of its distinct features. In the IEEE 802.15.3a standard, the specified emission power spectral density is only -41 dBm/MHz, which is extremely low compared with other wireless systems. Timing synchronization for UWB must therefore be robust in a high-noise environment. In addition, to sustain a 528 Msps throughput, the UWB baseband receiver has to be designed with a parallel architecture. The inherent high complexity, together with the requirements of high performance, high speed, low cost and low power consumption, makes the design of synchronization blocks for UWB quite challenging.
This chapter is divided into three parts: timing synchronization, coarse frequency synchronization and fine frequency synchronization. For each part, the traditional algorithms and newer methods with low complexity and good performance are introduced, together with an architecture design.
As soon as the receiver starts up, it searches the received signal for an OFDM-based UWB packet. Packet detection usually acquires only rough timing information by exploiting the repetition in the received signal. Accurate timing information, such as the symbol boundary or the start position of the FFT window, is also necessary; it is obtained by matching the received waveform against the preamble waveform with a matched filter.
2.1. Effects of timing offset
Assume that the maximum channel delay is shorter than the guard interval. The FFT window can then fall in one of several regions, as shown in Fig. 1. The exact start position of the FFT window is at the boundary between regions B and C. If the window starts in region B, the signal inside it is not contaminated by the previous symbol, so no inter-symbol interference (ISI) occurs; the only effect is a phase shift. After demodulation, the received signal with a timing offset in region B is expressed in (1).
$$R_{k,l} = S_{k,l} H_{k,l}\, e^{-j2\pi \Delta n / N} + W_{k,l} \tag{1}$$
where $S_{k,l}$, $H_{k,l}$ and $W_{k,l}$ are the transmitted signal, the channel impulse response (CIR) and the noise, respectively, at the k-th subcarrier of the l-th symbol in the frequency domain, and $\Delta n$ is the delay, in samples, relative to the correct FFT window position.
Figure 1. The scenario of timing offset
When the FFT window leads or lags by a large degree, such as in region A or C, ISI will be introduced and both the magnitude and the phase of the received signal will be distorted, as shown in (2).
$$R_{k,l} = S_{k,l} H_{k,l}\, e^{-j2\pi \Delta n / N}\, \frac{N - \Delta n}{N} + W_{k,l} + W_{ISI} \tag{2}$$
where $W_{ISI}$ is the noise introduced by ISI. In addition to the ISI and the phase rotation, the factor $(N - \Delta n)/N$ slightly attenuates the signal magnitude.
2.2. Timing synchronization algorithms
Timing synchronization can be divided into two categories: coarse timing synchronization and fine timing synchronization. Coarse timing synchronization is usually based on auto-correlation (AC), while fine timing synchronization is based on cross-correlation (CC). The traditional algorithms of AC, maximum likelihood (ML), minimum mean square error (MMSE) and CC will be introduced.
Auto-correlation
The AC algorithm (Schmidl & Cox, 1997) for coarse timing synchronization is straightforward: it searches for the repetition in the received signal with a correlator and a maximum searcher. Let L denote the length of the repetition interval and $r_n$ the received signal in the time domain. The timing metric is defined as
$$M(n) = \frac{\left|\sum_{k=0}^{L-1} r^*_{n+k}\, r_{n+k+L}\right|^2}{\left(\sum_{k=0}^{L-1} \left|r_{n+k+L}\right|^2\right)^2} \tag{3}$$
where * denotes complex conjugation. The time index that maximizes M(n) is
$$\hat{n} = \arg\max_n M(n) \tag{4}$$
If the maximum of M(n) exceeds the threshold, a packet is present and the estimated time index marks the symbol boundary. The drawback of this scheme is that, when the correlation window moves away from the repeated period, the timing metric M(n) may not fall off as expected, especially at low signal-to-noise ratio (SNR). In that case the detected symbol boundary may have a large error.
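As an illustration of (3) and (4), the following minimal sketch computes the timing metric on a noise-free toy signal. The chirp used as the repeated segment, the window length L = 16 and the threshold 0.5 are assumptions chosen for the example, not values from the UWB standard; a real received signal would also contain noise and payload after the preamble.

```python
import cmath


def timing_metric(r, L):
    """Timing metric M(n) of (3) for every n where a 2L-sample window fits."""
    metric = []
    for n in range(len(r) - 2 * L + 1):
        # Correlate the first half of the window with the half L samples later.
        corr = sum(r[n + k].conjugate() * r[n + k + L] for k in range(L))
        # Energy of the second half, used for normalization.
        energy = sum(abs(r[n + k + L]) ** 2 for k in range(L))
        metric.append(abs(corr) ** 2 / energy ** 2 if energy > 0 else 0.0)
    return metric


def detect_packet(r, L, threshold):
    """Index maximizing M(n) as in (4), or None if the peak is below threshold."""
    metric = timing_metric(r, L)
    n_hat = max(range(len(metric)), key=metric.__getitem__)
    return n_hat if metric[n_hat] > threshold else None


L = 16
start = 40
# Hypothetical preamble segment: a unit-magnitude chirp, sent twice.
segment = [cmath.exp(0.3j * k * k) for k in range(L)]
r = [0j] * start + segment + segment
n_hat = detect_packet(r, L, threshold=0.5)
silence = detect_packet([0j] * 64, L, threshold=0.5)
print(n_hat, silence)  # prints: 40 None
```

In this idealized case the metric rises as ((16 - j)/16)^2 while the window slides into the repeated segments and reaches 1 exactly at the start of the first segment.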
Maximum likelihood
In the ML metric, $\sigma_s^2/\sigma_n^2$ is the SNR. The estimated symbol boundary is found by searching for the maximum output of the ML function. ML is costly in practice because it needs an SNR estimate, which is difficult to obtain, and errors in that estimate make the system less reliable.
Minimum mean square error
The MMSE metric (Minn et al., 2003) is equivalent to a special case of the ML metric with ρ = 1 and shows almost the same timing estimation performance as ML. The principle is to search for the minimum output of the metric, as shown in (7).
For the AC, ML and MMSE algorithms, when the preamble has more than two identical segments, the correlator output shows a plateau or a wide basin. In theory, the plateau or basin indicates the ISI-free region for the FFT window. However, noise in the received signal may cause the maximum or minimum to drift away from the optimal point. AC, ML and MMSE are therefore methods for coarse packet detection; finding the accurate symbol boundary or FFT window requires fine timing synchronization, such as CC.
Cross-correlation
CC is the mechanism for fine timing synchronization. Instead of correlating the noisy received waveform with a delayed version of itself, CC correlates the received signal with the known preamble waveform (Fort et al., 2003). It is therefore suited to low-SNR conditions and can be expressed as
$$M(n) = \sum_{k=0}^{Q-1} r_{n+k}\, c_k^* \tag{8}$$
where $c_k$ are the preamble coefficients and Q is the length of the preamble.
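A minimal sketch of (8) follows. The preamble here is a hypothetical unit-magnitude chirp of length Q = 32 rather than the preamble defined by the standard, and the received signal is noise-free, so the example only shows how the correlation peak marks the symbol boundary.

```python
import cmath


def cross_correlation(r, c):
    """M(n) of (8): correlate the received samples with the known preamble."""
    Q = len(c)
    return [
        sum(r[n + k] * c[k].conjugate() for k in range(Q))
        for n in range(len(r) - Q + 1)
    ]


def fine_timing(r, c):
    """Sample index where |M(n)| peaks, taken as the symbol boundary."""
    m = cross_correlation(r, c)
    return max(range(len(m)), key=lambda n: abs(m[n]))


Q = 32
offset = 10
# Hypothetical preamble: a unit-magnitude chirp (not the standard's sequence).
preamble = [cmath.exp(1j * cmath.pi * k * k / Q) for k in range(Q)]
received = [0j] * offset + preamble + [0j] * 10
boundary = fine_timing(received, preamble)
print(boundary)  # prints: 10
```

At the true boundary the correlation equals the preamble energy Q; at any other lag fewer than Q preamble samples overlap the window, so by the Cauchy-Schwarz inequality the magnitude is strictly smaller.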
Dual-threshold detection
The dual-threshold (DT) detection scheme is based on the idea in (Fan et al., 2009) for OFDM-based UWB systems. Fig. 2 shows its block diagram. Signal detection is divided into two steps. The first step is based on the CC algorithm. The peak CC energy of each symbol is expressed as
$$T_1 = \max_n |A_n|^2 = \max_n \left|\sum_{k=0}^{N-1} r[n+k]\, c^*[k]\right|^2 \tag{9}$$
$$\sum_{k=0}^{N-1} c^2[k] = N \tag{10}$$
where $A_n$ is the moving sum of the CC values, c[k] are the preamble coefficients, r[k] is the received signal and N is the FFT size. If the peak CC energy $T_1$ exceeds the first threshold, the estimated sample index of the symbol boundary and the moving sum are stored in a FIFO for later use by the auto-correlator. Otherwise, the peak CC energy of the next symbol is calculated.
Figure 2. Block diagram of DT detection scheme
The second step reads the moving sum from the FIFO and correlates it with its delayed version. The energy of the cascaded auto-correlator is
$$T_2 = \left|A_{\hat{n}}\, A^*_{\hat{n}+6M}\right|^2 \tag{11}$$
where M is the length of the repeated preamble interval in the UWB system. The delay of the auto-correlator is determined by the period of the time frequency code (TFC). To ensure that the moving sum and its delayed version lie in the same band whatever TFC mode is used, the delay is set to six symbols. If the output energy $T_2$ of the cascaded auto-correlator exceeds the second threshold, the packet is detected. Otherwise, the next value is fetched from the FIFO and $T_2$ is calculated again.
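The two steps of (9) and (11) can be sketched as below. This is a simplified software model under stated assumptions: a single band, a hypothetical 8-sample chirp preamble repeated every M = 10 samples, illustrative thresholds, and direct recomputation of the moving sum instead of the FIFO used in the hardware. The second test signal contains one isolated preamble-like burst, which passes the first threshold but is rejected by the second.

```python
import cmath


def moving_sum(r, c, n):
    """A_n of (9): cross-correlation of r with the preamble c at index n."""
    return sum(r[n + k] * c[k].conjugate() for k in range(len(c)))


def dual_threshold_detect(r, c, M, th1, th2):
    """Return the detected symbol-boundary index, or None if no packet is found."""
    N = len(c)
    last = len(r) - N - 6 * M  # largest n for which A[n + 6M] can be computed
    for sym_start in range(0, last + 1, M):
        # Step 1: peak CC energy T1 within one symbol period.
        candidates = range(sym_start, min(sym_start + M, last + 1))
        n_hat = max(candidates, key=lambda n: abs(moving_sum(r, c, n)) ** 2)
        a_now = moving_sum(r, c, n_hat)
        if abs(a_now) ** 2 <= th1:
            continue  # below the first threshold: try the next symbol
        # Step 2: cascaded auto-correlation with the value six symbols later.
        a_late = moving_sum(r, c, n_hat + 6 * M)
        t2 = abs(a_now * a_late.conjugate()) ** 2
        if t2 > th2:
            return n_hat
    return None


N, M, offset = 8, 10, 13
c = [cmath.exp(1j * cmath.pi * k * k / N) for k in range(N)]  # toy preamble
symbol = c + [0j] * (M - N)
packet = [0j] * offset + symbol * 8            # preamble repeated 8 times
glitch = [0j] * offset + c + [0j] * (8 * M - N)  # one isolated burst only
th1, th2 = 0.5 * N ** 2, 0.25 * N ** 4         # illustrative thresholds
detected = dual_threshold_detect(packet, c, M, th1, th2)
rejected = dual_threshold_detect(glitch, c, M, th1, th2)
print(detected, rejected)  # prints: 13 None
```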
Figure 3. Output waveforms of ML and MMSE algorithms at 10 dB SNR
Figure 4. Output waveforms of CC and DT algorithms at 10 dB SNR
Fig. 3 depicts the output waveforms of the ML and MMSE algorithms at 10 dB SNR. The plateaus and basins in these waveforms make the peak energy ambiguous. Accurate timing information is much easier to find in the CC output waveform in Fig. 4. However, the CC waveform contains glitches, which corrupt the detection of the symbol boundary and increase the false alarm probability. The DT waveform has a much lower noise floor than CC and shows no glitches.
2.3. Architecture of the matched filter
The matched filter is the basic component of timing synchronization; it detects a known piece of signal in noise. Its architecture determines the complexity and the power consumption of the timing synchronizer. An optimized matched filter architecture for OFDM-based UWB is shown in Fig. 5. To sustain the 528 Msps throughput, the UWB baseband receiver is designed at a 132 MHz clock frequency with four parallel paths and a twelve-level pipeline. For low complexity, both the received signal and the preamble coefficients are truncated to their sign bits, so that the five-bit multipliers can be replaced with XNOR gates. In addition, the 128 sign bits of the preamble coefficients are generated by spreading a 16-sign-bit sequence with an 8-sign-bit sequence as follows
$$\operatorname{sgn}\left(c_{16(j-1)+i}\right) = a_i \times b_j, \qquad i = 1, 2, \ldots, 16, \quad j = 1, 2, \ldots, 8 \tag{12}$$
where $a_i$ and $b_j$ are 1 or -1. According to (12), the 128-tap matched filter can be decomposed into a 16-tap filter cascaded with an 8-tap filter, as shown in Fig. 5. With this decomposition, the processing period of the matched filter is reduced to 19% and the length of the circular shift register is reduced to 20. In the CC operation, when the shift register is full, the data at addresses [5:20] are shifted to [1:16] and the four incoming sign bits are saved at addresses [17:20]. The data at addresses [1:16], [2:17], [3:18] and [4:19] are distributed to the four parallel data paths and cross-correlated with the coefficients $a_i$. This architecture not only guarantees high speed but also reduces the hardware cost.
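The decomposition implied by (12) can be checked numerically. The sketch below compares a direct 128-tap sign correlator with a 16-tap stage followed by an 8-tap stage; the sequences a and b are random placeholders, not the actual preamble sequences, and the product of two ±1 values stands in for the XNOR of two sign bits.

```python
import random


def direct_filter(x, a, b):
    """128-tap correlation with the spread coefficients of (12)."""
    c = [ai * bj for bj in b for ai in a]  # c[16*j + i] = a[i] * b[j], 0-based
    taps = len(c)
    return [
        sum(x[n + m] * c[m] for m in range(taps))
        for n in range(len(x) - taps + 1)
    ]


def cascaded_filter(x, a, b):
    """Same output from a 16-tap stage followed by an 8-tap stage."""
    la, lb = len(a), len(b)
    # Stage 1: correlate with the short sequence a.
    u = [sum(x[n + i] * a[i] for i in range(la)) for n in range(len(x) - la + 1)]
    # Stage 2: combine stage-1 outputs spaced 16 samples apart, weighted by b.
    span = la * (lb - 1)
    return [
        sum(b[j] * u[n + la * j] for j in range(lb))
        for n in range(len(u) - span)
    ]


rng = random.Random(1)
# Placeholder +/-1 sequences; the real a_i and b_j come from the UWB preamble.
a = [rng.choice((-1, 1)) for _ in range(16)]
b = [rng.choice((-1, 1)) for _ in range(8)]
x = [rng.choice((-1, 1)) for _ in range(400)]  # sign bits of received samples

same = direct_filter(x, a, b) == cascaded_filter(x, a, b)
tap_ratio = (len(a) + len(b)) / (len(a) * len(b))
print(same, tap_ratio)  # prints: True 0.1875
```

The tap ratio (16 + 8)/128 = 18.75% is consistent with the reduction to about 19% quoted above.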
An OFDM-based UWB system is sensitive to carrier frequency offset (CFO), which can be estimated and compensated by coarse frequency synchronization in the time domain. Due to the Doppler effect, even a very small CFO leads to a serious accumulated phase shift after a certain period.
3.1. Effects of carrier frequency offset
Define the normalized CFO, $\varepsilon_f = \Delta f / f_s$, as the ratio of the CFO to the subcarrier frequency spacing. The received signal with CFO in the frequency domain can be expressed as (Moose, 1994)
where $S_{k,l}$, $H_{k,l}$ and $W_{k,l}$ stand for the transmitted signal, the channel impulse response and the noise, respectively, at the k-th subcarrier of the l-th symbol, and $W_{ICI}$ is the noise contributed by inter-carrier interference (ICI). ICI not only destroys the orthogonality of the subcarriers in an OFDM-based UWB system but also degrades the SNR. The SNR degradation can be approximated as (Pollet et al., 1995)
$$D_{SNR} \approx \frac{10}{3\ln 10}\left(\pi \varepsilon_f\right)^2 \frac{E_s}{N_o} \tag{14}$$
where $E_s/N_o$ is the ratio of symbol energy to noise power spectral density.
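To give a feel for the magnitudes in (14), the short sketch below evaluates the approximation for two normalized CFO values. It assumes that $E_s/N_o$ enters (14) as a linear ratio and that the result is in dB; the operating point of 10 dB is an arbitrary choice for illustration.

```python
import math


def snr_degradation_db(eps_f, es_no_db):
    """SNR degradation of (14) in dB for a normalized CFO eps_f."""
    es_no = 10 ** (es_no_db / 10)  # convert Es/No from dB to a linear ratio
    return 10 / (3 * math.log(10)) * (math.pi * eps_f) ** 2 * es_no


for eps in (0.01, 0.04):
    print(eps, round(snr_degradation_db(eps, 10.0), 4))
```

Because the degradation grows with the square of the normalized CFO, quadrupling the offset from 0.01 to 0.04 multiplies the loss by 16.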
3.2. Frequency synchronization algorithm
The most straightforward frequency synchronization algorithm is based on AC functions. The CFO can be estimated from the phase difference between two symbols. For a traditional OFDM system, the CFO can be estimated as
$$\hat{\varepsilon}_f = \frac{N}{2\pi M}\tan^{-1}\left(\sum_{k=0}^{N-1} r^*_{n+k}\, r_{n+k+M}\right) \tag{15}$$
where N is the FFT size and M is the interval between two symbols. If the traditional AC algorithm is applied to a UWB system, the sliding window length (SWL) is 128, and a four-parallel architecture with an SWL of 128 has high complexity. Shortening the SWL reduces the complexity but degrades the estimation performance. To improve performance at low complexity, an optimized AC algorithm shortens the SWL to 64 and averages the sum over three symbols located in three different subbands, as expressed in (16).
where L denotes the SWL of each symbol. The values of $G_i$ (i = 1, 2) depend on the TFC. If the TFC is {1 2 3 1 2 3} or {1 3 2 1 3 2}, $G_1$ = 3 and $G_2$ = 1; if the TFC is {1 1 2 2 3 3} or {1 1 3 3 2 2}, $G_1$ = 1 and $G_2$ = 2.
Although the SWL can be reduced further for lower complexity, the resulting performance degradation requires averaging over a much longer period to compensate. As a tradeoff among complexity, performance and processing period, L = 64 is the best choice. Fig. 6 compares the MSE performance for different SWLs with the normalized CFO set to 0.01. Thanks to the averaging over three subbands, the optimized AC algorithm with an SWL of 64 performs better than the traditional AC algorithm with an SWL of 128. The optimized AC algorithm with an SWL of 32 does not perform as well as the traditional AC algorithm with an SWL of 128; it needs a longer averaging period to compensate for the degradation.
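The traditional estimator (15) can be sketched as follows, with the SWL as a parameter. The symbol interval M = 165 samples, the chirp used as the repeated symbol and the noise-free channel are assumptions for illustration; the three-subband averaging of the optimized algorithm is not modeled here.

```python
import cmath
import math


def estimate_cfo(r, n, N, M, swl=None):
    """Normalized CFO estimate of (15) from two symbols M samples apart."""
    swl = N if swl is None else swl  # sliding window length (SWL)
    acc = sum(r[n + k].conjugate() * r[n + k + M] for k in range(swl))
    # cmath.phase plays the role of the arctangent of the complex sum.
    return N / (2 * math.pi * M) * cmath.phase(acc)


N, M = 128, 165          # FFT size and assumed symbol interval in samples
true_cfo = 0.01          # normalized CFO, as in the Fig. 6 experiment
# Hypothetical preamble symbol: chirp of N samples padded with zeros to M.
symbol = [cmath.exp(1j * math.pi * k * k / N) for k in range(N)] + [0j] * (M - N)
clean = symbol + symbol
# Apply the CFO as a phase ramp of 2*pi*eps/N radians per sample.
r = [s * cmath.exp(2j * math.pi * true_cfo * i / N) for i, s in enumerate(clean)]

full = estimate_cfo(r, 0, N, M)           # SWL = 128
short = estimate_cfo(r, 0, N, M, swl=64)  # SWL = 64
print(full, short)
```

Without noise both window lengths recover the offset; the difference between them only appears in noise, where a shorter window averages fewer products.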
For UWB, the CFO compensation algorithm can be optimized as well. The basic idea is to treat the CFO values on the four parallel paths as identical when the differences among them are very small (Fan & Choy, 2010a). In the UWB specification, the center frequency is about 4 GHz and the maximum impairment of the clock synthesizer is ±20 ppm (parts per million). The normalized CFO is therefore less than 0.04, and the maximum CFO difference between any two parallel samples is less than $2.5 \times 10^{-4}$, which is small enough to be ignored. The optimized CFO compensation scheme can be expressed as
where 4(m-1)+q is the sample index. This compensation strategy not only reduces the four parallel digital synthesizers to one but also lightens the workload of the phase accumulator.
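The bound on the normalized CFO can be checked with a short calculation. It assumes that the ±20 ppm tolerance applies independently to the transmitter and the receiver, so that the worst-case mismatch is 40 ppm, and that the subcarrier spacing equals the 528 MHz sampling rate divided by the 128-point FFT size; neither assumption is stated explicitly in the text.

$$\Delta f_{\max} \approx 4\ \text{GHz} \times 40 \times 10^{-6} = 160\ \text{kHz}, \qquad f_s = \frac{528\ \text{MHz}}{128} = 4.125\ \text{MHz}, \qquad \varepsilon_{f,\max} = \frac{\Delta f_{\max}}{f_s} \approx 0.039 < 0.04$$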
Figure 6. MSE performance comparison with different SWL
3.3. Implementation of frequency synchronizer
The design of the frequency synchronizer is divided into two parts. The first part estimates the phase difference between two preambles by AC and an arctangent calculation. The second part compensates the signal by multiplying it with a complex rotation vector; this part involves the phase accumulator and the sin/cos generator.
Fig. 7 shows the architecture of the CFO compensation block. The phase accumulator produces a digital sweep with a slope proportional to the input phase. The phase offset is scaled from [0, 2π] to [0, 8] by multiplying it by 4/π, so that the three most significant bits (MSBs) alone select the phase offset region. During CFO compensation, sine and cosine values need to be calculated only for phase offsets in the range [0, π/4]. For phase offsets in other ranges, input complement, output complement or output swap is applied accordingly.
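The octant folding described above can be sketched in floating point as follows. The assignment of input complement, output swap and output complement to each octant is derived here from the trigonometric symmetries and is an assumption; the exact control logic of the hardware in Fig. 7 is not given in the text, and `sincos_first_octant` is a stand-in for whichever function evaluator is used.

```python
import math


def sincos_first_octant(phi):
    """Stand-in for the hardware evaluator, valid only for phi in [0, pi/4]."""
    return math.sin(phi), math.cos(phi)


def sincos_by_octant(theta):
    """sin and cos of any phase, using only first-octant evaluations."""
    scaled = (theta % (2 * math.pi)) * 4 / math.pi  # [0, 2*pi) -> [0, 8)
    octant = min(int(scaled), 7)                    # the three MSBs
    frac = scaled - octant                          # position inside the octant
    if octant % 2 == 1:
        frac = 1.0 - frac                           # input complement
    s, c = sincos_first_octant(frac * math.pi / 4)
    if octant in (1, 2, 5, 6):
        s, c = c, s                                 # output swap
    if octant >= 4:
        s = -s                                      # output complement (sine)
    if octant in (2, 3, 4, 5):
        c = -c                                      # output complement (cosine)
    return s, c


worst = 0.0
for i in range(1000):
    theta = 2 * math.pi * i / 1000
    s, c = sincos_by_octant(theta)
    worst = max(worst, abs(s - math.sin(theta)), abs(c - math.cos(theta)))
print(worst < 1e-12)  # prints: True
```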
In the design of the frequency synchronizer, the implementation of the arctangent, sine and cosine functions is the most critical task, since it determines the complexity of the synchronizer and the performance of the UWB receiver. Traditional OFDM-based or CDMA-based systems usually employ the classic coordinate rotation digital computer (CORDIC) algorithm for function evaluation (Tsai & Chiueh, 2007; Troya et al., 2008). Other techniques for function evaluation exist, such as the polynomial hyperfolding technique (PHT) (Caro et al., 2004), the piecewise-polynomial approximation (PPA) technique (Caro & Strollo, 2005), the hybrid CORDIC algorithm (Caro et al., 2009) and the multipartite table method (MTM) (Caro et al., 2008).
Figure 7. Architecture of the CFO compensation block
Polynomial hyperfolding technique
PHT calculates the sine and cosine functions using optimized polynomial expressions with constant coefficients. The sine and cosine functions can be expressed by polynomials of degree K.
where 0 ≤ x < 1 is the scaled input of the sine and cosine functions. Optimization is conducted on the second-order (K = 2) and third-order (K = 3) approximating polynomials, expressed as (19) and (20) respectively (Caro et al., 2004). The second-order PHT achieves a spurious-free dynamic range (SFDR) of about 60 dBc, while the third-order PHT achieves 80 dBc.
Piecewise-polynomial approximation
The PPA technique subdivides the input interval into shorter subintervals and uses a polynomial of a given degree in each subinterval to approximate the trigonometric functions. The signal x represents the input phase scaled to a binary fraction in the interval [0, 1], which is subdivided into s subintervals, with $s = 2^u$. The u MSBs of x encode the segment starting point $x_k$ and are used as the address of the small lookup tables that store the polynomial coefficients. The remaining bits of x represent the offset $x - x_k$. The quadratic PPA of the sine and cosine functions can be expressed as (Caro & Strollo, 2005)
Fig. 8 shows the architecture of the sine and cosine blocks with PPA. The first-order and second-order coefficients are quantized with r bits and t bits respectively, and the constant coefficients with (Q - 1) bits. The input and output of the sine and cosine functions are represented by P bits and Q bits. The constant, linear and quadratic coefficients are read from ROMs for the polynomial calculation. The PPGen block generates the partial products used to compute the linear terms, and the carry-save addition tree adds the partial products together after aligning all the bits according to their weights.
Figure 8. Architecture of sine and cosine blocks with PPA (Caro & Strollo, 2005)
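The structure of a quadratic PPA (table lookup addressed by the MSBs, polynomial in the offset) can be sketched in floating point. The coefficients below are plain Taylor coefficients at each segment start, which is an assumption made for simplicity; the cited work optimizes and quantizes the coefficients, which this sketch does not attempt.

```python
import math

QUARTER = math.pi / 4  # x in [0, 1) maps to the angle x * pi/4


def build_roms(u):
    """Coefficient tables for s = 2**u segments (Taylor coefficients)."""
    s = 1 << u
    sin_rom, cos_rom = [], []
    for k in range(s):
        a = QUARTER * k / s  # angle at the segment start x_k
        sin_rom.append((math.sin(a), QUARTER * math.cos(a),
                        -0.5 * QUARTER ** 2 * math.sin(a)))
        cos_rom.append((math.cos(a), -QUARTER * math.sin(a),
                        -0.5 * QUARTER ** 2 * math.cos(a)))
    return sin_rom, cos_rom


def ppa_eval(rom, x, u):
    """Quadratic evaluation c0 + c1*(x - x_k) + c2*(x - x_k)**2."""
    s = 1 << u
    k = min(int(x * s), s - 1)  # the u MSBs of x address the ROM
    d = x - k / s               # remaining bits: offset inside the segment
    c0, c1, c2 = rom[k]
    return c0 + d * (c1 + d * c2)


u = 6  # s = 64 segments
sin_rom, cos_rom = build_roms(u)
worst = 0.0
for i in range(4096):
    x = i / 4096
    worst = max(worst,
                abs(ppa_eval(sin_rom, x, u) - math.sin(QUARTER * x)),
                abs(ppa_eval(cos_rom, x, u) - math.cos(QUARTER * x)))
print(worst < 1e-6)  # prints: True
```

With 64 segments each segment spans about 0.0123 rad, so the cubic Taylor remainder bounds the error by roughly 3 × 10^-7 before any quantization.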
Hybrid coordinate rotation digital computer
This approach splits the phase rotation in three steps. The first two steps are CORDIC-based with computing the rotation directions in parallel. The final step is multiplier-based (Caro et al., 2009).
Suppose the word lengths of the input vector [$X_{in}$, $Y_{in}$] and the output vector [$X_{out}$, $Y_{out}$] are 12 and 13 bits respectively. Represent the rotation phase φ ∈ [0, π/4] by a binary fractional value in [0, 1] as
$$\frac{4}{\pi}\varphi = f_1 2^{-1} + f_2 2^{-2} + \cdots + f_{13} 2^{-13} \tag{22}$$
The least significant bit (LSB) of φ has a weight denoted in the following as $\varphi_{LSB} = (\pi/4)\,2^{-13}$. In the first step, the phase is divided into two subwords, φ = α + β, where
The goal of the first stage is to perform a rotation by an angle close to α + $\varphi_{LSB}$/2. To that end, the first rotation uses the CORDIC algorithm, described by the following equations.
where $\sigma_i$ is equal to the sign of $Z_i$. The algorithm starts with $X_1 = X_{in}$, $Y_1 = Y_{in}$ and $Z_1 = \alpha + \varphi_{LSB}/2$.
The second and third stages rotate the output vector of the first stage by a phase γ = $Z_{residual}$ + β, which is represented with 11 bits. γ is then split into the sum of two subwords, $\gamma_1 + \gamma_2$, where
The operation performed in the final rotation block can be written as
$$\begin{cases} X_{out} = X_{T2}\cos\gamma_2 - Y_{T2}\sin\gamma_2 \\ Y_{out} = X_{T2}\sin\gamma_2 + Y_{T2}\cos\gamma_2 \end{cases} \tag{28}$$
where [$X_{T2}$, $Y_{T2}$] is the output vector of the second rotation. The absolute value of $\gamma_2$ is smaller than $2^{-6}$, so the sine and cosine functions can be approximated as sin $\gamma_2$ ≈ $\gamma_2$ and cos $\gamma_2$ ≈ 1.
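The effect of the small-angle approximation in the final stage can be checked numerically. The sketch below compares (28) evaluated exactly with its approximated form, assuming $\gamma_2$ is expressed in radians; the input vector is a hypothetical value, and only the multiplier-based final stage is modeled, not the two CORDIC stages.

```python
import math


def final_rotation(x, y, gamma2):
    """Multiplier-based final stage of (28) with sin(g) ~ g and cos(g) ~ 1."""
    return x - y * gamma2, x * gamma2 + y


def exact_rotation(x, y, gamma2):
    """Reference: (28) evaluated with the true sine and cosine."""
    c, s = math.cos(gamma2), math.sin(gamma2)
    return x * c - y * s, x * s + y * c


x_t2, y_t2 = 1000.0, -500.0   # hypothetical output of the second rotation
gamma2 = 2.0 ** -6            # largest magnitude allowed for gamma2
xa, ya = final_rotation(x_t2, y_t2, gamma2)
xe, ye = exact_rotation(x_t2, y_t2, gamma2)
error = math.hypot(xa - xe, ya - ye)
# Bound from the Taylor remainders of cos and sin.
bound = (gamma2 ** 2 / 2 + gamma2 ** 3 / 6) * math.hypot(x_t2, y_t2)
print(error <= bound)  # prints: True
```

The dominant error term is $\gamma_2^2/2 \le 2^{-13}$ relative to the vector magnitude, which is of the order of one LSB of a 13-bit output.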
The architecture of the hybrid CORDIC rotator is shown in Fig. 9. The elementary stage is composed of adders and shifters. The two final vector merging adders (VMAs) convert the results to two's complement representation.
Figure 9. Architecture of hybrid CORDIC technique (Caro et al., 2009)
Multipartite table method
MTM is a very effective lookup-table compression technique for function evaluation. It has been found well suited to high-performance synthesizers, requiring both a very small ROM and simple arithmetic circuitry (Caro et al., 2008). The principle of MTM is to decompose the Q-bit input signal x into K + 1 non-overlapping sub-words $x_0, x_1, \ldots, x_K$ with lengths $q_0, q_1, \ldots, q_K$ respectively, where $x = x_0 + x_1 + \cdots + x_K$ and $Q = q_0 + q_1 + \cdots + q_K$. The angle range [0, π/4] is scaled to a binary fraction in [0, 1]. A piecewise linear approximation of f(x) can be expressed as
The interval of x is divided into $2^{q_0}$ subintervals; $x_0$ represents the starting point of each subinterval and $x_1 + \cdots + x_K$ is the offset between x and $x_0$ within the subinterval. $\alpha_1$ is a sub-word of $x_0$ comprising its $p_1 \le q_0$ MSBs. Likewise, $\alpha_i$ (i = 2, …, K) is a sub-word of $x_0$ comprising its $p_i \le p_{i-1}$ MSBs. The term $A(x_0)$ can be realized with a ROM with $2^{q_0}$ entries, named the table of initial values (TIV). The terms $B(\alpha_i)\,x_i$ (i = 1, …, K) can be implemented with K ROMs, named tables of offsets ($TO_i$), with $2^{p_i+q_i}$ entries each. Making the TOs symmetric reduces the ROM size by a factor of two. Equation (29) then becomes
$$f(x) \simeq \tilde{A}(x_0) + B_1(\alpha_1)\left(x_1 - \frac{\delta_1}{2}\right) + \cdots + B_K(\alpha_K)\left(x_K - \frac{\delta_K}{2}\right) \tag{30}$$
where the coefficients can be calculated as follows (Caro et al., 2008).
The architecture of MTM with symmetric TOs is shown in Fig. 10. The content of the TOs is conditionally added to or subtracted from the content stored in the TIV. The addition or subtraction of the ROM contents and the complement operation on the inputs are controlled by the MSB of each sub-word.
Figure 10. Architecture of MTM with symmetric TOs
To give a fair comparison, the four techniques are each used to implement the CFO compensation block. The design parameters are chosen so that the SFDR of the four techniques is nearly the same. The inputs and outputs of the four algorithms are 12 bits. The power, area and latency of the four methods, synthesized with the UMC 0.13 μm high-speed library at a 132 MHz clock frequency, are listed in Table 1. Because MSE is a statistical value, it is not easy to make the MSEs of the four approaches exactly the same, but they are very close. With the smallest MSE, MTM outperforms the other algorithms in area, power and latency. Since MTM proves to be an efficient approach for function evaluation, it can also be applied to implement the arctangent function in the CFO estimation block.
| Technique | MTM | PPA | PHT | Hybrid CORDIC |
|---|---|---|---|---|
| Design parameter | q0 = 4, q1 = 2, q2 = 3, q3 = 3, p1 = 3, p2 = 3, p3 = 1 | s = 64, r = 6, t = 7 | K = 3 | (1) 4 rep., (2) 3 rep., (3) 8b × 8b |
| MSE (×10^-7) | 2.97 | 4.91 | 7.82 | 5.73 |
| Area (mm^2) | 0.018 | 0.027 | 0.031 | 0.146 |
| Power (mW) | 0.84 | 0.88 | 1.55 | 13.93 |
| Latency (clock cycles) | 3 | 3 | 4 | 6 |

Table 1. Synthesis performance comparison of CFO compensation with four techniques
Although the CFO can be coarsely estimated by the frequency synchronizer in the time domain, the residual CFO (RCFO), the sampling frequency offset (SFO) and the common phase error lead to an accumulated phase shift after a certain period and thus degrade system performance if they are not carefully tracked. In OFDM-based UWB systems, pilot subcarriers help to resolve this residual phase distortion in the frequency domain, a step also called fine frequency synchronization.
4.1. Effects of sampling frequency offset
The oscillators that generate the DAC and ADC sampling instants at the transmitter and the receiver never have exactly the same period, so the sampling instants slowly shift relative to each other. The SFO has two main effects: a slow shift of the symbol timing, which rotates the subcarriers; and a loss of SNR due to the ICI generated by the slightly incorrect sampling instants, which destroys the orthogonality of the subcarriers.
Define the normalized sampling error as Δt = (T' - T)/T, where T' and T are the receiver and transmitter sampling periods respectively. The overall effect on the received signal in the frequency domain is then expressed as
where $T_s$ and $T_u$ are the durations of the total symbol and of the useful data respectively, $W_{k,l}$ is additive white Gaussian noise (AWGN) and the last term $N_{\Delta t}(k, l)$ is the additional interference due to the SFO. The power of the last term is approximated by
$$P_{\Delta t} \approx \frac{\pi^2}{3}(k\Delta t)^2 \tag{33}$$
Hence the degradation grows as the square of the product of the offset Δt and the subcarrier index k, which means that the outermost subcarriers are the most severely affected. The degradation can also be expressed directly as an SNR loss (Pollet et al., 1995)
$$D_n \approx 10\log_{10}\left(1 + \frac{\pi^2}{3}\frac{E_s}{N_0}(k\Delta t)^2\right)\ \text{(dB)} \tag{34}$$
An OFDM-based UWB system does not have a large number of subcarriers and the value of Δt is quite small, so kΔt << 1 and the interference caused by SFO can usually be ignored. However, the term describing the rotation angle experienced by the different subcarriers causes a serious problem. Since the rotation angle depends on both the subcarrier index and the symbol index, it is largest for the outermost subcarriers and increases over consecutive symbols. Although Δt is very small, the phase shift grows with the symbol index and eventually corrupts the demodulation. Tracking the SFO is therefore necessary.
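A quick evaluation of (34) illustrates why the SFO-induced interference is usually negligible. The numbers are assumptions for illustration: the outermost subcarrier index k = 55, a sampling offset of 40 ppm (two clocks at ±20 ppm each) and $E_s/N_0$ = 10 dB taken as a linear ratio inside the logarithm.

```python
import math


def sfo_snr_loss_db(k, delta_t, es_n0_db):
    """SNR loss of (34) in dB at subcarrier index k."""
    es_n0 = 10 ** (es_n0_db / 10)  # dB to linear ratio
    return 10 * math.log10(1 + math.pi ** 2 / 3 * es_n0 * (k * delta_t) ** 2)


# Assumed values: outermost subcarrier k = 55, a 40 ppm sampling offset
# and Es/N0 = 10 dB.
loss = sfo_snr_loss_db(55, 40e-6, 10.0)
print(f"{loss:.1e} dB")
```

Under these assumptions the loss is below one thousandth of a dB, so the accumulated phase rotation, not the interference power, is what makes SFO tracking necessary.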
4.2. Phase tracking algorithms
Conventionally, SFO can be estimated by computing a slope from the plot of pilot subcarrier differences versus pilot subcarrier indices (Speth et al., 2001). Recently, joint estimation of CFO and SFO has also been studied extensively, such as the linear least squares (LLS) algorithm (Liu & Chong, 2002) and joint weighted least squares (WLS) algorithm (Tsai et al., 2005).
Auto-correlation
The received signal with residual phase distortion in the frequency domain, after removing the channel noise, can be modeled as
$$Z_{k,l} = S_{k,l} P_{k,l} = S_{k,l}\exp(j\Phi_{k,l}) = S_{k,l}\exp\left(j(\alpha k + \beta_l)\right) \tag{35}$$
where $P_{k,l}$ is the phase distortion vector and $\Phi_{k,l}$ is the residual phase error. The relationship among α, $\beta_l$ and $\Phi_{k,l}$ is shown in Fig. 11. α is the slope of the phase distortion and is contributed by the SFO; $\beta_l$ is the intercept of the phase distortion and is caused by the RCFO of symbol l.
The basic idea of AC is to get the phase differences of pilot subcarriers between two symbols.
Figure 11. The relationship of phase distortion and subcarriers
The pilot subcarriers are divided into two parts, C1 and C2. C1 lies on the left of the spectrum and C2 on the right. The estimated intercept $\beta_l$ and slope α are then written as (Speth et al., 2001)
Applying LLS estimation to (37) with K pilots in one symbol, where the i-th pilot is located at subcarrier $k_i$, the RCFO and SFO estimates are (Liu & Chong, 2002)
Because such an estimation algorithm is based on the phase differences between two symbols, it removes the common channel fading terms in slow-fading scenarios. Consequently, it can be applied before channel estimation and equalization.
Weighted least squares
Although the joint LLS estimation algorithm provides accurate results in the AWGN channel, diverse channel responses on the pilot subcarriers can render its estimates useless. For instance, the phases of several deeply faded pilot subcarriers, when used in the joint LLS estimation, can cause a large error in the result. The phases of subcarriers with little fading, on the other hand, are naturally more reliable. Weighting the subcarrier data is therefore advantageous: severely faded subcarriers should be assigned smaller weights to minimize their adverse effect on estimation accuracy. The WLS algorithm for joint estimation of RCFO and SFO can be expressed as (Tsai et al., 2005)
The weight $\omega_i$ should be inversely proportional to the variance of the phase error, which depends on the noise, the ICI and the complex channel gain. Usually the residual synchronization error is so small that the ICI term can be neglected and $\omega_i$ depends only on the channel gain of the pilot subcarriers. The disadvantage of this algorithm is that it is very complicated, especially the computation of $\omega_i$. If $\omega_i$ is not estimated accurately, phase tracking suffers a large error.
Novel approach for UWB
In traditional phase tracking solutions, arctangent, sine and cosine functions are necessary, and they are quite complicated to implement in hardware. The algorithm presented in (Troya et al., 2007) reduces the hardware cost significantly compared with the traditional approaches, but it sacrifices some system performance. In (Fan & Choy, 2010b), a novel phase tracking method for UWB is proposed that not only has low complexity but also improves performance.
Since the condition |αk| << 1 is satisfied for k ∈ [-55, 55], the first-order approximations cos(αk) ≈ 1 and sin(αk) ≈ αk can be made. The phase distortion in (35) can then be rewritten as
$$P_{k,l} = \cos\beta_l - \alpha k\,\sin\beta_l + j\left(\sin\beta_l + \alpha k\,\cos\beta_l\right) \tag{42}$$
In (42), four parameters are of interest: sin $\beta_l$, cos $\beta_l$, α sin $\beta_l$ and α cos $\beta_l$. The former two can be easily obtained by
where ℜ(·) and ℑ(·) denote the real and imaginary parts respectively. There are 12 pilots in each symbol of an OFDM-based UWB system. Since a factor of 1/8 is much easier to implement than 1/12, and since the pilots near the DC subcarrier suffer more channel noise than those far from it, as many of the outermost pilots as possible should be used.
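The accuracy of the first-order form (42) relative to the exact model (35) can be checked with a few lines. The slope α = 10^-4 rad per subcarrier and the intercept β = 0.2 rad are hypothetical values chosen only to satisfy |αk| << 1 over k ∈ [-55, 55].

```python
import cmath
import math


def exact_distortion(alpha, beta, k):
    """P_{k,l} = exp(j(alpha*k + beta)) as in (35)."""
    return cmath.exp(1j * (alpha * k + beta))


def first_order_distortion(alpha, beta, k):
    """First-order form (42): no trigonometric function of alpha*k is needed."""
    sin_b, cos_b = math.sin(beta), math.cos(beta)
    return complex(cos_b - alpha * k * sin_b, sin_b + alpha * k * cos_b)


alpha, beta = 1e-4, 0.2  # hypothetical slope and intercept in radians
worst = max(
    abs(exact_distortion(alpha, beta, k) - first_order_distortion(alpha, beta, k))
    for k in range(-55, 56)
)
# Second-order Taylor remainder at the outermost subcarrier.
bound = (alpha * 55) ** 2 / 2
print(worst <= bound + 1e-12)  # prints: True
```

Since (42) equals $e^{j\beta_l}(1 + j\alpha k)$, the approximation error is bounded by $(\alpha k)^2/2$, which is largest at the outermost subcarriers.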
Approximating the scaling factor 1/260 by 1/256, which is easily implemented by an 8-bit right shift, the parameters α sin $\beta_l$ and α cos $\beta_l$ are given by
Among the traditional algorithms, LLS and WLS have better phase tracking performance than AC, but their complexity is very high for practical applications. In hardware, AC offers low complexity and moderate phase correction performance. The MSE performance of the novel approach for UWB is therefore compared with that of AC under different phase distortion conditions, as shown in Fig. 12. The novel phase tracking method for UWB clearly performs much better than the traditional AC algorithm. In addition, as the phase error increases, the traditional AC algorithm degrades seriously, whereas the novel method does not.
Figure 12. MSE performance comparison between traditional AC algorithm and the novel approach for UWB
4.3. Architecture of the phase tracking block
The architecture of the phase tracking block with the novel approach for UWB is shown in Fig. 13. The signals after channel equalization are stored separately in a pilot buffer and a data buffer. Since the transmitted pilots are known and have unit modulus, the phase error vector of the pilots can be derived by multiplying the received pilots by the conjugate of the transmitted pilots. As shown in Fig. 13, no arctangent, sine or cosine function appears; they are replaced by eight complex adders and two complex shifters.
Figure 13. Highly simplified architecture of the phase tracking block
The values of the parameters α sin $\beta_l$ and α cos $\beta_l$ are very small, so the phase errors contributed by SFO on four parallel data samples can be treated as approximately the same and rewritten as α⌈k/4⌉ sin $\beta_l$ and α⌈k/4⌉ cos $\beta_l$ (⌈k/4⌉ ∈ [-12, 12]). Calculating 4α sin $\beta_l$ and 4α cos $\beta_l$ instead of α sin $\beta_l$ and α cos $\beta_l$ further simplifies the architecture of the phase tracking block.
This chapter provides a comprehensive review of the algorithms and architectures for timing and frequency synchronization. Although there is an extensive literature on UWB synchronization techniques, most of it does not take real applications or implementation into account. This chapter covers the three parts of the synchronization process.
In timing synchronization, the DT detection scheme improves detection performance significantly thanks to the cascaded auto-correlator. Although it slightly increases the hardware cost, the low-complexity matched filter architecture saves hardware. In coarse frequency synchronization, the CFO estimation approach is simplified by shortening the SWL, and averaging the sum over three subbands compensates for the SNR degradation. Compared with other function evaluation techniques, MTM proves to be a low-cost, low-power and high-speed approach to implementing the arctangent, sine and cosine functions. In fine frequency synchronization, a novel phase tracking approach for UWB is proposed for good performance. In addition, it needs no computation-intensive arctangent, sine or cosine unit; these are replaced by adders and shifters, so the implementation complexity of the novel phase tracking method is low.
These low-complexity and power-efficient synchronization techniques make it possible to develop robust, low-cost, low-power and high-speed OFDM-based UWB receivers.
Written By
Wen Fan and Chiu-Sing Choy
Submitted: October 21st, 2010 Published: July 27th, 2011