Open access

Applications of a Combination of Two Adaptive Filters

Written By

Tonu Trump

Submitted: 10 February 2012 Published: 20 February 2013

DOI: 10.5772/50451

From the Edited Volume

Adaptive Filtering - Theories and Applications

Edited by Lino Garcia Morales


1. Introduction

Designing a Least Mean Square (LMS) family adaptive algorithm involves solving the well-known trade-off between the initial convergence speed and the mean-square error in steady state, according to the requirements of the application at hand. The trade-off is controlled by the step-size parameter of the algorithm. A large step size leads to fast initial convergence, but the algorithm also exhibits a large mean-square error in the steady state; conversely, a small step size slows down the convergence but results in a small steady state error [9,17]. In several applications it is, however, desirable to have both, and hence it would be very useful to be able to design algorithms that can overcome this trade-off.

Variable step size adaptive schemes offer a potential solution, allowing one to achieve both fast initial convergence and low steady state misadjustment [1, 8, 12, 15, 18]. How successful these schemes are depends on how well the algorithm is able to estimate the distance of the adaptive filter weights from the optimal solution. The variable step size algorithms use different criteria for calculating the proper step size at any given time instant. For example, the algorithm proposed in [15] changes the time-varying convergence parameters in such a way that the change is proportional to the negative of the gradient of the squared estimation error with respect to the convergence parameter. Squared instantaneous errors have been used in [12], and the squared autocorrelation of errors at adjacent time instances in [1], to modify the step size. In reference [18] the norm of the projected weight error vector is used as a criterion to determine how close the adaptive filter is to its optimum performance.

More recently there has been interest in a combination scheme that is able to optimize the trade-off between convergence speed and steady state error [14]. The scheme consists of two adaptive filters that are simultaneously applied to the same inputs, as depicted in Figure 1. One of the filters has a large step size allowing fast convergence and the other one has a small step size for a small steady state error. The outputs of the filters are combined through a mixing parameter λ. The performance of this scheme has been studied for some parameter update schemes [2, 6, 19]. The reference [2] uses a convex combination, i.e. λ is constrained to lie between 0 and 1. The reference [19] presents a transient analysis of a slightly modified version of this scheme. In those papers λ is found using an LMS type adaptive scheme and computing the sigmoidal function of the result. The reference [6] takes another approach, computing the mixing parameter using an affine combination. That paper uses the ratio of time averages of the instantaneous errors of the filters; the error function of the ratio is then computed to obtain λ.

In [13] a convex combination of two adaptive filters with different adaptation schemes has been investigated with the aim of improving the steady state characteristics. One of the adaptive filters in that paper uses the LMS algorithm and the other one the Generalized Normalized Gradient Descent algorithm. The combination parameter is computed using stochastic gradient adaptation. In [24] the convex combination of two adaptive filters is applied in a variable filter length scheme to gain improvements in low SNR conditions. In [11] the combination has been used to join two affine projection filters with different regularization parameters. The work [7] uses the combination on parallel binary structured LMS algorithms. These three works use the LMS-like scheme of [5] to compute λ.

It should be noted that schemes involving two filters have been proposed earlier [3, 16]. However, in those early schemes only one of the filters is adaptive, while the other one uses fixed filter weights. Updating of the fixed filter is accomplished by copying all the coefficients from the adaptive filter whenever the adaptive filter performs better than the fixed one.

In this Chapter we compute the mixing parameter from output signals of the individual filters. The way of calculating the mixing parameter is optimal in the sense that it results from minimization of the mean-squared error of the combined filter. The scheme was independently proposed in [21] and [4]. In [23], the output signal based combination was used in adaptive line enhancer and in [22] it was used in the system identification application.

We will investigate three applications of the combination: system identification, adaptive beamforming and adaptive line enhancement. We describe each of the applications in detail and present the corresponding analysis.

We will assume throughout the Chapter that the signals are complex-valued and that the combination scheme uses two LMS adaptive filters. Italic letters will be used for scalars, bold face lower case letters for column vectors and bold face upper case letters for matrices. The superscript T denotes transposition and the superscript H Hermitian transposition of a matrix. The operator E[·] denotes mathematical expectation, Re{·} is the real part of a complex variable and Tr[·] stands for the trace of a matrix.


2. Combination of Two Adaptive Filters

Let us consider two adaptive filters, as shown in Figure 1, each of them updated using the LMS adaptation rule

$$e_i(n) = d(n) - \mathbf{w}_i^H(n-1)\mathbf{x}(n), \qquad (1)$$
$$\mathbf{w}_i(n) = \mathbf{w}_i(n-1) + \mu_i e_i^*(n)\mathbf{x}(n). \qquad (2)$$

In the above $\mathbf{w}_i(n)$ is the $N \times 1$ coefficient vector of the $i$-th adaptive filter, with $i = 1, 2$, and $\mathbf{x}(n)$ is the known $N \times 1$ input vector, common to both adaptive filters. The input process is assumed to be a zero mean wide sense stationary Gaussian process. $\mu_i$ is the step size of the $i$-th adaptive filter. We assume without loss of generality that $\mu_1 > \mu_2$. The case $\mu_1 = \mu_2$ is not interesting, as the two filters then remain equal and the combination reduces to a single filter.

Figure 1.

The combined adaptive filter.

The desired signal in 1 can be expressed as

$$d(n) = \mathbf{w}_o^H \mathbf{x}(n) + \zeta(n), \qquad (3)$$

where the vector $\mathbf{w}_o$ is the optimal Wiener filter coefficient vector for the problem at hand and the process $\zeta(n)$ is the irreducible error, which is statistically independent of all the other signals.

The outputs of the two adaptive filters are combined according to

$$y(n) = \lambda(n) y_1(n) + [1 - \lambda(n)]\, y_2(n), \qquad (4)$$

where $y_i(n) = \mathbf{w}_i^H(n-1)\mathbf{x}(n)$ and the mixing parameter $\lambda(n)$ can be any real number.

We define the a priori system error signal as the difference between the output signal of the optimal Wiener filter at time $n$, given by $y_o(n) = \mathbf{w}_o^H \mathbf{x}(n) = d(n) - \zeta(n)$, and the output signal of our adaptive scheme $y(n)$:

$$e_a(n) = y_o(n) - \lambda(n) y_1(n) - (1 - \lambda(n)) y_2(n). \qquad (5)$$

Let us now find $\lambda(n)$ by minimizing the mean square of the a priori system error. The derivative of $E[|e_a(n)|^2]$ with respect to $\lambda(n)$ reads

$$\frac{\partial E[|e_a(n)|^2]}{\partial \lambda(n)} = 2 E\left[\mathrm{Re}\left\{(y_o(n) - y_2(n))(y_2(n) - y_1(n))^*\right\} + \lambda(n)\left|y_1(n) - y_2(n)\right|^2\right]. \qquad (6)$$

Setting the derivative to zero results in

$$\lambda(n) = \frac{E\left[\mathrm{Re}\left\{(d(n) - y_2(n))(y_1(n) - y_2(n))^*\right\}\right]}{E\left[\left|y_1(n) - y_2(n)\right|^2\right]}, \qquad (7)$$

where we have replaced the Wiener filter output signal $y_o(n)$ by its observable noisy version $d(n)$. Note, however, that because the input signal $\mathbf{x}(n)$ and the irreducible error $\zeta(n)$ are independent random processes, this can be done without introducing any error into our calculations. The denominator of equation (7) comprises the expectation of the squared difference of the two filter output signals. This quantity can be very small or even zero, particularly in the beginning of adaptation if the two step sizes are close to each other. Correspondingly, $\lambda(n)$ computed directly from (7) may become large. To avoid this we add a small regularization constant $\varepsilon$ to the denominator of (7). The constant $\varepsilon$ should be selected small compared to $E[\mathbf{x}^H(n)\mathbf{x}(n)]$ but large enough to prevent division by zero in the given arithmetic.
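To make the scheme concrete, the following is a minimal NumPy sketch of the combined filter. It is our illustration, not code from the chapter: the function name and signature are ours, the expectations in (7) are replaced by the exponential averaging that Section 6 uses for the practical algorithm, and the conjugation conventions assume the complex LMS of (1)-(2).

```python
import numpy as np

def combined_lms(x, d, N=8, mu1=0.05, mu2=0.005, gamma=0.01, eps=1e-6):
    """Combination of two complex LMS filters, following (1)-(7).

    The expectations in (7) are replaced by exponential averages with
    smoothing parameter gamma (cf. (74)); eps is the regularization
    constant added to the denominator of (7).
    """
    w1 = np.zeros(N, dtype=complex)
    w2 = np.zeros(N, dtype=complex)
    num = den = 0.0                       # running averages for (7)
    y = np.zeros(len(d), dtype=complex)
    for n in range(N, len(d)):
        xn = x[n - N + 1:n + 1][::-1]     # regressor vector x(n)
        y1 = w1.conj() @ xn               # y_1(n) = w_1^H(n-1) x(n)
        y2 = w2.conj() @ xn
        e1 = d[n] - y1                    # filter errors, eq. (1)
        e2 = d[n] - y2
        w1 = w1 + mu1 * np.conj(e1) * xn  # complex LMS update, eq. (2)
        w2 = w2 + mu2 * np.conj(e2) * xn
        num = (1 - gamma) * num + gamma * ((d[n] - y2) * np.conj(y1 - y2)).real
        den = (1 - gamma) * den + gamma * abs(y1 - y2) ** 2
        lam = num / (den + eps)           # mixing parameter lambda(n), eq. (7)
        y[n] = lam * y1 + (1 - lam) * y2  # combined output, eq. (4)
    return y, w1, w2
```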


3. System Identification

In several areas it is essential to build a mathematical model of some phenomenon or system. In this class of applications, the adaptive filter can be used to find the best fit of a linear model to an unknown plant. The plant and the adaptive filter are driven by the same known input signal, and the plant output provides the desired signal of the adaptive filter. The plant can be dynamic, in which case we have a time varying model. The system identification configuration is depicted in Figure 2. As before, $x(n)$ is the input signal, $v(n)$ is the measurement noise, $y(n)$ is the adaptive filter output signal and $e(n)$ is the error signal. The desired signal is $d(n) = \mathbf{w}_o^H \mathbf{x}(n) + \zeta(n)$, where $\mathbf{w}_o$ is the vector of Wiener filter coefficients and the irreducible error $\zeta(n)$ consists of the measurement noise $v(n)$ together with the effects of the plant that cannot be explained with a length $N$ linear model. The result of a pure system identification problem is the vector of adaptive filter coefficients.

Figure 2.

The system identification configuration.

The same basic configuration is also used to solve the echo and noise cancellation problems. In echo cancellation the unknown plant is the echo path either electrical or acoustical and the input signal x(n) is the speech signal of one of the parties of telephone conversation. Speech of the other party is contained in the signal v(n). The objective is to cancel the components of the desired signal that are due to the input x(n).

In noise cancellation problems the signal $v(n)$ is the primary microphone signal, containing both the noise and the signal to be cleaned. The input signal $x(n)$ is formed by the reference microphones. The reference signals are supposed to be correlated with the noise in the primary signal but not with the useful signal. The objective here is to suppress the noise in the primary signal $v(n)$ and recover the signal of interest.

In here we are going to use the combination of two adaptive filters described in the previous Section to solve the system identification problem.

3.1 Excess Mean Square Error

In this section we are interested in finding expressions that characterize the transient performance of the combined algorithm, i.e. we intend to derive formulae that predict the entire course of adaptation of the algorithm. Before we can proceed we need, however, to introduce some notation.

First let us denote the weight error vector of i-th filter as

$$\tilde{\mathbf{w}}_i(n) = \mathbf{w}_o - \mathbf{w}_i(n). \qquad (8)$$

Then the equivalent weight error vector of the combined adaptive filter will be

$$\tilde{\mathbf{w}}(n) = \lambda \tilde{\mathbf{w}}_1(n) + (1 - \lambda)\tilde{\mathbf{w}}_2(n). \qquad (9)$$

The mean square deviation of the combined filter, $MSD = E[\tilde{\mathbf{w}}^H(n)\tilde{\mathbf{w}}(n)]$, is given by

$$MSD = \lambda^2 E[\tilde{\mathbf{w}}_1^H(n)\tilde{\mathbf{w}}_1(n)] + 2\lambda(1-\lambda)\,\mathrm{Re}\left\{E[\tilde{\mathbf{w}}_2^H(n)\tilde{\mathbf{w}}_1(n)]\right\} + (1-\lambda)^2 E[\tilde{\mathbf{w}}_2^H(n)\tilde{\mathbf{w}}_2(n)]. \qquad (10)$$

The a priori estimation error of an individual filter is defined as

$$e_{i,a}(n) = \tilde{\mathbf{w}}_i^H(n-1)\mathbf{x}(n). \qquad (11)$$

It follows from (5) that we can express the a priori error of the combination as

$$e_a(n) = \lambda(n) e_{1,a}(n) + (1 - \lambda(n)) e_{2,a}(n) \qquad (12)$$

and because $\lambda(n)$ is according to (7) a ratio of mathematical expectations and, hence, deterministic, we have for the excess mean square error of the combination, $EMSE(n) = E[|e_a(n)|^2]$,

$$E[|e_a(n)|^2] = \lambda^2 E[|e_{1,a}(n)|^2] + 2\lambda(1-\lambda) E\left[\mathrm{Re}\left\{e_{1,a}(n) e_{2,a}^*(n)\right\}\right] + (1-\lambda)^2 E[|e_{2,a}(n)|^2]. \qquad (13)$$

As $e_{i,a}(n) = \tilde{\mathbf{w}}_i^H(n-1)\mathbf{x}(n)$, the expression for the excess mean square error becomes

$$E[|e_a(n)|^2] = \lambda^2 E[\tilde{\mathbf{w}}_1^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_1] + 2\lambda(1-\lambda) E\left[\mathrm{Re}\left\{\tilde{\mathbf{w}}_1^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_2\right\}\right] + (1-\lambda)^2 E[\tilde{\mathbf{w}}_2^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_2]. \qquad (14)$$

In what follows we often drop the explicit time index $n$, as we have done in (14), when it is not necessary to avoid confusion.

Noting that $y_i(n) = \mathbf{w}_i^H(n-1)\mathbf{x}(n)$, we can rewrite the expression for $\lambda(n)$ in (7) as

$$\lambda(n) = \frac{E[\tilde{\mathbf{w}}_2^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_2] - E\left[\mathrm{Re}\left\{\tilde{\mathbf{w}}_2^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_1\right\}\right]}{E[\tilde{\mathbf{w}}_1^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_1] - 2E\left[\mathrm{Re}\left\{\tilde{\mathbf{w}}_1^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_2\right\}\right] + E[\tilde{\mathbf{w}}_2^H \mathbf{x}\mathbf{x}^H \tilde{\mathbf{w}}_2]}. \qquad (15)$$

We thus need to investigate the evolution of the individual terms of the type $EMSE_{k,l} = E[\tilde{\mathbf{w}}_k^H(n-1)\mathbf{x}(n)\mathbf{x}^H(n)\tilde{\mathbf{w}}_l(n-1)]$ in order to reveal the time evolution of $EMSE(n)$ and $\lambda(n)$. To do so we, however, concentrate first on the mean square deviation defined in (10).

Reformulating relation (1) as

$$e_i(n) = d(n) - \mathbf{w}_i^H(n-1)\mathbf{x}(n) = e_o(n) + \tilde{\mathbf{w}}_i^H(n-1)\mathbf{x}(n), \qquad (16)$$

where $e_o(n) = \zeta(n)$ is the irreducible error, and subtracting (2) from $\mathbf{w}_o$ we have

$$\tilde{\mathbf{w}}_i(n) = \left(\mathbf{I} - \mu_i \mathbf{x}\mathbf{x}^H\right)\tilde{\mathbf{w}}_i(n-1) - \mu_i \mathbf{x} e_o^*(n). \qquad (17)$$

We next approximate the outer product of input signal vectors by its correlation matrix, $\mathbf{x}\mathbf{x}^H \approx \mathbf{R}_x$. The approximation is justified by the fact that with a small step size the weight error update of the LMS algorithm (17) behaves like a low pass filter with a low cutoff frequency. With this approximation we have

$$\tilde{\mathbf{w}}_i(n) \approx \left(\mathbf{I} - \mu_i \mathbf{R}_x\right)\tilde{\mathbf{w}}_i(n-1) - \mu_i \mathbf{x} e_o^*(n). \qquad (18)$$

This means in fact that we apply the small step size theory [9] even if the assumption of small step size is not really true for the fast adapting filter. In our simulation study we will see, however, that the assumption works in practice rather well.

Let us now define the eigendecomposition of the correlation matrix as

$$\mathbf{Q}^H \mathbf{R}_x \mathbf{Q} = \mathbf{\Omega}, \qquad (19)$$

where $\mathbf{Q}$ is a unitary matrix whose columns are the orthonormal eigenvectors of $\mathbf{R}_x$ and $\mathbf{\Omega}$ is a diagonal matrix having the eigenvalues associated with the corresponding eigenvectors on its main diagonal. We also define the transformed weight error vector as

$$\mathbf{v}_i(n) = \mathbf{Q}^H \tilde{\mathbf{w}}_i(n) \qquad (20)$$

and the transformed last term of equation (18) as

$$\mathbf{p}_i(n) = \mu_i \mathbf{Q}^H \mathbf{x} e_o^*(n). \qquad (21)$$

Then we can rewrite equation (18), after multiplying both sides by $\mathbf{Q}^H$ from the left, as

$$\mathbf{v}_i(n) = \left(\mathbf{I} - \mu_i \mathbf{\Omega}\right)\mathbf{v}_i(n-1) - \mathbf{p}_i(n). \qquad (22)$$

We note that the mean of $\mathbf{p}_i$ is zero by the orthogonality theorem, and the crosscorrelation matrix of $\mathbf{p}_k$ and $\mathbf{p}_l$ equals

$$E[\mathbf{p}_k \mathbf{p}_l^H] = \mu_k \mu_l \mathbf{Q}^H E[\mathbf{x} e_o^*(n) e_o(n) \mathbf{x}^H] \mathbf{Q}. \qquad (23)$$

We now invoke the Gaussian moment factoring theorem to write

$$E[\mathbf{x} e_o^*(n) e_o(n) \mathbf{x}^H] = E[\mathbf{x} e_o^*(n)]\, E[e_o(n) \mathbf{x}^H] + E[\mathbf{x}\mathbf{x}^H]\, E[|e_o|^2]. \qquad (24)$$

The first term in the above is zero due to the principle of orthogonality, and the second term equals $\mathbf{R}_x J_{min}$, where $J_{min} = E[|e_o|^2]$ is the minimum mean square error produced by the corresponding Wiener filter. Hence we are left with

$$E[\mathbf{p}_k \mathbf{p}_l^H] = \mu_k \mu_l J_{min} \mathbf{\Omega}. \qquad (25)$$

As the matrices I and Ω in (22) are both diagonal, it follows that the m-th element of vector vi(n) is given by

$$v_{i,m}(n) = (1 - \mu_i \omega_m)\, v_{i,m}(n-1) - p_{i,m}(n) = (1 - \mu_i \omega_m)^n v_m(0) - \sum_{j=0}^{n-1} (1 - \mu_i \omega_m)^{n-1-j}\, p_{i,m}(j), \qquad (26)$$

where $\omega_m$ is the $m$-th eigenvalue of $\mathbf{R}_x$ and $v_{i,m}$ and $p_{i,m}$ are the $m$-th components of the vectors $\mathbf{v}_i$ and $\mathbf{p}_i$ respectively.

We immediately see that the mean value of vi,m(n) equals

$$E[v_{i,m}(n)] = (1 - \mu_i \omega_m)^n v_m(0) \qquad (27)$$

as the vector pi has zero mean.

To proceed with our development for the combination of two LMS filters we note that we can express the MSD and its individual components in (10) through the transformed weight error vectors as

$$E[\tilde{\mathbf{w}}_k^H(n)\tilde{\mathbf{w}}_l(n)] = E[\mathbf{v}_k^H(n)\mathbf{v}_l(n)] = \sum_{m=0}^{N-1} E[v_{k,m}^*(n)\, v_{l,m}(n)], \qquad (28)$$

so we also need to find the auto- and cross correlations of v.

Let us concentrate on the $m$-th component in the sum above corresponding to the cross term; the expressions for the component filters follow as special cases. Substituting (26) into the expression for the $m$-th component of MSD above, taking the mathematical expectation and noting that the vector $\mathbf{p}$ is independent of $\mathbf{v}(0)$ results in

$$E[v_{k,m}^*(n) v_{l,m}(n)] = (1 - \mu_k \omega_m)^n (1 - \mu_l \omega_m)^n E[v_{k,m}^*(0) v_{l,m}(0)] + \sum_{i=0}^{n-1}\sum_{j=0}^{n-1} (1 - \mu_k \omega_m)^{n-1-i} (1 - \mu_l \omega_m)^{n-1-j} E[p_{k,m}^*(i)\, p_{l,m}(j)]. \qquad (29)$$

We now note that the two component filters are most likely initialized to the same value, $v_{k,m}(0) = v_{l,m}(0) = v_m(0)$, and that

$$E[p_{k,m}^*(i)\, p_{l,m}(j)] = \begin{cases} \mu_k \mu_l \omega_m J_{min}, & i = j \\ 0, & \text{otherwise}. \end{cases} \qquad (30)$$

We then have for the m-th component of MSD

$$E[v_{k,m}^*(n) v_{l,m}(n)] = (1 - \mu_k \omega_m)^n (1 - \mu_l \omega_m)^n |v_m(0)|^2 + \mu_k \mu_l \omega_m J_{min} (1 - \mu_k \omega_m)^{n-1} (1 - \mu_l \omega_m)^{n-1} \sum_{i=0}^{n-1} (1 - \mu_k \omega_m)^{-i} (1 - \mu_l \omega_m)^{-i}. \qquad (31)$$

The sum over $i$ in the above equation can be recognized as a geometric series with $n$ terms. The first term is equal to 1 and the geometric ratio equals $(1 - \mu_k \omega_m)^{-1}(1 - \mu_l \omega_m)^{-1}$. Hence we have

$$\sum_{i=0}^{n-1} (1 - \mu_k \omega_m)^{-i} (1 - \mu_l \omega_m)^{-i} = \frac{1 - (1 - \mu_k \omega_m)^{-n}(1 - \mu_l \omega_m)^{-n}}{1 - (1 - \mu_k \omega_m)^{-1}(1 - \mu_l \omega_m)^{-1}} = \frac{(1 - \mu_k \omega_m)(1 - \mu_l \omega_m) - (1 - \mu_k \omega_m)^{-n+1}(1 - \mu_l \omega_m)^{-n+1}}{\mu_k \mu_l \omega_m^2 - \mu_k \omega_m - \mu_l \omega_m}. \qquad (32)$$

After substitution of the above into (31) and simplification we are left with

$$E[v_{k,m}^*(n) v_{l,m}(n)] = (1 - \mu_k \omega_m)^n (1 - \mu_l \omega_m)^n |v_m(0)|^2 + \frac{\mu_k \mu_l J_{min}}{\mu_k + \mu_l - \mu_k \mu_l \omega_m}\left[1 - (1 - \mu_k \omega_m)^n (1 - \mu_l \omega_m)^n\right], \qquad (33)$$

which is our result for a single entry of the MSD crossterm vector. It is easy to see that for the terms involving a single filter we get expressions that coincide with the ones available in the literature [9].

Let us now focus on the cross term

$$EMSE_{kl} = E\left[\tilde{\mathbf{w}}_k^H(n-1)\mathbf{x}(n)\mathbf{x}^H(n)\tilde{\mathbf{w}}_l(n-1)\right]$$

appearing in the EMSE equation (14). Due to the independence assumption we can rewrite this, using the properties of the trace operator, as

$$EMSE_{kl} = E\left[\tilde{\mathbf{w}}_k^H(n-1)\mathbf{R}_x\tilde{\mathbf{w}}_l(n-1)\right] = \mathrm{Tr}\left\{\mathbf{R}_x\, E\left[\tilde{\mathbf{w}}_l(n-1)\tilde{\mathbf{w}}_k^H(n-1)\right]\right\}. \qquad (34)$$

Let us now recall that according to (20), for any of the filters, $\tilde{\mathbf{w}}_i(n) = \mathbf{Q}\mathbf{v}_i(n)$, so that we are justified to write

$$EMSE_{kl} = \mathrm{Tr}\left\{\mathbf{R}_x\, E\left[\mathbf{Q}\mathbf{v}_l(n-1)\mathbf{v}_k^H(n-1)\mathbf{Q}^H\right]\right\} = E\left[\mathbf{v}_k^H(n-1)\mathbf{Q}^H\mathbf{R}_x\mathbf{Q}\,\mathbf{v}_l(n-1)\right] = E\left[\mathbf{v}_k^H(n-1)\mathbf{\Omega}\,\mathbf{v}_l(n-1)\right] = \sum_{i=0}^{N-1} \omega_i\, E[v_{k,i}^*(n-1)\, v_{l,i}(n-1)]. \qquad (35)$$

The EMSE of the combined filter can now be computed as

$$EMSE = \sum_{i=0}^{N-1} \omega_i\, E\left[\left|\lambda(n)\, v_{1,i}(n-1) + (1 - \lambda(n))\, v_{2,i}(n-1)\right|^2\right], \qquad (36)$$

where the components of the type $E[v_{k,i}^*(n-1)\, v_{l,i}(n-1)]$ are given by (33). To compute $\lambda(n)$ we use (15), substituting (35) for its individual components.
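The transient expressions above lend themselves to direct numerical evaluation. Below is our own sketch of that evaluation (the function name and arguments are assumptions, and both filters are taken to start from zero weights, so that $\mathbf{v}(0) = \mathbf{Q}^H\mathbf{w}_o$):

```python
import numpy as np

def theoretical_emse(Rx, w_o, mu1, mu2, Jmin, n_iter):
    """Evaluate EMSE(n) and lambda(n) of the combination from (33), (35),
    (15) and (13), assuming w_1(0) = w_2(0) = 0 so that v(0) = Q^H w_o."""
    omega, Q = np.linalg.eigh(Rx)        # eigendecomposition, eq. (19)
    v0 = Q.conj().T @ w_o                # transformed initial weight error, eq. (20)
    mus = (mu1, mu2)
    emse = np.zeros(n_iter)
    lam = np.zeros(n_iter)
    for n in range(1, n_iter):
        E = np.zeros((2, 2))             # the four EMSE_{k,l} terms of (35)
        for k in range(2):
            for l in range(2):
                decay = (1 - mus[k] * omega) ** n * (1 - mus[l] * omega) ** n
                ss = mus[k] * mus[l] * Jmin / (
                    mus[k] + mus[l] - mus[k] * mus[l] * omega)
                vv = decay * np.abs(v0) ** 2 + ss * (1 - decay)   # eq. (33)
                E[k, l] = np.sum(omega * vv)                      # eq. (35)
        lam[n] = (E[1, 1] - E[0, 1]) / (E[0, 0] - 2 * E[0, 1] + E[1, 1])  # eq. (15)
        emse[n] = (lam[n] ** 2 * E[0, 0]
                   + 2 * lam[n] * (1 - lam[n]) * E[0, 1]
                   + (1 - lam[n]) ** 2 * E[1, 1])                 # eq. (13)
    return emse, lam
```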


4. Adaptive Sensor Array

In this Section we describe how to use the combination of two adaptive filters in an adaptive beamformer. The beamformer we employ here is often termed the Generalized Sidelobe Canceller [9].

Let $\varphi$ denote the angle of incidence of a planar wave impinging on a linear sensor array, measured with respect to the normal to the array. The electrical angle $\theta$ is related to the incidence angle as

$$\theta = \frac{2\pi\delta}{\lambda}\sin\varphi, \qquad (37)$$

where $\lambda$ is the wavelength of the incident wave and $\delta$ is the spacing between adjacent sensors of the linear array.

Suppose that the signal impinging on the array of $M = N + 1$ sensors is given by

$$\mathbf{u}(n) = \mathbf{A}(\Theta)\mathbf{s}(n) + \mathbf{v}(n), \qquad (38)$$

where $\mathbf{s}(n)$ is the vector of emitter signals, $\Theta$ is a collection of directions of arrival, $\mathbf{A}(\Theta)$ is the array steering matrix with its columns $\mathbf{a}(\theta)$ defined as responses toward the individual sources $\mathbf{s}(n)$, and $\mathbf{v}(n)$ is a vector of additive circularly symmetric Gaussian noise. The $M$ vectors

$$\mathbf{a}(\theta) = \left[1, e^{-j\theta}, \ldots, e^{-j(M-1)\theta}\right]^T \qquad (39)$$

are called the steering vectors of the respective sources. We assume that the source of interest is located at the electrical angle $\theta_0$.

The block diagram of the Generalized Sidelobe Canceller is shown in Figure 3. The structure consists of two branches. The upper branch is the steering branch, which directs its beam toward the desired source. The lower branch is the blocking branch, which blocks the signals impinging on the array from the direction of the desired source, and includes an adaptive algorithm that minimizes the mean square error between the output signals of the branches.

The weights in the steering branch, $\mathbf{w}_s$, are selected from the condition

$$\mathbf{w}_s^H \mathbf{a}(\theta_0) = g, \qquad (40)$$

i.e. we require the response in the direction of the source of interest $\theta_0$ to equal a constant $g$. Common choices for $g$ are $g = M$ and $g = 1$. Here we have used $g = M$.

The signal at the output of the upper branch is given by

$$d(n) = \mathbf{w}_s^H \mathbf{u}(n). \qquad (41)$$

In the lower branch we have a blocking matrix that blocks any signal coming from the direction $\theta_0$. The columns of the $M \times (M-1)$ blocking matrix $\mathbf{C}_b$ are defined as the orthogonal complement of the steering vector $\mathbf{a}(\theta_0)$ of the upper branch:

$$\mathbf{a}^H(\theta_0)\,\mathbf{C}_b = \mathbf{0}. \qquad (42)$$

The vector valued signal $\mathbf{x}(n)$ at the output of the blocking matrix is formed as

$$\mathbf{x}(n) = \mathbf{C}_b^H \mathbf{u}(n). \qquad (43)$$

Figure 3.

Block diagram of the generalized sidelobe canceller.

The output of the algorithm is

$$e(n) = d(n) - \mathbf{w}_b^H(n)\mathbf{x}(n). \qquad (44)$$

The signals x(n) and d(n) can be used as the input and desired signals respectively in an adaptive algorithm to select the blocking weights wb . In this Chapter we use the combination of two adaptive filters that gives us fast initial convergence and low steady state misadjustment at the same time.
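A small sketch of how the two GSC branches can be constructed (our illustration; the function name is ours, and building the blocking matrix from the SVD null space of $\mathbf{a}(\theta_0)$ is just one way to satisfy (42)):

```python
import numpy as np

def gsc_branches(theta0, M=8):
    """Steering and blocking branches of the GSC, following (39)-(43)."""
    a0 = np.exp(-1j * theta0 * np.arange(M))  # steering vector a(theta_0), eq. (39)
    g = M                                     # response constraint, eq. (40)
    ws = (g / M) * a0                         # satisfies ws^H a0 = g, since a0^H a0 = M
    # Orthogonal complement of a0: the last M-1 left singular vectors
    U, _, _ = np.linalg.svd(a0.reshape(-1, 1))
    Cb = U[:, 1:]                             # M x (M-1) with a0^H Cb = 0, eq. (42)
    return ws, Cb

# For an array snapshot u(n), the two branch signals are then
#   d = ws.conj() @ u        # steering branch output, eq. (41)
#   x = Cb.conj().T @ u      # blocking branch output, eq. (43)
# and (d, x) drive the combined adaptive filter exactly as in Section 2.
```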

4.1 Signal to Interference and Noise Ratio

The EMSE of the adaptive algorithm can be analysed as it is done in Section 3.1. In this application we are also interested in signal to interference and noise ratio (SINR) at the array output. To evaluate this we first note that the power that signal of interest generates at the array output is according to (40)

$$P_s = \mathbf{w}_s^H \mathbf{a}(\theta_0)\, \sigma_{s_0}^2\, \mathbf{a}^H(\theta_0)\,\mathbf{w}_s = |g|^2 \sigma_{s_0}^2, \qquad (45)$$

where $\sigma_{s_0}^2$ is the variance of the useful signal arriving from the angle $\theta_0$.

To find the interference and noise power we first define the reduced signal vector s ˘ and a reduced DOA collection Θ ˘ where we have left out the signal of interest and the steering vector corresponding to the useful signal but kept all the interferers and the interference steering vectors. The corresponding array steering matrix is A ˘ ( Θ ˘ ) .

The correlation matrix of interference and noise in the signal x(n), which is the input signal to our adaptive scheme, is then given by

$$\breve{\mathbf{R}}_x = \mathbf{C}_b^H \breve{\mathbf{A}}(\breve{\Theta})\, E[\breve{\mathbf{s}}\breve{\mathbf{s}}^H]\, \breve{\mathbf{A}}^H(\breve{\Theta})\,\mathbf{C}_b + \mathbf{C}_b^H \mathbf{C}_b\, \sigma_v^2, \qquad (46)$$

where $\sigma_v^2$ is the noise variance; the first component of the sum is due to the interfering sources and the second component is due to the noise.

It follows from the standard Wiener filtering theory that the minimum interference and noise power at the array output is given by

$$\breve{J}_{min} = \sigma_{int,v}^2 - \breve{\mathbf{p}}^H \breve{\mathbf{R}}_x^{-1} \breve{\mathbf{p}}, \qquad (47)$$

where the desired signal variance excluding the signal from the source of interest is

$$\sigma_{int,v}^2 = \mathbf{w}_s^H \breve{\mathbf{A}}(\breve{\Theta})\, E[\breve{\mathbf{s}}\breve{\mathbf{s}}^H]\, \breve{\mathbf{A}}^H(\breve{\Theta})\,\mathbf{w}_s + \sigma_v^2\, \mathbf{w}_s^H \mathbf{w}_s \qquad (48)$$

and the crosscorrelation vector between the adaptive filter input signal and desired signal excluding the signal from source of interest is

$$\breve{\mathbf{p}} = \mathbf{C}_b^H \breve{\mathbf{A}}(\breve{\Theta})\, E[\breve{\mathbf{s}}\breve{\mathbf{s}}^H]\, \breve{\mathbf{A}}^H(\breve{\Theta})\,\mathbf{w}_s + \sigma_v^2\, \mathbf{C}_b^H \mathbf{w}_s. \qquad (49)$$

We can now find the eigendecomposition of $\breve{\mathbf{R}}_x$ and use the resulting eigenvalues in (35) and (36) to find the excess mean square error due to interference and noise only, $EMSE_{int,v}$. The error power can be computed as the minimum interference and noise power at the array output plus the excess mean square error due to interference and noise only,

$$P_{v,int}(n) = \breve{J}_{min} + EMSE_{int,v}(n), \qquad (50)$$

and the signal to interference and noise ratio is thus given by

$$SINR(n) = \frac{P_s}{P_{v,int}(n)}. \qquad (51)$$
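Putting (45)-(51) together, the SINR curve can be evaluated numerically. The sketch below is our illustration, reusing the hypothetical theoretical_emse() helper from Section 3, with the interference-plus-noise quantities (46)-(49) supplied as inputs:

```python
import numpy as np

def sinr_curve(Rx_int, p_int, sigma2_int_v, Ps, mu1, mu2, n_iter):
    """SINR(n) = Ps / (J_min + EMSE_int,v(n)), eqs. (47), (50), (51).

    Rx_int is the matrix of (46), p_int the vector of (49) and
    sigma2_int_v the variance of (48)."""
    w_o = np.linalg.solve(Rx_int, p_int)               # Wiener solution of the blocking branch
    Jmin = sigma2_int_v - np.real(p_int.conj() @ w_o)  # eq. (47)
    emse, _ = theoretical_emse(Rx_int, w_o, mu1, mu2, Jmin, n_iter)
    return Ps / (Jmin + emse)                          # eqs. (50)-(51)
```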


5. Adaptive Line Enhancer

The adaptive line enhancer is a device that is able to separate its input into two components. One of them consists mostly of the narrow-band signals that are present at the input and the other one consists mostly of the broadband noise. In the context of this Chapter a signal is considered to be narrow-band if its bandwidth is small compared to the sampling frequency of the system.

Figure 4.

The adaptive line enhancer.

We assume that the broadband noise is zero mean, white and Gaussian, and that the narrow band component is centred. One is usually interested in the narrow band components, and the device is often used to clean narrow band signals from noise before any further processing. The line enhancer is shown in Figure 4. Note that the input signal to the adaptive filter of the line enhancer is delayed by Δ sample times, so the input vector is x(n−Δ). The desired signal is d(n) = x(n). The line enhancer is thus in fact a Δ-step predictor. The device is able to predict the narrow band components, which have long correlation times, but it cannot predict the white noise; hence only a prediction of the narrow band components appears in the filter output signal y(n). The signal y(n) is also the output of the system.
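The wiring just described amounts to feeding the combined filter of Section 2 the delayed input, with the undelayed input as desired signal. A sketch with the signal parameters of Section 6.3 (our illustration, reusing the hypothetical combined_lms() helper):

```python
import numpy as np

rng = np.random.default_rng(0)
n = np.arange(20000)
# three narrow-band components plus unit variance white noise, as in Section 6.3
s = (np.sin(2 * np.pi * 0.1 * n) + 0.5 * np.sin(2 * np.pi * 0.25 * n)
     + np.sin(2 * np.pi * 0.4 * n))
x = s + rng.standard_normal(n.size)

Delta, N = 10, 16
d = x[Delta:]                 # desired signal d(n) = x(n)
x_del = x[:-Delta]            # delayed filter input x(n - Delta)
y, w1, w2 = combined_lms(x_del, d, N=N, mu1=0.5, mu2=0.05)
# y(n) now contains mostly the predictable narrow-band components;
# the unpredictable white noise stays in the error signal.
```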

Let us now find the autocorrelation function of the enhancer output signal y(n). We make the standard assumption from independence theory which states that the filter weights and the input signal are independent [9].

The $l$-th autocorrelation lag of the filter output process, $r(l) = E[y(n)\, y^*(n+l)]$, equals

$$r(l) = E\left[\mathbf{w}^H(n-1)\,\mathbf{x}(n-\Delta)\,\mathbf{x}^H(n-\Delta+l)\,\mathbf{w}(n-1+l)\right]. \qquad (52)$$

The input signal x(n) consists of two uncorrelated components s(n), the sum of narrow band signals, and v(n), the additive noise

$$\mathbf{x}(n) = \mathbf{s}(n) + \mathbf{v}(n). \qquad (53)$$

We can decompose the impulse response of the adaptive filter into two components. One of them is the optimal Wiener filter for the problem

$$\mathbf{w}_o = E\left[\mathbf{x}(n-\Delta)\,\mathbf{x}^H(n-\Delta)\right]^{-1} E\left[\mathbf{x}(n-\Delta)\, x^*(n)\right] \qquad (54)$$

and the other one, $\tilde{\mathbf{w}}(n)$, represents the estimation errors:

$$\tilde{\mathbf{w}}(n) = \mathbf{w}_o - \mathbf{w}(n). \qquad (55)$$

The output signal can hence be expressed as

$$y(n) = y_o(n) - \tilde{y}(n). \qquad (56)$$

Substituting (53) and (55) into (52) and noticing that the cross-correlation between the Wiener filter output and that of the filter defined by the weight errors is

$$E[y_o(n)\,\tilde{y}^*(n+l)] = E\left[\mathbf{w}_o^H\,\mathbf{x}(n-\Delta)\,\mathbf{x}^H(n-\Delta+l)\,\tilde{\mathbf{w}}(n-1+l)\right] = 0 \qquad (57)$$

because of the adopted independence assumption and because $E[\tilde{\mathbf{w}}(n-1)] = E[\mathbf{w}_o - \mathbf{w}(n-1)] = \mathbf{0}$, we have

$$r(l) = E\left[\mathbf{w}_o^H\{\mathbf{s}(n-\Delta)+\mathbf{v}(n-\Delta)\}\{\mathbf{s}^H(n-\Delta+l)+\mathbf{v}^H(n-\Delta+l)\}\mathbf{w}_o\right] + E\left[\tilde{\mathbf{w}}^H(n-1)\{\mathbf{s}(n-\Delta)+\mathbf{v}(n-\Delta)\}\{\mathbf{s}^H(n-\Delta+l)+\mathbf{v}^H(n-\Delta+l)\}\tilde{\mathbf{w}}(n-1+l)\right]. \qquad (58)$$

Developing and grouping terms in the above equation results in

$$r(l) = E\left[\mathbf{w}_o^H\,\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\,\mathbf{w}_o\right] + E\left[\mathbf{w}_o^H\,\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\,\mathbf{w}_o\right] + E\left[\tilde{\mathbf{w}}^H(n-1)\,\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\,\tilde{\mathbf{w}}(n-1+l)\right] + E\left[\tilde{\mathbf{w}}^H(n-1)\,\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\,\tilde{\mathbf{w}}(n-1+l)\right]. \qquad (59)$$

Using the fact that wo is deterministic and the properties of the trace operator we further obtain

$$r(l) = \mathbf{w}_o^H E\left[\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right]\mathbf{w}_o + \mathbf{w}_o^H E\left[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right]\mathbf{w}_o + E\left[\mathrm{Tr}\left\{\tilde{\mathbf{w}}(n-1+l)\tilde{\mathbf{w}}^H(n-1)\,\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right\}\right] + E\left[\mathrm{Tr}\left\{\tilde{\mathbf{w}}(n-1+l)\tilde{\mathbf{w}}^H(n-1)\,\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right\}\right]. \qquad (60)$$

We now invoke the independence assumption saying that the weight error vector $\tilde{\mathbf{w}}(n-1)$ is independent of the signals $\mathbf{s}(n-\Delta)$ and $\mathbf{v}(n-\Delta)$. This leads us to

$$r(l) = \mathbf{w}_o^H E\left[\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right]\mathbf{w}_o + \mathbf{w}_o^H E\left[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right]\mathbf{w}_o + \mathrm{Tr}\left\{E\left[\tilde{\mathbf{w}}(n-1+l)\tilde{\mathbf{w}}^H(n-1)\right] E\left[\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right]\right\} + \mathrm{Tr}\left\{E\left[\tilde{\mathbf{w}}(n-1+l)\tilde{\mathbf{w}}^H(n-1)\right] E\left[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right]\right\}. \qquad (61)$$

To proceed we need to find the matrix $\mathbf{K}(l) = E[\tilde{\mathbf{w}}(n-1+l)\tilde{\mathbf{w}}^H(n-1)]$.

5.1 Weight error correlation matrix

In this Section we investigate the combination of two adaptive filters and derive the expressions for the crosscorrelation matrix between the output signals of the individual filters yi(n) and yk(n). The autocorrelation matrices of the individual filter output signals follow directly using only one signal in the formulae.

For the problem at hand we can rewrite equation (18), noting that we have introduced a delay of $\Delta$ samples in the signal path, as

$$\tilde{\mathbf{w}}_i(n) \approx \left(\mathbf{I} - \mu_i \mathbf{R}_x\right)\tilde{\mathbf{w}}_i(n-1) - \mu_i\, \mathbf{x}(n-\Delta)\, e_o^*(n). \qquad (62)$$

For the weight error correlation matrix we then have

 

$$\begin{aligned} \mathbf{K}_{i,k,l}(n) &= E\left[\tilde{\mathbf{w}}_i(n+l)\tilde{\mathbf{w}}_k^H(n)\right] \\ &= E\left[(\mathbf{I} - \mu_i \mathbf{R}_x)\,\tilde{\mathbf{w}}_i(n+l-1)\tilde{\mathbf{w}}_k^H(n-1)\,(\mathbf{I} - \mu_k \mathbf{R}_x)\right] \\ &\quad - E\left[(\mathbf{I} - \mu_i \mathbf{R}_x)\,\tilde{\mathbf{w}}_i(n+l-1)\, \mu_k\, e_o(n)\,\mathbf{x}^H(n-\Delta)\right] \\ &\quad - E\left[\mu_i\, \mathbf{x}(n-\Delta)\, e_o^*(n)\, \tilde{\mathbf{w}}_k^H(n-1)\,(\mathbf{I} - \mu_k \mathbf{R}_x)\right] \\ &\quad + E\left[\mu_i \mu_k\, \mathbf{x}(n-\Delta)\, e_o^*(n) e_o(n)\, \mathbf{x}^H(n-\Delta)\right]. \end{aligned}$$

The second and third terms of the above equal zero because we have made the usual independence theory assumptions, which state that the weight errors $\tilde{\mathbf{w}}_i(n)$ are independent of the input signal $\mathbf{x}(n-\Delta)$. To evaluate the last term we assume that the adaptive filters are long enough to remove all correlation between $e_o(n)$ and $\mathbf{x}(n-\Delta)$. In this case we can rewrite the above as

$$\mathbf{K}_{i,k,l}(n) = \left(\mathbf{I} - \mu_i \mathbf{R}_x\right)\mathbf{K}_{i,k,l}(n-1)\left(\mathbf{I} - \mu_k \mathbf{R}_x\right) + \mu_i \mu_k J_{min} \mathbf{R}_x, \qquad (63)$$

where $J_{min} = E[|e_o|^2]$ is the minimum mean square error produced by the corresponding Wiener filter.

We now assume that the signal to noise ratio is low, so that the input signal is dominated by the white noise process $\mathbf{v}(n)$. In this case we can approximate the correlation matrix of the input process by a scaled identity matrix,

$$\mathbf{R}_x \approx \sigma_v^2 \mathbf{I}, \qquad (64)$$

where σ v 2 is the noise variance. Later in the simulation study we will see that the theory developed this way actually works well with quite moderate signal to noise ratios. Then substituting (64) into (63) yields

$$\mathbf{K}_{i,k,l}(n) = \left(1 - \mu_i \sigma_v^2\right)\left(1 - \mu_k \sigma_v^2\right)\mathbf{K}_{i,k,l}(n-1) + \mu_i \mu_k J_{min} \sigma_v^2\, \mathbf{I}. \qquad (65)$$

In steady state, when $n \to \infty$, we have

$$\mathbf{K}_{i,k,l}(\infty) = \left(1 - \mu_i \sigma_v^2\right)\left(1 - \mu_k \sigma_v^2\right)\mathbf{K}_{i,k,l}(\infty) + \mu_i \mu_k J_{min} \sigma_v^2\, \mathbf{I}. \qquad (66)$$

Solving the above for $\mathbf{K}_{i,k,l}(\infty)$ we have

$$\mathbf{K}_{i,k,l}(\infty) = \frac{\mu_i \mu_k J_{min}}{\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2}\, \mathbf{I}. \qquad (67)$$
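For the record, here is the algebra behind the step from (66) to (67), which the text performs implicitly (our expansion): collecting $\mathbf{K}_{i,k,l}(\infty)$ on the left of (66) gives

$$\left[1 - (1 - \mu_i \sigma_v^2)(1 - \mu_k \sigma_v^2)\right]\mathbf{K}_{i,k,l}(\infty) = \mu_i \mu_k J_{min} \sigma_v^2\, \mathbf{I}, \qquad 1 - (1 - \mu_i \sigma_v^2)(1 - \mu_k \sigma_v^2) = \sigma_v^2\left(\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2\right),$$

and dividing both sides by $\sigma_v^2\left(\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2\right)$ yields (67).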

5.2 Second order statistics of the line enhancer output signal

As we see from the previous discussion, the correlation matrix of the weight error vector is diagonal. We therefore have that the matrix $\mathbf{K}_{i,k}(l) = E[\tilde{\mathbf{w}}_i(n-1+l)\tilde{\mathbf{w}}_k^H(n-1)]$ has in steady state, when $n \to \infty$, elements different from zero only along the main diagonal, and the elements on this diagonal equal $\frac{\mu_i \mu_k J_{min}}{\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2}$. Substituting $\mathbf{K}_{i,k}(l)$ into (61), we now have that the $l$-th correlation lag of the output signal equals

$$r_{i,k}(l) = \mathbf{w}_o^H E\left[\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right]\mathbf{w}_o + \mathbf{w}_o^H E\left[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right]\mathbf{w}_o + r_s(l)\, N\, \frac{\mu_i \mu_k J_{min}}{\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2} + \mathrm{Tr}\left\{\mathbf{K}_{i,k}(l)\, E\left[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)\right]\right\}, \qquad (68)$$

where rs(l) is the l-th autocorrelation lag of the input signal s(n).

As the noise $\mathbf{v}$ has been assumed to be white, the matrix $E[\mathbf{v}(n-\Delta)\mathbf{v}^H(n-\Delta+l)]$ has nonzero elements $\sigma_v^2$ only along the $l$-th diagonal, and the rest of the matrix is filled with zeroes. Then

$$r_{i,k}(l) = \mathbf{w}_o^H E\left[\mathbf{s}(n-\Delta)\mathbf{s}^H(n-\Delta+l)\right]\mathbf{w}_o + \sigma_v^2 \sum_{m=0}^{N-l-1} w_o^*(m)\, w_o(m+l) + r_s(l)\, N\, \frac{\mu_i \mu_k J_{min}}{\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2} + r_0, \qquad (69)$$

where $r_0 = N \sigma_v^2\, \frac{\mu_i \mu_k J_{min}}{\mu_i + \mu_k - \mu_i \mu_k \sigma_v^2}$ if $l = 0$ and zero otherwise.

From (4) we see that the autocorrelation lags of the combination output signal y(n) can be composed from its components ri,k(l) as follows

$$r(l) = \lambda^2(n)\, E[y_1(n) y_1^*(n+l)] + 2\lambda(n)(1-\lambda(n))\, E\left[\mathrm{Re}\left\{y_1(n) y_2^*(n+l)\right\}\right] + (1-\lambda(n))^2\, E[y_2(n) y_2^*(n+l)] = \lambda^2(n)\, r_{1,1}(l) + 2\lambda(n)(1-\lambda(n))\, \mathrm{Re}\left\{r_{1,2}(l)\right\} + (1-\lambda(n))^2\, r_{2,2}(l).$$

The autocorrelation matrix of $y$ is a Toeplitz matrix $\mathbf{R}$ having the autocorrelation lags $r(l)$ along its first row.

Thus far we have evaluated the terms $E[y_i(n) y_k^*(n+l)]$; what remains is to find an expression for the steady state combination parameter $\lambda(\infty)$. For this purpose we can use (15), noting that $y_i(n) = \mathbf{w}_i^H(n-1)\mathbf{x}(n-\Delta)$. All the terms in expression (15) are similar, and we need to evaluate

$$\gamma_{ik} = E\left[\tilde{\mathbf{w}}_i^H(n-1)\,\mathbf{x}(n-\Delta)\,\mathbf{x}^H(n-\Delta)\,\tilde{\mathbf{w}}_k(n-1)\right]. \qquad (70)$$

Due to the independence assumption we can rewrite (70), using the properties of the trace operator, as

$$\gamma_{ik} = \mathrm{Tr}\left\{E\left[\mathbf{x}(n-\Delta)\,\mathbf{x}^H(n-\Delta)\,\tilde{\mathbf{w}}_k(n-1)\tilde{\mathbf{w}}_i^H(n-1)\right]\right\} = \mathrm{Tr}\left\{\mathbf{R}_x\, E\left[\tilde{\mathbf{w}}_k(n-1)\tilde{\mathbf{w}}_i^H(n-1)\right]\right\} = \mathrm{Tr}\left\{\mathbf{R}_x\, \mathbf{K}_{i,k,0}(n-1)\right\}. \qquad (71)$$

We are now ready to find $\lambda(\infty)$ by substituting (71) and (67) into (15).
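As a small worked illustration of this last substitution (our sketch; the function name is ours and the low-SNR approximation (64) is assumed throughout), every $\gamma_{ik}$ collapses to a scalar and (15) becomes a simple ratio:

```python
def lambda_steady(mu1, mu2, Jmin, sigma_v2, N):
    """Steady state mixing parameter from (15), (67) and (71):
    gamma_ik = Tr{Rx K_{i,k,0}(inf)}
             = N * sigma_v^2 * mu_i * mu_k * Jmin / (mu_i + mu_k - mu_i*mu_k*sigma_v^2)."""
    def gamma(mi, mk):
        return N * sigma_v2 * mi * mk * Jmin / (mi + mk - mi * mk * sigma_v2)
    g11, g12, g22 = gamma(mu1, mu1), gamma(mu1, mu2), gamma(mu2, mu2)
    return (g22 - g12) / (g11 - 2.0 * g12 + g22)

# With mu1 = 0.5, mu2 = 0.05, Jmin = 1, sigma_v2 = 1 and N = 16 this gives a
# small negative value, in line with the behaviour observed in Section 6.
```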

The power spectrum of the output process y(n) is given by

$$P(f) = \lim_{K \to \infty} \frac{1}{K} E\left[|Y_K(f)|^2\right] = \sum_{l=-\infty}^{\infty} r(l)\, e^{-j2\pi l f}, \qquad (72)$$

where $Y_K(f)$ is the length $K$ discrete Fourier transform of the signal $y(n)$ and $f$ is the frequency. There are a number of methods to compute an estimate of the power spectrum from the correlation matrix of a signal. In this Chapter we have used the Capon method [20],

$$\hat{P}(f) = \frac{K}{\mathbf{a}^H(f)\, \mathbf{R}^{-1}\, \mathbf{a}(f)}, \qquad (73)$$

where $\mathbf{a}(f) = \left[1, e^{j2\pi f}, \ldots, e^{j2\pi(K-1)f}\right]^T$ and $\mathbf{R}$ is the $K \times K$ Toeplitz correlation matrix of the signal of interest. The Capon method was chosen because the signals we are interested in are sine waves in noise, and the Capon method gives a more distinct spectrum estimate than the Fourier transform based methods in this situation.
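A sketch of the Capon estimate (73) from a set of correlation lags (our illustration; SciPy's toeplitz builds the Hermitian Toeplitz matrix from the first column):

```python
import numpy as np
from scipy.linalg import toeplitz

def capon_spectrum(r, freqs):
    """Capon spectral estimate (73) from correlation lags r(0), ..., r(K-1)."""
    K = len(r)
    R = toeplitz(r)                       # K x K Hermitian Toeplitz matrix
    Rinv = np.linalg.inv(R)
    P = np.empty(len(freqs))
    for i, f in enumerate(freqs):
        a = np.exp(1j * 2 * np.pi * f * np.arange(K))  # steering vector a(f)
        P[i] = K / np.real(a.conj() @ Rinv @ a)        # eq. (73)
    return P

# Example usage: P = capon_spectrum(r_lags, np.linspace(0.0, 0.5, 512))
```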


6. Simulation Results

In this Section we present the results of our simulation study.

In order to obtain a practical algorithm, the expectation operators in both numerator and denominator of (7) have been replaced by exponential averaging of the type

$$P_{av}(n) = (1 - \gamma) P_{av}(n-1) + \gamma\, p(n), \qquad (74)$$

where $p(n)$ is the quantity to be averaged, $P_{av}(n)$ is the averaged quantity and $\gamma$ is the smoothing parameter. The averaged quantities were then used in (7) to obtain $\lambda(n)$. The curves shown in the Figures to follow are averages over 100 independent trials. We often show the simulation results and the theoretical curves in the same Figures. In several cases the curves overlap and are therefore indistinguishable.

Figure 5.

The true impulse response.

6.1 System Identification

We have selected the sample echo path model number one shown in Figure 5 from [10], to be the unknown system to identify and combined two 64 tap long adaptive filters.

In the Figures below the noisy blue line represents the simulation result and the smooth red line the theoretical result.

In the system identification example we use Gaussian white noise with unit variance as the input signal. The measurement noise is another white Gaussian noise with variance $\sigma_v^2 = 10^{-3}$. The step sizes are $\mu_1 = 0.005$ for the fast adapting filter and $\mu_2 = 0.0005$ for the slowly adapting filter. Figure 6 depicts the evolution of EMSE in time. One can see that the system converges fast in the beginning. The fast convergence is followed by a stabilization period between sample times 1000-7000, followed by another convergence to a lower EMSE level between sample times 8000-12000. The second convergence occurs when the mean squared error of the filter with the small step size surpasses the performance of the filter with the large step size. One can observe that there is good agreement between the theoretical and the simulated curves, so that they are difficult to distinguish from each other.

The combination parameter λ is shown in Figure 7. At the beginning, when the fast converging filter gives a smaller EMSE than the slowly converging one, λ is close to unity. When the slow filter catches up with the fast one, λ starts to decrease and obtains a small negative value at the end of the simulation example. The theoretical and simulated curves fit well.

In Figure 8 we show the time evolution of the mean square deviation of the combination in the same test case. Again one can see that the theoretical and simulation curves fit well.

Figure 6.

Time evolution of EMSE with $\mu_1 = 0.005$, $\mu_2 = 0.0005$ and $\sigma_v^2 = 10^{-3}$.

Figure 7.

Time evolution of λ with $\mu_1 = 0.005$, $\mu_2 = 0.0005$ and $\sigma_v^2 = 10^{-3}$.

6.2 Adaptive Beamforming

In the beamforming example we have used an 8-element linear array with half wavelength spacing. The noise power is $10^{-4}$ in this simulation example. The useful signal, which is 10 dB stronger than the noise, arrives from the broadside of the array. There are three strong interferers at -35°, 10° and 15°, with SNR₁ = 33 dB and SNR₂ = SNR₃ = 30 dB respectively. The step sizes of the adaptive combination are $\mu_1 = 0.05$ and $\mu_2 = 0.006$.

Figure 8.

Time evolution of MSD with $\mu_1 = 0.005$, $\mu_2 = 0.0005$ and $\sigma_v^2 = 10^{-3}$.

Figure 9.

The antenna pattern.

The steady state antenna pattern is shown in Figure 9. One can see that the algorithm has formed deep nulls in the directions of the interferers, while the response in the direction of the useful signal is equal to the number of antennas, i.e. 8.

The evolution of EMSE in this simulation example is depicted in Figure 10. One can see a rapid convergence at the beginning of the simulation example. Then the EMSE value stabilizes at a certain level and after a while a second convergence occurs. The dashed red line is the theoretical result and the solid blue line is the simulation result. One can see that the two overlap and are indistinguishable in black and white print.

Figure 10.

Time evolution of EMSE.

The time evolution of λ for this simulation example is shown in Figure 11. At the beginning λ is close to one, forcing the output signal of the fast adapting filter to the output of the combination. Eventually the slow filter catches up with the fast one and λ starts to decrease, obtaining at the end of the simulation example a small negative value, so that the output signal is dominated by the output signal of the slowly converging filter. One can see that the simulation and theoretical curves for the λ evolution are close to each other.

The signal to interference and noise ratio evolution is shown in Figure 12. One can see a fast improvement of SINR at the beginning of the simulation example, followed by a stabilization region. After a while a new region of SINR improvement occurs and finally the SINR stabilizes at an improved level. Again the theoretical result matches the simulation curve well, making the curves indistinguishable in black and white print.

6.3 Adaptive Line Enhancer

In order to illustrate the adaptive line enhancer application we have used length K = 32 correlation sequences to form K × K correlation matrices for the Capon method. The narrow band signals were just sine waves in our simulations.

Figure 11.

Time evolution of λ.

Figure 12.

Time evolution of SINR.

The input signal consists of three sine waves and additive noise with unit variance. The sine waves with normalized frequencies 0.1 and 0.4 have amplitudes equal to one, and the third sine wave, with normalized frequency 0.25, has amplitude equal to 0.5. The spectra of the input signal x(n) and the output signal y(n) are shown in Figure 13. The step sizes used were $\mu_1 = 0.5$ and $\mu_2 = 0.05$, the filter is N = 16 taps long and the delay Δ = 10.

Figure 13.

Line enhancer output signal spectrum.

Figure 14.

Line enhancer output signal autocorrelation.

In Figure 14 we show the correlation functions of input and output signals in the second simulation example. We can see that the theoretical correlation matches the correlation computed from simulations well.

Figure 15.

Evolution of EMSE of the two component filters and the combination in time.

Figure 16.

Line enhancer output signal spectrum.

The evolution of the excess mean square error of the combination together with that of the individual filters is shown in Figure 15. We see the fast initial convergence, which is due to the fast adapting filter. After the initial convergence there is a period of stabilization, followed by a second convergence between sample times 500 and 1500, when the error power of the slowly adapting filter falls below that of the fast one.

In our final simulation example (Figure 16) we use three unity amplitude sinusoids with frequencies 0.1, 0.2 and 0.4. We have increased the noise variance to 10 so that the noise power is 20 times the power of each of the individual sinusoids. The adaptive filter is N=16 taps long and the delay Δ = 10. The step sizes of the individual filters in the combined structure are μ 1 = 0.5 and μ 2 = 0.005. One can see that even in such noisy conditions there is still a reasonably good match between the theoretical and simulation results.

Advertisement

7. Conclusions

In order to make an LMS type adaptive algorithm work properly one has to select a suitable step size μ. To guarantee stability of the algorithm, the step size has to be smaller than $2/\omega_{max}$, where $\omega_{max}$ is the largest eigenvalue of the input signal autocorrelation matrix. Given that the stability condition is fulfilled, a large step size allows the algorithm to converge fast initially, but the mean square error in steady state remains large too. On the other hand, if one selects a small step size it is possible to achieve a small steady state error, but the initial convergence speed of the algorithm is reduced. In this Chapter we have investigated the combination of two adaptive filters, which is a new and interesting way of achieving fast initial convergence and a low steady state error of an adaptive filter at the same time, thus solving the trade-off one faces in step size selection. We looked at three applications of the technique: system identification, adaptive beamforming and adaptive line enhancement. In all three applications we saw that the combination worked as expected, allowing the algorithm to converge fast to a certain level and then, after a while, providing a second convergence to a lower mean square error value.

References

1. Aboulnasr T., Mayyas K. (1997). A robust variable step-size LMS-type algorithm: Analysis and simulations. IEEE Transactions on Signal Processing 45, 631-639.
2. Arenas-Garcia J., Figueiras-Vidal A. R., Sayed A. H. (2006). Mean-square performance of a convex combination of two adaptive filters. IEEE Transactions on Signal Processing 54, 1078-1090.
3. Armbruster W. (1992). Wideband Acoustic Echo Canceller with Two Filter Structure. In: Vanderwalle J., Boite R., Moonen M., Oosterlinck A. (eds), Signal Processing VI, Theories and Applications. Elsevier Science Publishers B.V.
4. Azpicueta-Ruiz L. A., Figueiras-Vidal A. R., Arenas-Garcia J. (2008a). A new least squares adaptation scheme for the affine combination of two adaptive filters. Proc. IEEE International Workshop on Machine Learning for Signal Processing, Cancun, Mexico, 327-332.
5. Azpicueta-Ruiz L. A., Figueiras-Vidal A. R., Arenas-Garcia J. (2008b). A normalized adaptation scheme for the convex combination of two adaptive filters. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, Las Vegas, Nevada, 3301-3304.
6. Bershad N. J., Bermudez J. C., Tourneret J. H. (2008). An affine combination of two LMS adaptive filters - transient mean-square analysis. IEEE Transactions on Signal Processing 56, 1853-1864.
7. Fathiyan A., Eshghi M. (2009). Combining several PBS-LMS filters as a general form of convex combination of two filters. Journal of Applied Sciences 9, 759-764.
8. Harris R. W., Chabries D. M., Bishop F. A. (1986). A variable step (VS) adaptive filter algorithm. IEEE Transactions on Acoustics, Speech and Signal Processing 34, 309-316.
9. Haykin S. (2002). Adaptive Filter Theory, Fourth Edition. Prentice Hall.
10. ITU-T (2009). Recommendation G.168: Digital Network Echo Cancellers. ITU-T.
11. Kim K., Choi Y., Kim S., Song W. (2008). Convex combination of affine projection filters with individual regularization. Proc. 23rd International Technical Conference on Circuits/Systems, Computers and Communications (ITC-CSCC), Shimonoseki, Japan, 901-904.
12. Kwong R. H., Johnston E. W. (1992). A variable step size LMS algorithm. IEEE Transactions on Signal Processing 40, 1633-1642.
13. Mandic D., Vayanos P., Boukis C., Jelfs B., Goh S. I., Gautama T., Rutkowski T. (2007). Collaborative adaptive learning using hybrid filters. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, Honolulu, Hawaii, 901-924.
14. Martinez-Ramon M., Arenas-Garcia J., Navia-Vazquez A., Figueiras-Vidal A. R. (2002). An adaptive combination of adaptive filters for plant identification. Proc. 14th International Conference on Digital Signal Processing, Santorini, Greece, 1195-1198.
15. Mathews V. J., Xie Z. (1993). A stochastic gradient adaptive filter with gradient adaptive step size. IEEE Transactions on Signal Processing 41, 2075-2087.
16. Ochiai K. (1977). Echo canceller with two echo path models. IEEE Transactions on Communications 25, 589-594.
17. Sayed A. H. (2008). Adaptive Filters. John Wiley and Sons.
18. Shin H. C., Sayed A. H. (2004). Variable step-size NLMS and affine projection algorithms. IEEE Signal Processing Letters 11, 132-135.
19. Silva M. T. M., Nascimento V. H., Arenas-Garcia J. (2010). A transient analysis for the convex combination of two adaptive filters with transfer of coefficients. Proc. IEEE International Conference on Acoustics, Speech, and Signal Processing, Dallas, TX, USA, 3842-3845.
20. Stoica P., Moses R. (2005). Spectral Analysis of Signals. Prentice Hall.
21. Trump T. (2009). An output signal based combination of two NLMS adaptive algorithms. Proc. 16th International Conference on Digital Signal Processing, Santorini, Greece.
22. Trump T. (2011a). Output signal based combination of two NLMS adaptive filters - transient analysis. Proceedings of the Estonian Academy of Sciences 60(4), 258-268.
23. Trump T. (2011b). Output statistics of a line enhancer based on a combination of two adaptive filters. Central European Journal of Engineering 1, 244-252.
24. Zhang Y., Chambers J. A. (2006). Convex combination of adaptive filters for a variable tap-length LMS algorithm. IEEE Signal Processing Letters 10, 628-631.
