Applications of Kalman Filters for Coherent Optical Communication Systems

In this chapter, we review various applications of Kalman filtering for coherent optical communication systems. First, we briefly discuss the principles of Kalman filter and its variations including extended Kalman filter (EKF) and adaptive Kalman filter (AKF). Later on, we illustrate the applicability of Kalman filters for joint tracking of several optical transmission impairments, simultaneously, by formulating the state space model (SSM) and detailing the principles. A detailed methodology is presented for the joint tracking of linear and nonlinear phase noise along with amplitude noise using EKF. Also, approaches to enhance the performance obtained by EKF by combining with other existing digital signal processing (DSP) techniques are presented. Frequency and phase offset estimation using a two stage linear Kalman filter (LKF)/EKF is also discussed. A cascaded structure of LKF and EKF by splitting the SSM to jointly mitigate the effects of polarization, phase and amplitude noise is also presented. The numerical analysis concludes that the Kalman filter based approaches outperform the conventional methods with better tracking capability and faster convergence besides offering more feasibility for real-time implementations.


Introduction
To meet the growing demand for bandwidth and capacity driven by ever-increasing data traffic, contemporary research in optical transmission is focused on developing Ethernet transmission at 400 Gbps and above [1][2][3][4][5]. The achievable information rates over optical fiber have increased rapidly over the past few decades. The technology breakthroughs behind this rapid increase in transmission capacity include the invention and development of erbium doped fiber amplifiers (EDFA), wavelength division multiplexing (WDM) systems, coherent detection, digital signal processing (DSP) techniques and forward error correction (FEC) schemes that ensure reliable transmission. The advent of coherent detection with subsequent DSP made it possible to deploy spectrally efficient higher order modulation formats and multiplexing techniques [6,7]. Moreover, it has made it feasible to digitally equalize the optical fiber transmission impairments [8], which are the main hurdle to increasing the bandwidth-distance product. The transmission capacity can be increased several times by employing complex modulation formats like m-ary quadrature amplitude modulation (with m = 4, 16, 64 and so on), and multiplexing techniques like polarization division multiplexing (PDM) and WDM. However, these formats are more vulnerable to the optical transmission impairments as well as to the carrier phase and frequency offset (FO). Hence, effective DSP algorithms for combating the channel impairments have been under active research over the past decade [8][9][10][11][12][13][14][15][16][17][18][19][20][21][22][23]. Consequently, coherent optical receivers are well developed and employ digital filters that allow for effective equalization of fiber linear impairments like chromatic dispersion (CD) and polarization mode dispersion (PMD) in the electrical domain [9].
Typically, CD can be compensated by either frequency or time domain filters using finite impulse response (FIR) or infinite impulse response (IIR) designs. Optical receivers exploiting polarization diversity should also compensate for the random fluctuations of the polarization state caused by the stochastic change of fiber birefringence. PMD compensation is widely performed using the constant modulus algorithm (CMA) [15] or the multi modulus algorithm (MMA) [16]. With these linear equalization techniques well developed, fiber nonlinearity remains the bottleneck for increasing the capacity and transmission reach [24].
Although encoding multiple information bits in a single symbol significantly increases the spectral efficiency, the signal becomes more sensitive to the amplified spontaneous emission (ASE) noise that is added in the optical amplifiers along the transmission link. Therefore, reliable transmission over long distances demands that the signal be launched into the optical fiber at higher power, to ensure a sufficiently high optical signal to noise ratio (OSNR) at the receiver. However, the maximum launch power per fiber span is constrained by the Kerr nonlinear effects, including self-phase modulation (SPM) and, in the case of WDM systems, cross phase modulation (XPM), which result in signal degradation [25]. This degrading impact of Kerr nonlinearity becomes more severe in multi-channel systems as the number of channels increases [26]. Moreover, the nonlinear phase noise (NLPN) resulting from the signal and ASE noise interactions at high launch powers deteriorates the signal quality further. On the other hand, signal transmission at low launch powers is limited by the ASE noise. Therefore, mitigation of fiber nonlinearity is vital to enhance the capacity while ensuring reliable transmission. Consequently, several nonlinear mitigation techniques have been proposed in recent years, of which digital backward propagation (DBP) [10], maximum likelihood sequence estimation (MLSE) based nonlinearity mitigation [17], spectral inversion [18][19][20], phase conjugated twin waves [21,22] and perturbation based approaches [23] have gained considerable attention. However, the real time implementation of these algorithms is extremely challenging owing to either the high computational effort required or the higher bandwidth consumption.
Although computationally complex, DBP has drawn significant attention owing to its capability of mitigating linear and nonlinear impairments simultaneously, provided the channel characteristics are known and the step size is sufficiently small when solving the inverse nonlinear Schrödinger equation (NLSE). Several strategies have been proposed in the literature to reduce the number of required DBP steps and to enhance the performance, either by including temporal correlations [27,28] or by optimizing the parameters [29]. Nevertheless, the performance of DBP deteriorates significantly in the presence of stochastic impairments like laser phase noise and NLPN [30][31][32]. Moreover, its applicability is limited to single channel systems [33,34].
Apart from fiber linear and nonlinear impairment compensation, digital carrier synchronization has also become an essential component of coherent receivers. It compensates the phase and frequency offsets between the transmitter laser and the local oscillator (LO), eliminating the need for a phase locked loop. Several carrier phase estimation (CPE) techniques have been proposed for suppressing the laser phase noise [35][36][37][38][39][40][41][42][43][44]. Owing to its low complexity, CPE is also under wide investigation for compensating the nonlinear phase shift caused by the Kerr effect, in addition to laser phase noise [30][44][45][46][47]. The combined performance of DBP and CPE has also been investigated, in order to reduce the number of DBP steps per span and thereby its complexity [31,48]. It was reported in [30,45] that the considered CPE methods outperform the DBP technique implemented using the asymmetric split step Fourier method (SSFM) with one step per span and without any parameter optimization. However, the accumulated ASE noise and the NLPN at high signal powers pose a challenging constraint on conventional CPE, limiting its nonlinear mitigation capability. Moreover, a phase unwrapping function [35] is typically required by CPE, which might increase the probability of cycle slips [49] and error propagation. Furthermore, the commonly employed CPE techniques have low tolerance towards the frequency offset (FO) between the transmitter laser and the LO. Therefore, a separate FO estimation module is necessary [50].
In recent years, Kalman filtering has gained considerable attention in the field of optical communication systems, owing to its potential to mitigate several optical transmission impairments simultaneously. The Kalman filter was first developed by R. E. Kalman in 1960. In [51], he presented a new approach to linear filtering and prediction problems by introducing the state space notation, where the random processes or signals to be estimated are represented as the output of a linear dynamic system perturbed by uncorrelated noise. This approach facilitates recursive computation of the optimal solution and greatly reduces the computational effort compared to the conventional Wiener filter, besides eliminating the memory problems. The Kalman filter computes the optimal solution recursively in the minimum mean square error (MMSE) sense. While the applicability of the Wiener filter is limited to stationary processes, the Kalman filter can also be applied to non-stationary processes. An added advantage of the Kalman filter is that it can be extended to nonlinear systems through an approximate linearization; the resulting filter is known as the extended Kalman filter (EKF). These properties have made Kalman filters attractive for numerous real-time applications in the fields of navigation, radar, mobile communications, speech signal processing and so forth. Currently, EKF is under wide investigation in coherent optical communication systems for tracking and mitigating linear and nonlinear phase noise, amplitude noise, phase and frequency offsets as well as polarization de-multiplexing [12,13,31][52][53][54][55][56][57][58][59][60][61][62][63][64][65][66].
Moreover, enhancing the performance obtained from EKF by incorporating with other existing techniques like DBP has also been studied in [31,58,59]. EKF requires only a few complex multiplications to recover one data symbol. Besides the advantage from low complexity, it further offers other benefits including faster convergence, joint tracking and compensation of fiber impairments. Therefore, it is worth discussing and reviewing the applications of Kalman filters for coherent optical communications in a nutshell.
This chapter is organized as follows: In Section 2, we discuss the principles of Kalman filter by describing the state space notation and the recursive equations. We further present some variations of the Kalman filter, namely, EKF and adaptive Kalman filter (AKF). Section 3 details the applications of EKF for coherent optical communications. We illustrate how to employ Kalman filtering for the joint tracking of several optical transmission impairments by formulating the state space model (SSM) and detailing the working principles. We also describe our numerical model and present the results to justify the theoretical findings. Finally, the chapter is concluded with a note on the key points, in Section 4.

The Kalman filter
A Kalman filter is an optimal recursive linear MMSE estimator that estimates the state of a linear dynamic system perturbed by noise. The true state of the system is not directly observable; instead, we obtain measurements or observations that are corrupted by noise. The goal of the Kalman filter is to recursively obtain an optimal estimate of the unknown state from the noisy observations. The stochastic process under estimation is modeled by a state space model (SSM), which facilitates the recursive nature of the Kalman filter. In the following, we present the general framework of Kalman filtering and also briefly discuss the principles of the EKF and AKF.

Principles of Kalman filter
Consider a discrete-time, linear, time varying system in the state space notation, given by Eqs. (1) and (2). Eq. (1) describes how the true state of the system evolves over time and is known as the state or process equation. Eq. (2) describes how the measurements are related to the states and is known as the measurement or observation equation. Here, k denotes the time instant, and x_k and y_k denote the state vector and the measurement vector, respectively. F_k denotes the state transition matrix that relates the states at the time instants k and k-1 in the absence of the process noise w_k. H_k denotes the measurement matrix that relates the states to the measurements in the absence of the measurement noise n_k. The process and measurement noise vectors w_k and n_k are assumed to be zero mean white Gaussian noise processes with covariance matrices Q_k and R_k, respectively. It is also assumed that the initial state x_0 at time instant 0 is a Gaussian random vector. Given the SSM and these assumptions, the objective of the Kalman filter is to obtain a linear MMSE estimate of x_k based on the observations {y_1, y_2, …, y_k}. The solution corresponds to the conditional mean [67] as given in Eq. (3). Here, E[·] denotes the expectation operator.
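For reference, a state space model consistent with the description above can be written as follows. This is the standard textbook form; it is only assumed to correspond to Eqs. (1)-(3), which are not reproduced in this text, and the noise index convention (w_k rather than w_{k-1}) is likewise an assumption.

```latex
\begin{align}
  x_k &= F_k\, x_{k-1} + w_k, & w_k &\sim \mathcal{N}(0,\, Q_k) \\
  y_k &= H_k\, x_k + n_k,     & n_k &\sim \mathcal{N}(0,\, R_k) \\
  \hat{x}_{k|k} &= E\left[\, x_k \mid y_1, y_2, \ldots, y_k \,\right]
\end{align}
```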
The Kalman filter computes the optimal state recursively, following a predictor-corrector structure: a prediction is computed prior to the availability of the observation at the current time instant k, and the prediction is updated when the observation at time instant k becomes available. Throughout this chapter, we follow the typical notation convention for the Kalman filter equations: any variable with subscript k|k-1 denotes a prediction or a priori estimate, and any variable with subscript k|k, or simply k, denotes the updated or a posteriori estimate. During the prediction step, the Kalman filter makes the best guess about the system's state based on its dynamics, prior to the availability of the current observation. The state prediction, denoted by x̂_{k|k-1}, is given in Eq. (4). The uncertainty associated with the prediction is given by the a priori error covariance matrix P_{k|k-1}, as in Eq. (5). Under the given assumptions and initial conditions, the conditional probability density function (pdf) p(x_k | y_1, y_2, …, y_{k-1}) is also Gaussian, where the a priori state estimate x̂_{k|k-1} and the a priori error covariance P_{k|k-1} are the mean and covariance of the distribution, as given in Eq. (6). Here, N denotes the normal or Gaussian distribution.

$$p(x_k \mid y_1, y_2, \ldots, y_{k-1}) \sim \mathcal{N}(\hat{x}_{k|k-1},\, P_{k|k-1}) \qquad (6)$$

During the update step, when the new observation at time k is available, the optimal estimate is computed as a linear combination of the prediction and the new information available from the current measurement, weighted by an optimal weighting matrix known as the Kalman gain. The update equations are summarized in Eqs. (7)-(10). The innovation, denoted by v_k, can be interpreted as the new information that is available in the observation y_k relative to all the past observations up to time instant k-1. It is computed as the difference between the actual observation and the predicted observation ŷ_{k|k-1}, and is given in Eq. (7). The Kalman gain, denoted by K_k, determines the extent to which the innovation should be taken into account in updating the a priori state estimate and is computed according to Eq. (8). Here, the superscript H denotes the Hermitian operator. The updated or a posteriori state estimate x̂_{k|k} and the a posteriori error covariance P_{k|k} are computed as given in Eqs. (9) and (10), respectively. The a posteriori pdf p(x_k | y_1, y_2, …, y_k) is also Gaussian, with mean and covariance given by the a posteriori state estimate x̂_{k|k} and the a posteriori error covariance P_{k|k}, respectively, as given in Eq. (11). Thus, the Kalman filter propagates the first and second order moments of the state distribution recursively for computing the optimal state estimate.
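The predictor-corrector recursion described above can be sketched for the simplest case of a real scalar state. This is a minimal illustration in the standard textbook form, not the chapter's own implementation; the function names, the random-walk test signal and the numerical values of f, h, q and r are assumptions chosen only for the demonstration.

```python
import random


def kalman_step(x_prev, p_prev, y, f, h, q, r):
    """One predict/update cycle of a real, scalar Kalman filter."""
    # Prediction: a priori state estimate and its error variance.
    x_pred = f * x_prev
    p_pred = f * p_prev * f + q
    # Update: innovation, innovation variance, Kalman gain.
    v = y - h * x_pred
    s = h * p_pred * h + r
    gain = p_pred * h / s
    # A posteriori state estimate and error variance.
    x_new = x_pred + gain * v
    p_new = (1.0 - gain * h) * p_pred
    return x_new, p_new


def run_demo(n=2000, seed=1):
    """Track a random walk observed in white noise (assumed toy model)."""
    rng = random.Random(seed)
    f, h, q, r = 1.0, 1.0, 1e-4, 0.25
    x_true, x_est, p = 0.0, 0.0, 1.0
    se_raw = se_kf = 0.0
    for _ in range(n):
        x_true = f * x_true + rng.gauss(0.0, q ** 0.5)
        y = h * x_true + rng.gauss(0.0, r ** 0.5)
        x_est, p = kalman_step(x_est, p, y, f, h, q, r)
        se_raw += (y - x_true) ** 2
        se_kf += (x_est - x_true) ** 2
    # Mean squared error of the raw observations, of the filter, and final P.
    return se_raw / n, se_kf / n, p
```

Calling `run_demo()` returns the mean squared error of the raw observations, that of the filtered estimate, and the final a posteriori error variance, which settles well below the measurement noise variance for this slowly varying state.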

Extended Kalman filtering
In Section 2.1, we addressed the problem of estimating the unknown state of a linear dynamic system from noisy observations. Now, we consider the filtering problem for nonlinear system dynamics (either the process model or the observation model, or both, being nonlinear). The Kalman filter solution can be adapted to nonlinear dynamic systems through an approximate linearization procedure, and the resulting filter is known as the EKF. Consider a nonlinear dynamic system described by the SSM given in Eqs. (12) and (13). Here, f_k(·) and h_k(·) denote the nonlinear state transition function and the measurement function, respectively. The nonlinear system dynamics can be linearized through a first order Taylor approximation at each time instant, around the most recent state estimate. This forms the basic idea of the EKF. Let A_k and B_k be the Jacobian matrices of f_k(·) and h_k(·), respectively, computed according to Eqs. (14) and (15). Under the assumptions and initial conditions discussed in the earlier section, the EKF recursive equations are summarized in Eqs. (16)-(20).
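The linearization idea can be illustrated with a real scalar EKF in which the Jacobians reduce to ordinary derivatives. The sketch below is an assumed toy problem, not a model from this chapter: the filter assumes a random-walk state (identity transition) and a sine-shaped measurement function, while the true state follows a slow deterministic drift. All names and parameter values are illustrative.

```python
import math
import random


def ekf_step(x_prev, p_prev, y, f, h, df, dh, q, r):
    """One scalar EKF cycle; df and dh are the derivatives (1x1 Jacobians)."""
    a = df(x_prev)                 # Jacobian of f at the latest updated estimate
    x_pred = f(x_prev)             # nonlinear state prediction
    p_pred = a * p_prev * a + q    # a priori error variance
    b = dh(x_pred)                 # Jacobian of h at the prediction
    v = y - h(x_pred)              # innovation uses the nonlinear h itself
    gain = p_pred * b / (b * p_pred * b + r)
    return x_pred + gain * v, (1.0 - gain * b) * p_pred


def demo(n=2000, seed=7, r=0.01, q=1e-4):
    """Compare EKF tracking with a memoryless inversion of each sample."""
    rng = random.Random(seed)
    x_est, p = 0.0, 1.0
    se_ekf = se_naive = 0.0
    for k in range(n):
        # Slowly drifting true state, kept where sin() is invertible.
        x_true = 0.5 + 0.3 * math.sin(2.0 * math.pi * k / 400.0)
        y = math.sin(x_true) + rng.gauss(0.0, math.sqrt(r))
        x_est, p = ekf_step(x_est, p, y,
                            lambda x: x, math.sin,      # f and h
                            lambda x: 1.0, math.cos,    # their derivatives
                            q, r)
        x_naive = math.asin(max(-1.0, min(1.0, y)))
        se_ekf += (x_est - x_true) ** 2
        se_naive += (x_naive - x_true) ** 2
    return se_ekf / n, se_naive / n
```

The state prediction and the innovation use the nonlinear functions directly, whereas the covariance propagation and the gain use only their local slopes, which is the first order Taylor approximation described above.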

Adaptive Kalman filtering
The Kalman filter computes the optimal solution, provided the process noise and measurement noise covariances, Q_k and R_k, respectively, are known a priori. However, in practice, precise knowledge of the noise statistics might not be available. The Kalman gain K_k takes into account the noise covariances Q_k and R_k to weigh the reliability of the predicted state x̂_{k|k-1} against that of the innovation v_k. Therefore, poor knowledge of the noise statistics might significantly degrade the filter performance and can even lead to divergence. To overcome these difficulties, the noise covariances can be estimated adaptively from the noise samples (for example, the innovation sequence) that are generated during the Kalman recursions at each time instant. This leads to adaptive Kalman filtering. The different approaches for adaptive filtering are classified into four types: Bayesian, maximum likelihood, correlation and covariance matching methods [68]. Here, we discuss the approach based on covariance matching [68,69] for adaptive estimation of the noise statistics. The basic idea behind this approach is that, for an optimal filter, the theoretical covariance of the innovation v_k, denoted by S_k and given in Eq. (22), should be consistent with the empirically estimated covariance given in Eq. (23). Here, m denotes the window size used for statistical smoothing.
Since the Kalman gain K_k depends on the ratio of the process and measurement noise covariances Q_k/R_k, rather than on their individual values, if either Q_k or R_k is known, the other can be adaptively estimated by satisfying the condition for covariance matching given in Eq. (24). When Q_k is known, R_k can be directly estimated from Eq. (24). Alternatively, when R_k is given, Q_k can be estimated by a scaling procedure to improve the robustness of the filter. The basic idea behind this scaling method is that if the estimated covariance of v_k, on the right hand side of Eq. (24), is much larger than the theoretical covariance, then Q_k should be increased (note that P_{k|k-1} depends on Q_k through Eq. (5)) to bring the theoretical covariance closer to the estimated one, and vice versa. Therefore, Q_k can be adaptively updated to balance any deviation between the theoretical and estimated innovation covariances by means of a scaling factor α_k, as given in Eq. (25). The estimate of Q_k, denoted by Q̂_k, is given in Eq. (26).
In the case of EKF, the same procedure can be followed for adaptive estimation of noise covariances, by replacing the measurement matrix with the Jacobian matrix.
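The covariance matching idea can be sketched for the scalar case as follows. The class and method names are hypothetical, and the exact forms of Eqs. (22)-(26) are not reproduced in this text, so the expressions below are assumptions: the empirical innovation covariance is a sliding-window average, R is obtained by subtracting the predicted-state contribution, and Q is rescaled by the square root of the ratio α, a commonly used covariance-scaling heuristic.

```python
from collections import deque


class CovarianceMatcher:
    """Sliding-window innovation statistics for a scalar adaptive Kalman filter."""

    def __init__(self, window):
        self.squared = deque(maxlen=window)   # last m values of |v_k|^2

    def push(self, innovation):
        self.squared.append(abs(innovation) ** 2)

    def innovation_cov(self):
        """Empirical innovation covariance averaged over the window."""
        return sum(self.squared) / len(self.squared)

    def estimate_r(self, h, p_pred):
        """R estimate when Q is known: empirical covariance minus H P H^H."""
        # Floored at zero so the estimate remains a valid variance.
        return max(self.innovation_cov() - abs(h) ** 2 * p_pred, 0.0)

    def scale_q(self, q_prev, h, p_pred, r):
        """Q estimate when R is known, via an assumed square-root scaling rule."""
        # Requires p_pred > 0 and h != 0.
        alpha = max(self.innovation_cov() - r, 0.0) / (abs(h) ** 2 * p_pred)
        return q_prev * alpha ** 0.5
```

For an EKF, the same sketch would be used with the Jacobian of the measurement function in place of `h`. A larger window gives smoother estimates at the cost of slower adaptation.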

Kalman filtering for coherent optical communications

Kalman filtering for carrier phase and amplitude noise estimation (CPANE)
Digital carrier phase estimation (CPE) has become an essential component of coherent optical receivers to recover the carrier phase perturbed by laser phase noise arising from the transmitter laser or LO [35][36][37][38][39][40][41][42][43]. Several CPE techniques have been developed in the literature based on feedback [39,40] or feed forward loops [35][36][37]. Depending on how the data phase is wiped off, they can be further classified into decision directed (DD) [35,40,42,46] or non-decision directed (NDD) methods [39,41,43]. NDD methods like Viterbi-Viterbi CPE [41] have gained considerable attention due to their ease of implementation. However, they employ an m-th power scheme to remove the data modulation and are therefore best suited for QPSK systems. For higher order QAM systems, DD-CPE methods exhibit better performance than NDD-CPE methods [35,42].
Apart from tracking the carrier phase, CPE, owing to its low complexity, can also be employed for compensating the nonlinear phase shift stemming from the Kerr nonlinear effects [30][44][45][46][47]. However, the nonlinear mitigation performance of CPE is limited in the presence of ASE noise and at high launch powers. Moreover, a phase unwrapping function is typically required for CPE, which might increase the probability of cycle slips [35,49]. Addressing these problems, we have proposed a CPANE algorithm using EKF in [12,53] for the joint mitigation of linear and nonlinear phase noise as well as ASE induced phase and amplitude distortions. Unlike CPE, EKF-CPANE estimates a complex quantity; therefore, no argument function is required, which eliminates the ambiguity associated with multiples of 2π and the consequent cycle slips.
Kalman filter based CPE has been introduced and numerically verified in [52]. From the numerical results, it was reported that the Kalman based phase estimation combined with DD equalizer in a feedback configuration outperforms the conventional CMA based approach [52]. CPE based on EKF was demonstrated and verified experimentally for QPSK and 16-QAM systems in [57]. In [55], EKF has been investigated for characterizing the laser phase and amplitude noise. EKF based carrier synchronization has also been verified experimentally, in combination with expectation maximization (EM). A carrier recovery scheme based on block estimation process with Kalman filter has been demonstrated in [56]. This approach was verified experimentally for 16 and 64-QAM signals. However, these Kalman filter based approaches estimate an argument which involves sine and cosine functions, computation of the Jacobian matrix and also require matrix multiplications and inversions, which increases the computational complexity. The proposed method in [12,53], estimates a complex quantity accounting also for the phase and amplitude distortions arising from the ASE noise in addition to the carrier phase. The variables in the SSM reduce to scalars and therefore, the vectors and matrices are reduced to scalars which will ease the computational effort. In the following, we first describe the general principles of CPE. Later, we explain our proposed CPANE algorithm illustrating its principles and implementation details using EKF.

Principles of CPE
Consider an m-ary QAM received signal on a single polarization, which is sampled and compensated for linear impairments. Assuming perfect linear equalization, the k-th input signal to the CPE can be written as in Eq. (27). Here, r_k denotes the k-th input signal to the CPE, a_k denotes the transmitted symbol, and n_k denotes the collective amplified spontaneous emission (ASE) noise, which is assumed to be a white Gaussian process. θ_k denotes the phase noise arising from the laser linewidth effects and fiber nonlinearity, which is typically modeled as a Wiener process and is given in Eq. (28). Figure 1(a) describes the input signal model to the CPE. It can be seen that after a_k is rotated by the phase noise θ_k, n_k adds further phase noise n'_k and amplitude noise ñ_k. The objective of CPE is to estimate the phase noise θ_k and de-rotate the received signal r_k in order to recover the transmitted symbol a_k, as given in Eq. (29) and Figure 1(b). However, since CPE only aims at an accurate estimate θ̂_k, the recovered transmitted symbol â_k still suffers from residual phase noise, amplitude noise, or both. For more details, please refer to [12].
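The limitation described above can be made concrete with a small simulation of the CPE input model. The sketch assumes the usual form r_k = a_k·exp(jθ_k) + n_k with θ_k a Wiener process of increment variance 2π·Δν·T_s; only the laser contribution to θ_k is modeled, and the QPSK alphabet, function names and parameter values are illustrative assumptions. Even when the receiver is given the exact θ_k, the de-rotated symbol still carries the full additive noise.

```python
import cmath
import math
import random


def simulate_cpe_input(n, linewidth_hz, symbol_rate, snr_db, seed=0):
    """Generate symbols a_k, phase noise theta_k, noise n_k and inputs r_k."""
    rng = random.Random(seed)
    qpsk = [cmath.exp(1j * (math.pi / 4 + i * math.pi / 2)) for i in range(4)]
    # Standard deviation of the Wiener phase increment per symbol.
    sigma_theta = math.sqrt(2.0 * math.pi * linewidth_hz / symbol_rate)
    # Noise standard deviation per real dimension for unit symbol energy.
    sigma_n = math.sqrt(10.0 ** (-snr_db / 10.0) / 2.0)
    theta = 0.0
    a, thetas, noise, r = [], [], [], []
    for _ in range(n):
        theta += rng.gauss(0.0, sigma_theta)
        ak = rng.choice(qpsk)
        nk = complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        a.append(ak)
        thetas.append(theta)
        noise.append(nk)
        r.append(ak * cmath.exp(1j * theta) + nk)
    return a, thetas, noise, r


def derotate(r, theta_hat):
    """Remove the estimated carrier phase from each received sample."""
    return [rk * cmath.exp(-1j * t) for rk, t in zip(r, theta_hat)]
```

With `a, thetas, noise, r = simulate_cpe_input(1000, 100e3, 28e9, 15.0)`, the residual `derotate(r, thetas)[k] - a[k]` has exactly the magnitude of `noise[k]`, which is the residual distortion that motivates estimating a complex quantity instead of a phase alone.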

Principles of CPANE
The effects of n_k, as discussed in Section 3.1.1, can be taken into account by reformulating Eq. (27) as given in Eq. (30), which forms the input signal to CPANE. Here, r_k is modeled as the transmitted symbol a_k being rotated by a complex quantity ψ_k that considers the effects of both phase and amplitude noise in its real and imaginary parts, respectively, as given in Eq. (30). The objective of CPANE is to recover θ_k more accurately by estimating the complex quantity ψ_k. The recovered transmitted symbol â_k is given in Eq. (32). Since ψ_k takes into account both the phase and amplitude distortions, â_k can be recovered more accurately by employing CPANE compared to CPE, as depicted in Figure 1(b). Moreover, unlike CPE, CPANE eliminates the need for a phase unwrapping function.

EKF-CPANE for joint mitigation of phase and amplitude noise
As discussed earlier, the CPANE algorithm can be employed for the joint mitigation of phase and amplitude noise. However, it requires reliable tracking of the complex quantity ψ_k, which can be accomplished by an EKF. The required SSM for the EKF can be formulated using Eqs. (33) and (34). Eq. (33) represents the state or process equation that describes the time evolution of ψ_k. Eq. (34) represents the observation equation that relates the states ψ_k to the observations r_k. Eq. (34) is similar to Eq. (30); however, for consistency of the filter, the measurement noise m_k has been taken into account. Here, all the variables in the SSM are scalar quantities. Comparing to the standard SSM for the EKF described in Section 2.2, it can be noted that the state transition is identity and the measurement matrix is the transmitted symbol a_k. Since it is a scalar, we call it the measurement weight (MW) for simplicity. The EKF recursive equations can be derived analogously by relating the SSM to the standard SSM of the EKF discussed in Section 2.2. Since the MW a_k, which is required to compute the update equations, is not known a priori, EKF-CPANE is DD. The required decisions of a_k, denoted by d_k, are obtained by de-rotating r_k with an average of the past updated estimates ψ̂_k over a window length of N, as given in Eq. (35). For more details on the prediction and update equations of EKF-CPANE, please refer to [12]. Figure 2 depicts the schematic of the EKF-CPANE algorithm, illustrating that the prediction ψ̂_{k|k-1} is the delayed version of the past updated estimate and that the current updated state ψ̂_{k|k} is the linear combination of the prediction ψ̂_{k|k-1} and the innovation v_k weighted by the Kalman gain K_k. The process of making the required decisions for the update step is also illustrated in Figure 2.
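The scalar recursion described above can be sketched as follows. This is an illustrative reading of the text rather than the implementation of [12]: the state model ψ_k = ψ_{k-1} + w_k, the observation model r_k = a_k·ψ_k + m_k, the use of division for de-rotation, the unit-power 16-QAM alphabet, the initial values and all function names are assumptions.

```python
import cmath
import math
import random
from collections import deque

# Unit-average-power 16-QAM alphabet (assumed normalization).
QAM16 = [complex(i, q) / math.sqrt(10) for i in (-3, -1, 1, 3) for q in (-3, -1, 1, 3)]


def decide(z, constellation=QAM16):
    """Nearest-neighbour hard decision."""
    return min(constellation, key=lambda s: abs(z - s))


def ekf_cpane(r, q_var, r_var, window=4, constellation=QAM16):
    """Track psi_k and return the recovered symbols r_k / psi_k."""
    psi, p = 1 + 0j, 1.0                   # initial estimate and error variance
    recent = deque([psi], maxlen=window)   # past updated estimates of psi
    recovered = []
    for rk in r:
        # Prediction: identity state transition, only the uncertainty grows.
        psi_pred, p_pred = psi, p + q_var
        # Pre-decision d_k from the windowed average of past estimates.
        d = decide(rk / (sum(recent) / len(recent)), constellation)
        # Update with the measurement weight d_k.
        v = rk - d * psi_pred
        gain = p_pred * d.conjugate() / (abs(d) ** 2 * p_pred + r_var)
        psi = psi_pred + gain * v
        p = (1.0 - (gain * d).real) * p_pred
        recent.append(psi)
        recovered.append(rk / psi)
    return recovered


def demo_symbol_errors(n=2000, seed=3):
    """Count symbol errors for 16-QAM with laser phase noise at 25 dB SNR."""
    rng = random.Random(seed)
    sigma_theta = math.sqrt(2.0 * math.pi * 100e3 / 28e9)
    noise_var = 10.0 ** (-25.0 / 10.0)
    sigma_n = math.sqrt(noise_var / 2.0)
    theta, sent, received = 0.0, [], []
    for _ in range(n):
        theta += rng.gauss(0.0, sigma_theta)
        a = rng.choice(QAM16)
        m = complex(rng.gauss(0.0, sigma_n), rng.gauss(0.0, sigma_n))
        sent.append(a)
        received.append(a * cmath.exp(1j * theta) + m)
    recovered = ekf_cpane(received, q_var=sigma_theta ** 2, r_var=noise_var)
    return sum(decide(z) != a for z, a in zip(recovered, sent))
```

Because every quantity is a scalar, each symbol costs only a handful of complex multiplications and one division, and no argument or phase unwrapping function appears anywhere in the loop.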

Numerical analysis of EKF-CPANE
The performance of the EKF-CPANE algorithm in mitigating the laser phase noise and fiber nonlinearity, besides the ASE induced phase and amplitude distortions, has been verified through numerical simulations on single channel systems in [12,53] and on multi-channel systems in [54]. Here, for the reader's convenience, we briefly discuss the numerical model and present a few simulation results reproduced from [12]. The numerical model of the polarization multiplexed (PM) m-QAM coherent transmission system, including a DSP module at the receiver, is depicted in Figure 3. Here, we consider a PM-m-QAM transmitter with m = 16 and 64, operated at 28 and 18.667 GBaud, respectively. These signals are transmitted through a standard single mode fiber (SSMF) link at different launch powers. The SSMF has the following parameters: attenuation coefficient (α) = 0.2 dB/km, dispersion coefficient (D) = 16 ps/(nm·km), and nonlinearity coefficient (γ) = 1.2 /(W·km). The span length of the SSMF is 80 km, and 12 and 6 spans are considered for 16-QAM and 64-QAM, respectively, yielding total transmission distances of 960 and 480 km. The span losses are compensated by an EDFA with a gain of 16 dB and a noise figure (NF) of 4 dB. For simplicity, PMD has been neglected in this study. At the receive end, we employ a dual polarization coherent receiver followed by a DSP module. The laser linewidth of the LO has been set to 100 kHz. After coherent detection, the signals are re-sampled to twice the symbol rate and linearly compensated. The signals are then down-sampled to the symbol rate and processed by the EKF-CPANE for mitigating linear and nonlinear phase noise besides amplitude noise. The performance of EKF-CPANE is compared with that of a feedforward DD-CPE [46], a feedback DD phase locked loop (DD-PLL) [36] and an NDD universal CPE (U-CPE) [39]. The noise covariances for EKF-CPANE and the tap length or step size for DD-CPE, DD-PLL and U-CPE are set to optimize the performance.
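The link parameters quoted above can be cross-checked with a few lines of arithmetic. The helper names are hypothetical, and the raw line rate computed here is not stated in the text; it simply follows from the symbol rate, the bits per symbol and the two polarizations, before any FEC or framing overhead.

```python
import math


def span_loss_db(alpha_db_per_km, span_km):
    """Fiber attenuation accumulated over one span, in dB."""
    return alpha_db_per_km * span_km


def link_length_km(span_km, spans):
    """Total transmission distance."""
    return span_km * spans


def raw_bit_rate_gbps(baud_gbd, m, polarizations=2):
    """Raw line rate of an m-QAM signal on the given number of polarizations."""
    return baud_gbd * math.log2(m) * polarizations
```

With the values from the text, the span loss of 0.2 dB/km over 80 km equals the 16 dB EDFA gain, the two links are 960 km and 480 km long, and both the 28 GBaud PM-16-QAM and the 18.667 GBaud PM-64-QAM configurations carry a raw line rate of about 224 Gb/s, so the two formats are compared at essentially the same bit rate.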
The bit error rate (BER) performance of the considered algorithms is evaluated and a Q-factor is computed as Q = 20 log10(√2 · erfcinv(2 · BER)). The Q-factor vs. launch power curves for 16-QAM and 64-QAM are depicted in Figure 4(a) and (b), respectively. It can be seen that EKF-CPANE exhibits better performance compared to DD-CPE, DD-PLL and U-CPE in both the linear and nonlinear regimes. This performance enhancement is most visible in comparison with the DD-CPE method. For PM-64-QAM, it can also be seen that the DD-CPE experiences cycle slips caused by the error propagation of wrong decisions, visible in Figure 4(b) at launch powers ranging from −2 to 1 dBm [12]. Since the performance of DD algorithms strongly depends on the pre-decisions made by the algorithm, we study the impact of ideal error free decisions on their performance by replacing the pre-decisions d_k with the true data symbols a_k. The algorithms in this ideal case are denoted by IEKF-CPANE, IDD-CPE and IDD-PLL. It can be seen from Figure 4(a) and (b) that the IEKF-CPANE shows significant performance enhancement and better tolerance towards linear and nonlinear phase noise as well as amplitude noise, compared to IDD-CPE and IDD-PLL. Unlike for EKF-CPANE, no notable improvement is obtained for the DD-CPE and DD-PLL between their practical and ideal cases. Although the ideal case, where the true symbols a_k are already known, is not possible in practice, it should be noted that the performance of EKF-CPANE can be further improved by reducing the number of decision errors, which is discussed further in Section 3.2.
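The conversion from BER to Q-factor can be sketched with the Python standard library alone. The sketch assumes the conventional definition Q = 20·log10(√2·erfcinv(2·BER)) in dB and uses the identity √2·erfcinv(2·BER) = −Φ⁻¹(BER), where Φ⁻¹ is the quantile function of the standard normal distribution; the function name is hypothetical.

```python
import math
from statistics import NormalDist


def q_factor_db(ber):
    """Q-factor in dB for a bit error rate strictly between 0 and 0.5."""
    if not 0.0 < ber < 0.5:
        raise ValueError("BER must lie strictly between 0 and 0.5")
    # sqrt(2) * erfcinv(2 * BER) equals minus the standard normal quantile at BER.
    q_linear = -NormalDist().inv_cdf(ber)
    return 20.0 * math.log10(q_linear)
```

For example, `q_factor_db(1e-3)` evaluates to approximately 9.8 dB. Taking the quantile at BER itself, rather than at 1 − BER, avoids loss of precision for very small error rates.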

EKF and DBP for fiber nonlinear mitigation
In Section 3.1, we have described how the EKF can be employed for the joint mitigation of phase and amplitude noise. From the numerical results discussed in Section 3.1.4, it can be concluded that the EKF-CPANE algorithm shows promising results in mitigating the linear and nonlinear phase noise as well as amplitude noise simultaneously, with low computational effort. Although EKF-CPANE outperforms several other considered CPE methods, its effectiveness in mitigating fiber nonlinear effects can be further enhanced by reducing the number of errors in the pre-decisions d_k. We have proposed a weighted innovation approach (WIA) in [12], where the innovation is computed as a weighted combination of the two nearest likely decisions. Although a gain of ≈ 0.3 dB in the Q-factor can be obtained compared to conventional EKF-CPANE in the linear regime, no notable improvement can be seen in the nonlinear regime. On the other hand, DBP has emerged as an effective technique for mitigating linear and nonlinear impairments simultaneously, provided the channel parameters are known a priori and the step size is sufficiently small. However, DBP can compensate only the deterministic impairments of self-phase modulation, and its performance deteriorates significantly in the presence of stochastic impairments like laser phase noise, ASE and NLPN. Moreover, the huge computational effort required keeps it far from real-time implementation. Nevertheless, employing a few DBP steps prior to EKF-CPANE would yield an enhanced tolerance towards nonlinearities, since DBP is well capable of mitigating deterministic impairments and the EKF takes into account the stochastic nature of ASE noise and NLPN. Partially compensating the fiber nonlinear effects with a few DBP steps prior to the EKF would result in improved pre-decisions and thereby facilitate the residual compensation of nonlinearities along with amplitude and phase noise.
These theoretical findings are verified through numerical simulations on both single [31] and multichannel systems [58].
In [31], it was reported that EKF-CPANE outperforms the asymmetric split step Fourier method (ASSFM) based one step per span (OSPS) DBP with optimized nonlinear coefficient γ (ODBP), for single channel systems, for transmission on both SSMF and non-zero dispersion shifted fiber (NZ-DSF). A detailed investigation has also been carried out on the combined performance of DBP and EKF-CPANE, with an analysis of the influence of the nonlinear coefficient and the step size of DBP when employed prior to EKF-CPANE. The numerical model employed in this study is similar to the one discussed in Section 3.1.4, with a few parameter changes, namely an NF of 5 dB and an LO linewidth of 500 kHz. The influence of the DBP step size on the combined performance of DBP and EKF-CPANE for both SSMF and NZ-DSF transmission is illustrated in Figure 5(a) [31]. Here, OCDBP denotes the optimized DBP, which has a nonlinear coefficient different from ODBP when employed prior to EKF. A noteworthy result is that at a launch power of 3 dBm and a transmission distance of 960 km, a gain of 1 dB in the Q-factor can be obtained by employing 0.3 DBP steps per span prior to EKF-CPANE, for both SSMF and NZ-DSF transmission. The deployment of a few DBP steps prior to EKF-CPANE thus further enhances its performance, at the expense of additional computational effort.
For multi-channel systems, a detailed analysis has also been performed in [58] on the combined performance of DBP and EKF for mitigation of inter and intra channel nonlinearities besides phase and amplitude noise. Here, the DBP is employed by considering the temporal correlations between neighboring signal samples and is termed correlated DBP (CDBP) [27,28]. This approach improves the accuracy in computing the nonlinear phase shift and thereby enhances the nonlinear mitigation performance. Since the optimization of the nonlinear coefficient plays a vital role in the performance of DBP, we proposed an amplitude dependent optimization (AO) [58] of the nonlinear coefficient, according to the discrete amplitude levels present in higher order modulation formats like 16-QAM. The combined performance of AO-CDBP and EKF-CPANE for WDM systems with a varying number of channels has been investigated in [58]. Analogous to the single channel systems, the combination of AO-CDBP and EKF also yields an improved performance for the WDM case. However, as the impact of cross phase modulation (XPM) grows with the number of channels, the gain obtained from the combination starts to vanish, which can be observed in Figure 5(b).

EKF for mitigation of nonlinearities in dispersion managed links
Since the advent of coherent detection and DSP for coherent optical receivers, CD can be effectively compensated by digital equalization in the electric domain, thereby eliminating the need for dispersion compensating fibers (DCF). However, nonlinear mitigation in dispersion managed (DM) links is also vital in order to upgrade existing links. Although the computational complexity of DBP is quite high, for DM links the DBP algorithm can be simplified by assuming that the nonlinear behavior repeats itself every span, so that the total nonlinearity after N spans of transmission can be approximated as N times the nonlinearity from a single span [70]. This is termed distance folded DBP [70] and it reduces the complexity by a factor of N, assuming the step size of DBP is equal to the span length and the span length is constant. Assuming the dispersion is fully compensated in each span, only the nonlinear term in the nonlinear Schrödinger equation (NLSE) needs to be solved, which can be done in the time domain. This avoids the Fourier and inverse Fourier transformation (FFT/IFFT) pairs and reduces the computational cost of DBP drastically. We call this approach single step nonlinearity mitigation (SSNL). Similar to the unmanaged links discussed in Section 3.2, we investigated the combined performance of SSNL and EKF-CPANE for mitigating the fiber nonlinearity in DM links [59].
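The time-domain nonlinear step that SSNL relies on can be sketched as follows. This is only a minimal illustration under stated assumptions, not the implementation of [59]: the function names, the effective-length formula, the `power_scale_w` mapping from sample magnitude to optical power, the tuning factor `xi` and the sign convention of the Kerr phase are all assumptions introduced here. The per-span nonlinear phase is simply multiplied by the number of spans N, following the distance folding argument.

```python
import cmath
import math


def effective_length_km(alpha_db_per_km, length_km):
    """Effective nonlinear length L_eff = (1 - exp(-a L)) / a, with a in 1/km."""
    a = alpha_db_per_km * math.log(10) / 10.0  # dB/km -> 1/km (power attenuation)
    return (1.0 - math.exp(-a * length_km)) / a


def ssnl_compensate(samples, power_scale_w, gamma, l_eff_km, n_spans, xi=1.0):
    """Undo the accumulated Kerr phase with one time-domain rotation per sample.

    samples       : complex samples, dispersion assumed fully compensated
    power_scale_w : assumed factor mapping |sample|^2 to optical power in W
    gamma         : nonlinear coefficient in 1/(W km)
    xi            : hypothetical tuning factor for the nonlinear coefficient
    Assumed convention: the fiber adds +gamma * P * L_eff of phase per span.
    """
    out = []
    for s in samples:
        phase = xi * gamma * l_eff_km * n_spans * power_scale_w * abs(s) ** 2
        out.append(s * cmath.exp(-1j * phase))  # no FFT/IFFT pair is needed
    return out
```

Because the rotation depends only on the instantaneous power of each sample, the cost is one complex multiplication per sample, which is where the complexity saving over FFT based DBP comes from.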
The numerical model of the PM-16-QAM coherent transmission system over a DM link [59] is depicted in Figure 6. Here, a fully compensated periodic DM link with several spans has been considered. Each span consists of 80 km of SSMF and 17 km of dispersion compensating fiber (DCF). The SSMF has the following parameters: α = 0.2 dB/km, D = 17 ps/nm-km, γ = 1.2/W-km. The parameters of the DCF are: α = 0.5 dB/km, D = −80 ps/nm-km and γ = 5/W-km. In this study, the input power to the DCF was set to half of the input power to the SSMF. Therefore, the gains of EDFA1 and EDFA2 are adjusted accordingly to compensate the span losses. The NF of both EDFAs is set to 4 dB. As described earlier, after coherent detection, the signals are further processed by the SSNL and EKF-CPANE algorithms for mitigating fiber nonlinearities. It has been reported in [59] that the combination of SSNL and EKF yields an improved tolerance towards nonlinearities of up to 2 dB for a transmission distance of 1200 km at a BER of $2 \times 10^{-2}$. Further, the combination increases the transmission reach by ≈ 250 km at a launch power of 3 dBm and a BER of $2 \times 10^{-2}$, as depicted in Figure 7.
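The span parameters quoted above can be checked with a few lines of arithmetic. The sketch below takes D in ps/(nm km), the usual unit for dispersion, and assumes that EDFA1 sits between the SSMF and the DCF while EDFA2 follows the DCF. That ordering is an assumption made here for illustration and is not confirmed by the text.

```python
import math

# Span parameters quoted in the text: length (km), loss (dB/km), D (ps/(nm km))
ssmf_len_km, ssmf_alpha, ssmf_d = 80.0, 0.2, 17.0
dcf_len_km, dcf_alpha, dcf_d = 17.0, 0.5, -80.0

# Accumulated dispersion per span in ps/nm; zero means full compensation
residual_dispersion = ssmf_len_km * ssmf_d + dcf_len_km * dcf_d

ssmf_loss_db = ssmf_len_km * ssmf_alpha   # loss of the SSMF section
dcf_loss_db = dcf_len_km * dcf_alpha      # loss of the DCF section
half_power_db = 10 * math.log10(2)        # "half the power" expressed in dB

# Assumed amplifier placement (hypothetical): EDFA1 before the DCF, EDFA2 after it
edfa1_gain_db = ssmf_loss_db - half_power_db  # DCF input is 3 dB below SSMF input
edfa2_gain_db = dcf_loss_db + half_power_db   # restore the SSMF launch power

print(f"residual dispersion: {residual_dispersion:.1f} ps/nm")
print(f"EDFA1 gain: {edfa1_gain_db:.2f} dB, EDFA2 gain: {edfa2_gain_db:.2f} dB")
```

Under these assumptions the two gains add up to the total span loss, so the launch power into each SSMF section stays constant from span to span.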

Kalman filtering for polarization de-multiplexing
An effective way to double the transmission capacity is to employ PDM, which allows the transmission of two information signals simultaneously on the orthogonal polarization states of the same optical carrier wave. However, due to fiber birefringence, the state of polarization is not preserved during propagation along the fiber, which leads to crosstalk at the receiver. In coherent receivers, CMA [15] or MMA [16] is commonly employed in order to align the polarization states and fully recover the transmitted signal. However, CMA and MMA suffer from low convergence speed and the singularity problem [71]. Moreover, a separate phase estimation scheme is required to track the laser phase noise. Since the Kalman filter allows simultaneous tracking of several state variables, provided a precise SSM is available, the Kalman filter and its variations including the radius directed linear Kalman filter (RD-LKF), EKF and UKF have been widely investigated for tracking the complex elements of the Jones matrix along with the carrier phase [61][62][63].

RD-LKF, EKF and UKF for joint tracking of polarization state and phase noise
An EKF has been proposed in [61] for joint tracking of the polarization state and phase noise. It has also been reported that the EKF shows faster convergence than the conventional approach based on CMA and VV-CPE [61]. However, the variables in the state vector are restricted to real values, which can lead to singularity problems or divergence of the filter [63], besides increasing the dimensions of the vectors and matrices in the Kalman recursive equations. A polarization state tracking scheme using a Kalman filter, which is immune to phase/frequency offset, has been introduced in [62] and is termed RD-LKF. Although it shows faster convergence compared to CMA, this method needs significant modifications before it can be applied to higher order QAM. Moreover, it cannot track the carrier phase simultaneously with the polarization state. The joint tracking of polarization state and carrier phase using EKF has been experimentally verified in [57]. A reduced SSM using UKF has been introduced in [63], which facilitates the joint tracking of polarization state and phase noise. Here, the variables of the state vector are considered to be complex valued. This approach exhibits better performance compared to EKF at high OSNRs, at the expense of additional computational effort.

Adaptive cascaded Kalman filtering (A-CKF) for polarization de-multiplexing with simultaneous tracking of phase and amplitude distortions
Cascaded Kalman filtering (CKF), a series connection of an EKF and a linear Kalman filter (LKF) for joint tracking of phase and amplitude distortions besides the polarization state, has been proposed in [13]. By splitting the conventional SSM into a linear and a nonlinear SSM, the inaccuracies in the linearization of the SSM as a whole can be reduced. CKF thus exhibits enhanced performance without increased computational cost compared to approaches like UKF [63] and radius directed (RD) LKF [62]. Since the optimal performance of the Kalman filter depends on the noise covariances, we proposed an adaptive CKF (A-CKF) [13] that adapts the process noise covariance recursively using the covariance matching method described in Section 2.3.

Principles of A-CKF
The transmitted and received signals in the presence of phase noise and polarization rotation can be related as given in Eq. (36). Here, $t_k$, $r_k$ and $n_k$ denote the transmitted signal, the received signal and the ASE noise in dual polarization, respectively. $J_k$ denotes the Jones matrix, $\theta_k$ denotes the phase noise and α denotes the loss factor. Assuming negligible polarization dependent loss, the inverse of the Jones matrix can be described as in Eq. (37) and the elements of the Jones matrix satisfy $J_{yy} = J_{xx}^{*}$ and $J_{yx} = -J_{xy}^{*}$ [15]. From now on, for simplified notation, we omit the time variable k in this section. The observation model in Eq. (36) can be rewritten in dual polarization as given in Eq. (38). Here, the subscripts x and y denote the x and y polarizations, respectively. In the conventional approach to tracking the phase and the polarization effects using EKF, the state vector consists of the parameters a, b, c and d. However, we reduce the dimensions of the state vector and of the other matrices in the SSM by considering complex elements in the state vector, given by $S(k) = [J_{xx}\ J_{xy}\ \varphi]$. Moreover, we split the nonlinear observation model given in Eq. (38) into a nonlinear and a linear observation model, where we employ an EKF-CPANE for the joint tracking of phase and amplitude distortions and an LKF for tracking the complex elements of the Jones matrix. The process noise covariance is adaptively updated by employing the covariance matching method as described in Section 2. For more details on the A-CKF algorithm, please refer to [13].
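The linear part of such a cascade can be illustrated with a small complex-valued LKF that tracks the two Jones elements. The sketch below is not the A-CKF of [13]. It assumes a forward model r_x = J_xx t_x + J_xy t_y + n_x and r_y = −J_xy* t_x + J_xx* t_y + n_y with known training symbols, assumes that the phase noise has already been removed by the preceding stage, and uses a fixed process noise. All function names and noise settings are hypothetical.

```python
import cmath
import math
import random


def lkf_update(state, cov, h, z, r_var):
    """One Kalman measurement update for a scalar observation z = h . state + noise."""
    innov = z - (h[0] * state[0] + h[1] * state[1])
    # cov @ h^H (column vector)
    ph = [cov[i][0] * h[0].conjugate() + cov[i][1] * h[1].conjugate() for i in range(2)]
    s = (h[0] * ph[0] + h[1] * ph[1]).real + r_var  # innovation variance
    gain = [ph[0] / s, ph[1] / s]
    new_state = [state[i] + gain[i] * innov for i in range(2)]
    hp = [h[0] * cov[0][j] + h[1] * cov[1][j] for j in range(2)]  # h @ cov (row vector)
    new_cov = [[cov[i][j] - gain[i] * hp[j] for j in range(2)] for i in range(2)]
    return new_state, new_cov


def track_jones(tx, ty, rx, ry, r_var=5e-3, q_var=1e-6):
    """Track the state [J_xx, J_xy] with a random-walk process model."""
    state = [1 + 0j, 0j]
    cov = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for k in range(len(tx)):
        cov[0][0] += q_var  # prediction step: state unchanged, covariance grows
        cov[1][1] += q_var
        # x polarization: r_x = J_xx t_x + J_xy t_y + n_x
        state, cov = lkf_update(state, cov, [tx[k], ty[k]], rx[k], r_var)
        # y polarization, conjugated so that it is linear in [J_xx, J_xy]
        h2 = [ty[k].conjugate(), -tx[k].conjugate()]
        state, cov = lkf_update(state, cov, h2, ry[k].conjugate(), r_var)
    return state


def simulate(n=200, seed=3):
    """Generate QPSK training data passed through a fixed unitary Jones matrix."""
    rng = random.Random(seed)
    jxx = math.cos(0.4) * cmath.exp(0.3j)
    jxy = math.sin(0.4) * cmath.exp(-0.7j)
    qpsk = [cmath.exp(1j * (math.pi / 4 + i * math.pi / 2)) for i in range(4)]
    tx, ty, rx, ry = [], [], [], []
    for _ in range(n):
        a, b = rng.choice(qpsk), rng.choice(qpsk)
        nx = complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
        ny = complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
        tx.append(a)
        ty.append(b)
        rx.append(jxx * a + jxy * b + nx)
        ry.append(-jxy.conjugate() * a + jxx.conjugate() * b + ny)
    return tx, ty, rx, ry, (jxx, jxy)
```

Keeping the state complex valued means only two elements are tracked instead of four real parameters, and the relations between the Jones elements let both polarizations contribute observations of the same two unknowns.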
Numerical investigations on both back-to-back (BTB) and transmission scenarios have been carried out in [13] on the variations of the Kalman filter including EKF, UKF, CKF and A-CKF for tracking the polarization state and phase noise jointly, and the results are compared to the conventional MMA algorithm. Since the MMA can track only the polarization state, it is accompanied by a DD-CPE algorithm for phase noise mitigation. It can be concluded from [13] that the CKF and A-CKF outperform the rest of the considered algorithms with a better tolerance towards polarization rotations, phase and amplitude noise. This can be attributed to the reduction of the inaccuracies that arise from linearizing the SSM as a whole, which CKF/A-CKF achieve by splitting the SSM, compared to EKF and UKF. The benefit of the adaptive computation of the process noise covariance compared to the CKF can be observed at rotation angular frequencies of 400 Mrad/s and higher in the BTB case and at higher launch powers of 5 dBm in the transmission case [13].

Kalman filtering for joint compensation of phase and frequency offset
Apart from digital equalization, carrier synchronization is also vital to mitigate the phase and frequency offsets between the transmitter laser and free running LO. Since the CPE methods have low tolerance towards FO, which may go as high as ±5 GHz, a separate FO estimation (FOE) is required. Consequently, several FOE algorithms have been proposed in the literature that are either based on the phase increments between adjacent symbols [72] or spectrum based methods [73]. These methods are either not accurate for higher order QAM systems or computationally complex.

LKF and EKF for FO estimation
A novel FOE algorithm using Kalman filtering has been proposed and numerically verified in [60]. The simulation results in [60] conclude that the Kalman filter achieves faster convergence and outperforms conventional FO estimation at low OSNR. In [64], FOE schemes based on blind and training data, using LKF and EKF, have been proposed for QPSK systems. These Kalman based FOE algorithms are evaluated both numerically and experimentally, and are compared to FFT based FOE methods. The investigations in [64] report that the training data based Kalman FOE methods show better accuracy in estimating the FO in the case of fewer symbols and high OSNR, compared to FFT based methods. However, a separate phase estimation has to be carried out after FO compensation.

Two stage EKF for joint compensation of FO, phase and amplitude noise
The Kalman based FOE algorithms proposed in [60,64] can compensate only for the FO and therefore, the carrier phase has to be recovered separately after FO compensation. In [65], a two stage EKF method based on training data has been proposed for joint tracking of FO, phase and amplitude noise. In the first stage, a coarse estimate of the FO is obtained using a set of training data symbols, following the training data scheme proposed in [64]. In the second stage, the CPANE algorithm is employed to jointly compensate for the residual FO, phase and amplitude noise.

Principles of two stage EKF
After linear equalization, the received signal on a single polarization, with frequency and phase offset, can be represented as given in Eq. (39). Here, $r_k$ and $a_k$ denote the received and transmitted symbol, respectively, at the time instant k. w denotes the FO between the transmitter laser and the LO. $T_s$ denotes the symbol duration. $\phi_k$ and $n_k$ denote the phase noise and ASE noise, respectively. In order to obtain the measurement for the FO, the first step is to wipe off the data phase, which is performed by employing training data. Then the phase difference between adjacent symbols [64] is computed, which gives the measurement of the FO denoted by $m_k$ in Eq. (40). Here, $v_k$ is given by $\phi_k - \phi_{k-1}$ and follows a Gaussian distribution. By considering the observation model given in Eq. (40) for the EKF, a coarse FO estimation is performed in the first stage using a training data sequence. The input signal to the second stage, after coarse FO estimation, is given in Eq. (41). Here, the CPANE algorithm is employed to compensate the residual FO, phase noise and ASE induced phase and amplitude distortions. Figure 8 illustrates the basic structure of this two stage EKF [65]. A similar two stage model using LKF has also been evaluated in [65] and compared to EKF.
$r_k = a_k e^{j(w k T_s + \phi_k)} + n_k$ (39)
The BER vs OSNR curves for LKF and EKF after the second stage, using 200 and 500 training data symbols, for a FO of 1 GHz, are depicted in Figure 9 [65]. It can be concluded from [65] that both LKF and EKF show faster convergence irrespective of the number of training data symbols utilized in the first stage. However, since the EKF estimates a complex quantity, it also helps to compensate for the amplitude noise and therefore outperforms LKF. Moreover, as discussed earlier, EKF does not require any angle operations, unlike LKF, so the few additional computations required by the EKF compared to LKF are offset by its better tracking capability.
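The first stage described above (data wipe-off, phase difference between adjacent symbols, then Kalman filtering of that measurement) can be sketched in its linear, angle based form as follows. This is a hedged illustration rather than the algorithm of [64] or [65]. It assumes the measurement model m_k = w T_s + v_k with w as an angular frequency, a constant-state process model, QPSK training symbols and a symbol rate of 28 GBaud. The function names and all noise settings are hypothetical.

```python
import cmath
import math
import random


def coarse_fo_lkf(rx, training, q_var=1e-10, r_var=1e-2):
    """Estimate the phase increment per symbol, x = w * Ts (rad), with a scalar LKF."""
    x, p = 0.0, 1.0
    prev = rx[0] * training[0].conjugate()        # data wipe-off with training data
    for r, a in zip(rx[1:], training[1:]):
        cur = r * a.conjugate()
        m = cmath.phase(cur * prev.conjugate())   # measurement m_k (angle operation)
        p += q_var                                # predict: FO assumed constant
        k = p / (p + r_var)                       # Kalman gain
        x += k * (m - x)                          # update with the innovation
        p *= 1.0 - k
        prev = cur
    return x


def derotate(rx, x_hat):
    """Remove the coarse FO; the result would be the input to the second stage."""
    return [r * cmath.exp(-1j * x_hat * k) for k, r in enumerate(rx)]


def simulate(n=200, fo_hz=1e9, baud=28e9, seed=7):
    """Generate n QPSK training symbols with FO, Wiener phase noise and ASE noise."""
    rng = random.Random(seed)
    ts = 1.0 / baud
    qpsk = [cmath.exp(1j * (math.pi / 4 + i * math.pi / 2)) for i in range(4)]
    phi, tx, rx = 0.0, [], []
    for k in range(n):
        a = rng.choice(qpsk)
        phi += rng.gauss(0.0, 0.005)              # laser phase noise increment
        noise = complex(rng.gauss(0, 0.05), rng.gauss(0, 0.05))
        rx.append(a * cmath.exp(1j * (2 * math.pi * fo_hz * k * ts + phi)) + noise)
        tx.append(a)
    return tx, rx, ts


tx, rx, ts = simulate()
x_hat = coarse_fo_lkf(rx, tx)
fo_hat_hz = x_hat / (2 * math.pi * ts)            # convert rad/symbol to Hz
print(f"estimated FO: {fo_hat_hz / 1e9:.3f} GHz")
```

The `cmath.phase` call is the angle operation that the LKF variant needs. An EKF formulation that works directly on the complex samples avoids it, which is the trade-off noted in the comparison of LKF and EKF.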
This two stage EKF model has been extended in [66] to also compensate for the fiber nonlinearity in addition to FO, phase and amplitude noise. The first stage is unchanged and compensates the FO coarsely, as discussed earlier. In the second stage, the total phase noise to be estimated comprises both laser phase noise and fiber nonlinearities. The EKF-CPANE algorithm is employed for tracking the residual FO and the total phase noise in addition to amplitude noise. From the numerical analysis, it was reported in [66] that, compared to LKF, the maximum possible transmission reach can be increased by an additional 500 km using EKF, at a BER of $2.4 \times 10^{-2}$.

Conclusions
We have discussed in detail how to exploit the potential of Kalman filters for the joint mitigation of several fiber optical transmission impairments in coherent optical transmission systems. Various Kalman based approaches for tracking carrier phase, frequency offset and polarization state have been reviewed. The CPANE algorithm and its implementation using EKF for joint mitigation of linear and nonlinear phase noise as well as amplitude noise have been illustrated in detail. It is also verified that the combination of DBP and EKF enhances the nonlinear mitigation performance, at the expense of a few DBP steps. A cascaded structure using LKF and EKF is illustrated for simultaneously tracking the polarization state and carrier phase besides amplitude noise. A two stage EKF model for simultaneous tracking of FO, phase and amplitude noise is also discussed. From the discussed numerical verifications, it can be concluded that the Kalman filter based approaches for tracking the optical transmission impairments outperform the conventional methods in coherent optical communication systems, with faster convergence, better tracking ability and more tolerance towards the optical transmission impairments. Since the Kalman filter is an optimal recursive MMSE estimator that lends itself to hardware efficient implementation and has low computational effort and memory requirements, it seems to be an essential component of future coherent optical receivers.

Author details
Lalitha Pakala* and Bernhard Schmauss
*Address all correspondence to: lalitha.pakala@fau.de
Institute of Microwaves and Photonics (LHFT) and Erlangen Graduate School for Advanced Optical Technologies (SAOT), University of Erlangen-Nuremberg, Erlangen, Germany