Full-Field Detection with Electronic Signal Processing

The rapid growth in broadband services is increasing the demand for high-speed optical communication systems. However, as the data rate increases, transmission impairments such as chromatic dispersion (CD) become prominent and require careful compensation. In addition, it is proposed that next-generation optical networks will be intelligent and adaptive, with impairment compensation that can be software-defined and re-programmed to adapt to changes in network conditions. This flexibility should allow dynamic resource reallocation, provide greater network efficiency, and reduce operation and maintenance costs. Conventional dispersion compensating fiber (DCF) is bulky and requires careful design for each fiber link, as well as associated amplifiers and monitoring. Recently, advances in high-speed microelectronics, for example 30 GSamples/s analogue-to-digital converters (ADCs) (Ellermeyer et al., 2008), have enabled the application of electronic dispersion compensation (EDC) (Iwashita & Takachio, 1988; Winters & Gitlin, 1990) in optical communication systems at 10 Gbaud and beyond. The maturity of electronic buffering, computation, and large-scale integration enables EDC to be more cost-effective, adaptive, and easier to integrate into transmitters or receivers for extending the reach of legacy multimode optical fiber links (Weem et al., 2005; Schube & Mazzini, 2007) as well as metro and long-haul optical transmission systems (Bulow & Thielecke, 2001; Haunstein & Urbansky, 2004; Xia & Rosenkranz, 2006; Bosco & Poggiolini, 2006; Chandrasekhar et al., 2006; Zhao & Chen, 2007; Bulow et al., 2008). Transmitter-side EDC (McNicol et al., 2005; McGhan et al., 2005 & 2006) exhibits high performance but its adaptation speed is limited by the round-trip delay.
Receiver-side EDC can adapt quickly to changes in link conditions and is of particular value for future transparent optical networks, where the reconfiguration of the add- and drop-nodes will cause the transmission paths to vary frequently. Direct-detection maximum likelihood sequence estimation (DD MLSE) receivers are commercially available and have been demonstrated in various transmission experiments (Farbert et al., 2004; Gene et al., 2007; Alfiad et al., 2008). However, the performance of conventional EDC using direct detection (DD) is limited by the loss of the signal phase information (Franceschini et al., 2007). In addition, square-law detection transforms the linear optical impairments arising from CD into nonlinear impairments, which significantly increases the operational complexity of DD EDC. For example, DD MLSE was numerically predicted to achieve 700 km single mode fiber (SMF) transmission at 10 Gbit/s but required 8192 Viterbi processor states (Bosco & Poggiolini, 2006).

Basic principle of full optical-field detection. Black and brown lines represent optical and electrical signals respectively. AMZI: asymmetric Mach-Zehnder interferometer.
The incoming optical field E(t) is processed by an AMZI with a differential time delay Δt, giving two output fields E_1(t) and E_2(t) that depend on the differential phase of the AMZI, φ_AMZI. E_1(t) and E_2(t) are detected by a pair of photodiodes and electrically amplified to obtain the electrical signals V_1(t) and V_2(t), which follow the square-law detected powers of E_1(t) and E_2(t). If we choose φ_AMZI = π/2 and a small Δt such that E(t−Δt) ≈ E(t), signals proportional to the intensity, instantaneous frequency, and phase of the optical field, V_A(t), V_f(t), and V_p(t), can be extracted by signal processing of V_1(t) and V_2(t). In practice, the asin(·) operation in Fig. 1 can be neglected given φ(t)−φ(t−Δt) ≪ 1, where φ(t) is the phase of the optical field. By recovering the optical intensity and phase, the full optical field can be reconstructed. This full-field reconstruction module may be implemented using analogue (Ellis & McCarthy, 2006) or digital devices. In the latter case, ADCs are used to sample and quantize V_1(t) and V_2(t).
The configuration in Fig. 1 can alternatively be implemented as in Fig. 2, where a pair of balanced photodiodes gives V_1−V_2 and an additional single photodiode obtains V_1+V_2. In the following discussions, we define two new quantities: V_−(t) = V_1(t) − V_2(t) and V_+(t) = V_1(t) + V_2(t).
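The reconstruction chain described above can be illustrated with a short numerical sketch. It is not the authors' implementation: the AMZI port convention (E(t) ± E(t−Δt)·exp(jφ_AMZI))/2, the unit detector gain, the absence of noise, and all function names are assumptions made for illustration. Under these assumptions V_+(t) tracks the intensity and V_−(t)/V_+(t) approximates sin(φ(t)−φ(t−Δt)).

```python
import cmath
import math


def amzi_outputs(field, t, dtd, phi_amzi=math.pi / 2):
    """Return (V1, V2): ideal square-law detection of the two AMZI output ports."""
    e_now = field(t)
    e_delayed = field(t - dtd) * cmath.exp(1j * phi_amzi)
    v1 = abs((e_now + e_delayed) / 2) ** 2
    v2 = abs((e_now - e_delayed) / 2) ** 2
    return v1, v2


def reconstruct_field(field, times, dtd):
    """Rebuild (V_A, V_f, V_p, V_full) at each sample instant in `times`."""
    step = times[1] - times[0]
    phase = 0.0
    samples = []
    for t in times:
        v1, v2 = amzi_outputs(field, t, dtd)
        v_plus, v_minus = v1 + v2, v1 - v2
        # V_-/V_+ approximates sin(phi(t) - phi(t - dtd)); clip against rounding.
        ratio = max(-1.0, min(1.0, v_minus / v_plus))
        v_f = math.asin(ratio) / dtd          # instantaneous angular frequency
        phase += v_f * step                   # running integral gives V_p
        v_full = math.sqrt(v_plus) * cmath.exp(1j * phase)
        samples.append((v_plus, v_f, phase, v_full))
    return samples


# Example: a constant-envelope field with a 1 GHz frequency offset.
f_offset = 1e9
tone = lambda t: cmath.exp(2j * math.pi * f_offset * t)
times = [n * 20e-12 for n in range(50)]       # 50 GSamples/s sampling grid
result = reconstruct_field(tone, times, dtd=10e-12)
```

For this noiseless tone the recovered V_f equals 2π times the frequency offset. The division by V_+(t) is the step that becomes fragile when the received power is low, which motivates the optimizations discussed next.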

Optimization of V_f(t) estimation
Figs. 1 and 2 depict the basic principle of ideal FFD. In practice, additional components are required to overcome the susceptibility to noise and the associated noise amplification mechanisms in the full-field reconstruction process. The degradation mechanisms discussed in this subsection arise in the V_f(t) estimation and can be understood from Eq. (3.2), which shows that φ(t)−φ(t−Δt) is obtained through division by the instantaneous received signal power, V_+(t). This estimation is therefore particularly sensitive to noise at sampling points where the received power is low. In an OOK-based system, where logic data '0' is represented by a zero or low power level, this mechanism results in a pattern effect, with performance degradation for a sequence of consecutive logic data '0's. This may be ameliorated by reducing the transmitter extinction ratio (ER) at the expense of a small ER-induced penalty, so the ER should be optimized to maximize the system performance. The mechanism also applies to other modulation formats. For example, in a phase-shift keying system using a Mach-Zehnder modulator to encode the phase information, the near-zero intensity due to a π phase shift between symbols needs to be controlled, especially in the presence of CD.
In practical pre-amplified optical communication systems, the optical power incident on the photodiodes is usually sufficiently larger than the thermal noise level to ensure that the system is primarily limited by optical amplified spontaneous emission (ASE) noise. However, in FFD, the thermal noise of the photodiodes may play an important role because of the low-intensity regions described above. The impact of thermal noise therefore needs to be considered in the overall design of the full-field detector front end. Considering the thermal noise of the photodiodes, Eq. (3.2) can be rewritten as Eq. (5), where n_th,1 and n_th,2 represent the thermal noise on V_1(t) and V_2(t) in Fig. 1 respectively. It can be seen from Eq. (5) that even when ⟨|E(t−Δt/2)|²⟩, where ⟨·⟩ is the ensemble average, is sufficiently larger than the thermal noise level, the signal-independent thermal noise may have a significant impact on the performance for a logical data '0' or when the phase difference φ(t)−φ(t−Δt) is small. As discussed above, the use of a lower ER can alleviate this problem (and that caused by the ASE noise) at the expense of back-to-back receiver sensitivity. In practice, a DC bias can be added to V_+(t) before the division to mitigate this thermal-noise-induced effect without sacrificing ER. The DC bias increases the value of the denominator in Eq. (5), which, for signal sequences with low optical intensity, significantly reduces the impact of thermal noise, albeit at the expense of a slight distortion in the reconstructed frequency. Note that in most practical systems a DC bias would be required in any case to accommodate the AC coupling of the receiver. Another thermal noise effect is attributed to the numerator in Eq. (5), which is approximately linearly proportional to the phase difference φ(t)−φ(t−Δt) and consequently depends on the differential time delay (DTD) of the AMZI.
By employing an AMZI with a larger DTD, φ(t)−φ(t−Δt) is increased, which improves the signal-to-thermal-noise ratio. Note that Δt cannot be increased indefinitely because the derivation assumes E(t−Δt) ≈ E(t). The DTD of the AMZI should therefore be chosen to balance precise estimation of V_f(t), which favours a small Δt, against tolerance to thermal noise, which favours a large Δt. In practice, a DTD between 20% and 50% of the symbol period gives the best performance.
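The effect of the DC bias on the division step can be illustrated with a small Monte Carlo sketch. This is a toy model rather than the chapter's simulation: independent Gaussian noise terms are added to the numerator and denominator, and the signal level, noise level, and bias value are illustrative assumptions.

```python
import math
import random


def rms_ratio_error(v_plus, true_ratio, noise_std, bias, trials=2000, seed=1):
    """RMS error of (V_- + n1) / (V_+ + n2 + bias) against the noiseless, unbiased ratio."""
    rng = random.Random(seed)
    v_minus = true_ratio * v_plus
    total = 0.0
    for _ in range(trials):
        numerator = v_minus + rng.gauss(0.0, noise_std)
        denominator = v_plus + rng.gauss(0.0, noise_std) + bias
        total += (numerator / denominator - true_ratio) ** 2
    return math.sqrt(total / trials)


low_level = 0.05     # V_+ during a run of '0's, relative to the mark level (assumed)
err_no_bias = rms_ratio_error(low_level, 0.1, noise_std=0.01, bias=0.0)
err_biased = rms_ratio_error(low_level, 0.1, noise_std=0.01, bias=0.1)
```

With these numbers the biased estimate is systematically too small, which is the distortion mentioned above, but its noise-induced spread shrinks by more than the distortion adds, so the overall RMS error is lower than without the bias.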

Optimization of V_p(t) estimation
In addition to ensuring the accuracy of the V_f(t) estimation, the performance of the full optical-field reconstruction also depends on the quality of the phase estimation using V_f(t), which is found to be degraded by low-frequency amplification. To illustrate the origin of this impairment, we may take the Fourier transform to relate the estimated phase V_p(t) to the estimated frequency V_f(t). Because V_p(t) is the time integral of V_f(t), the spectrum of V_p(t) is the spectrum of V_f(t) divided by jω, where ω is the angular frequency (Eq. (6)). It is clear from Eq. (6) that the low-frequency components of V_f(t) dominate the phase reconstruction, with a scaling factor of 1/ω. Therefore, any noise or inaccuracy in the low-frequency components of the estimated V_f(t) will accumulate and eventually limit the performance. This suggests that the low-frequency components of V_f(t) should be minimized, which can be achieved by using a high-pass electrical filter before the integrator in Figs. 1 and 2. Fig. 3 depicts the full configuration of full-field reconstruction. A DC bias is added to the detected signal intensity V_+(t) to accommodate the AC coupling of the receiver and to enhance the robustness to thermal noise. The amplitude of the optical field is obtained as the square root of the re-biased version of V_+(t). A high-pass filter is placed in the phase estimation path to suppress the low-frequency amplification. The optical phase V_p(t) is obtained by integrating the frequency V_f(t) and is used to reconstruct the full optical field V_full(t) using an exponential function and a multiplier.
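The low-frequency amplification can be seen in a few lines of discrete-time code. The sketch below integrates white noise, standing in for estimation error on V_f(t), with and without a high-pass filter in front of the integrator. The first-order filter and its pole value are assumptions chosen for brevity; the simulations in this chapter use a Gaussian-shaped high-pass filter instead.

```python
import random


def integrate(samples, step=1.0):
    """Running sum, i.e. a discrete-time integrator (gain grows as 1/omega)."""
    total, out = 0.0, []
    for x in samples:
        total += x * step
        out.append(total)
    return out


def high_pass(samples, pole=0.95):
    """First-order high-pass: y[n] = pole * (y[n-1] + x[n] - x[n-1])."""
    prev_x, prev_y, out = 0.0, 0.0, []
    for x in samples:
        prev_y = pole * (prev_y + x - prev_x)
        prev_x = x
        out.append(prev_y)
    return out


def variance(values):
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


rng = random.Random(7)
freq_noise = [rng.gauss(0.0, 1.0) for _ in range(5000)]    # noise on V_f only
phase_raw = integrate(freq_noise)                  # random walk: unbounded drift
phase_filtered = integrate(high_pass(freq_noise))  # bounded phase error
```

Cascading this high-pass filter with the integrator is equivalent to a leaky integrator, so the phase error stays bounded instead of growing like a random walk. The price is that genuine low-frequency content of V_f(t) is also removed, which is the distortion trade-off examined in the numerical results.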

Electronic signal processing techniques for dispersion compensation
The recovered optical field allows for subsequent dispersion compensation using electrical-domain signal processing techniques, as shown in Fig. 3. The full-field approaches offer a compromise between conventional cost-effective DD EDC, which lacks phase information and is thus limited in performance, and coherent-detection based EDC (see Fig. 4), which has better performance but is expensive, requiring a narrow linewidth laser, two 90° hybrids, four pairs of balanced photodiodes and four ADCs. Although FFD needs additional full-field reconstruction, it avoids the complicated estimation of the frequency offset, polarization and phase difference between the signal and the local oscillator that is required in coherent detection. The complexity of the dispersion compensation module in FFD is also comparable to that in coherent detection.
Fig. 3. Full configuration of full-field reconstruction with a bias added to V_+(t) to mitigate the thermal noise effect and a high-pass filter in the phase estimation path to suppress low-frequency amplification.

Frequency-domain equalization
By recovering the full optical field, the linearity of the channel is preserved and linear impairments such as CD can be compensated simply by applying their inverse transfer function. For large values of accumulated dispersion, this is optimally applied in the frequency domain. For an analogue signal, CD would be compensated using a dispersive microstrip line (McCarthy & Ellis, 2007). In the digital domain (see Fig. 5(a)), this is implemented using frequency-domain equalization: the recovered optical field V_full(t) is converted into parallel blocks, the fast Fourier transform (FFT) of each block is taken, the transformed spectrum is multiplied by the inverse transfer function of CD, the inverse fast Fourier transform (IFFT) is taken, and the blocks are converted back into a compensated serial time-domain signal. Each block overlaps in time with its adjacent blocks to provide the guard interval for CD compensation (see the inset of Fig. 5(a)), whose length should be longer than the memory length of the channel intersymbol interference (ISI). The inverse transfer function of CD is a pure phase response, quadratic in ω and scaled by β_2 z, where β_2 z is the accumulated CD value and ω is the angular frequency. With this technique, only β_2 z needs to be controlled, and it is expected to match the CD of the actual fiber link.
Inset of Fig. 5(a): block-based processing of frequency-domain equalization, with overlaps between adjacent blocks as the guard interval.
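The block-based procedure can be sketched as follows. The code is a minimal illustration, not the chapter's implementation: it uses a direct DFT instead of an FFT to stay dependency-free, assumes the sign convention exp(jβ_2 z ω²/2) for the fiber and its conjugate for the equalizer, and the block length, overlap, and function names are illustrative choices.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct O(N^2) DFT; adequate for a short illustrative block."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(x[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
        out.append(acc / n if inverse else acc)
    return out


def cd_phase(n, fs, beta2z, k):
    """Quadratic spectral phase beta2z * omega^2 / 2 for DFT bin k."""
    f = (k if k < n // 2 else k - n) * fs / n
    return 0.5 * beta2z * (2 * math.pi * f) ** 2


def apply_cd(block, fs, beta2z):
    """Multiply the block spectrum by exp(j * beta2z * omega^2 / 2)."""
    n = len(block)
    spectrum = dft(block)
    shaped = [s * cmath.exp(1j * cd_phase(n, fs, beta2z, k)) for k, s in enumerate(spectrum)]
    return dft(shaped, inverse=True)


def equalize_stream(samples, fs, beta2z, block_len=32, overlap=16):
    """Overlap-save: equalize overlapping blocks and keep only each block's centre."""
    hop, half = block_len - overlap, overlap // 2
    padded = [0j] * half + list(samples) + [0j] * block_len
    out = []
    for start in range(0, len(samples), hop):
        block = padded[start:start + block_len]
        # The inverse transfer function is the same phase with opposite sign.
        out.extend(apply_cd(block, fs, -beta2z)[half:half + hop])
    return out[:len(samples)]
```

Calling `apply_cd(apply_cd(block, fs, b), fs, -b)` returns the original block to numerical precision, which is the sense in which only β_2 z has to be known. In `equalize_stream` the discarded edges of each block play the role of the guard interval, so the overlap must exceed the ISI memory for the kept samples to be free of wrap-around error.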

Full-field detection feed-forward equalization (FFD FFE)
For small CD values, where the memory length of the intersymbol interference is not large, the compensation may be implemented using a time-domain filter with an appropriate response. One implementation is a finite impulse response filter comprising a cascade of tapped and weighted delay lines, known as a feed-forward equalizer (FFE). FFE is suitable for implementation using either analogue (Haunstein & Urbansky, 2004) or digital circuits. For the digital implementation (see Fig. 5(b)), the digital signal representing the recovered field V_full(t), V_full(t_i), is processed to give the estimated sequence b_n as a weighted sum of the samples within the filter span, where f_(i/N), −Nm/2 ≤ i ≤ Nm/2, are the FFE coefficients, N is the number of samples per bit, and m is the memory length. The tap weights f_(i/N) are updated by comparing the values of the estimates with the values after decision or with the training sequence data (Proakis, 2000), with a step-size parameter that controls the update speed. Here a_n is the n-th decoded data and is replaced by the training sequence during initial channel estimation.
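A minimal sketch of such an adaptive FFE is given below. It assumes the least-mean-square (LMS) update, which is one standard way to minimize the mean square decision error; the source cites Proakis (2000) for the update rule, whose exact form is not reproduced here. One sample per bit, the toy ISI channel, the tap count, and the step size are all illustrative assumptions.

```python
import random


def ffe_output(taps, window):
    """Weighted sum of the samples currently inside the tapped delay line."""
    return sum(f * x for f, x in zip(taps, window))


def lms_train(received, training, num_taps=7, step=0.05):
    """Adapt FFE taps with the LMS rule; returns (taps, squared errors)."""
    half = num_taps // 2
    taps = [0j] * num_taps
    taps[half] = 1 + 0j                      # centre-spike initialisation
    padded = [0j] * half + list(received) + [0j] * half
    sq_errors = []
    for n, a_n in enumerate(training):
        window = padded[n:n + num_taps]      # samples n-half .. n+half
        b_n = ffe_output(taps, window)
        err = a_n - b_n
        # LMS: move each tap along err * conj(sample) to reduce |err|^2.
        taps = [f + step * err * x.conjugate() for f, x in zip(taps, window)]
        sq_errors.append(abs(err) ** 2)
    return taps, sq_errors


# Toy channel (assumption): each OOK symbol leaks 40% into the next one.
rng = random.Random(1)
bits = [rng.randint(0, 1) for _ in range(4000)]
received = [complex(bits[n] + (0.4 * bits[n - 1] if n else 0.0)) for n in range(len(bits))]
taps, sq_errors = lms_train(received, bits)
early_mse = sum(sq_errors[:200]) / 200
late_mse = sum(sq_errors[-200:]) / 200
```

Here the known bits serve as the training sequence throughout. In decision-directed operation, `a_n` would be replaced by the thresholded value of `b_n` once the taps have converged. The taps are complex so that the same code applies to the complex samples of V_full(t_i).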

Fig. 5(c) block labels: metric computation, Viterbi decoding, channel estimation.

Full-field detection maximum likelihood sequence estimation (FFD MLSE)
In MLSE, rather than compensating for the CD-induced distortion prior to symbol decision, the DSP circuit builds up a "channel model" representing the expected received waveforms for a complete set of transmitted sequences. These stored waveforms are then compared to the actual received waveform, and the sequence that most likely results in the received waveform is selected. In practice, the "channel model" can be simplified by assuming a finite channel memory length, and the comparison may be performed using a recursive algorithm proposed by Viterbi (Proakis, 2000). In FFD MLSE (see Fig. 5(c)), both the real and the imaginary information are exploited for building up the "channel model" (formally known as channel training) and for calculating the probability that the received waveform matches one of the stored waveforms (formally known as metric computation). Mathematically, the metric of FFD MLSE, PM(a_n), is calculated using Eq. (10), where i indexes the samples associated with bit n (i ∈ {n, n+1/2} for two samples per bit), and Re(·) and Im(·) represent the real and imaginary components. a_n and p(Re(V_full(t_i)), Im(V_full(t_i)) | a_(n−m),…,a_n) are the n-th logical data and the two-dimensional joint probability of the full optical field at time t_i given the logical data a_(n−m),…,a_n, respectively. m is the memory length. The initial joint probabilities are obtained using either a histogram or a parametric method. In the histogram method, a lookup table is established for p(Re(V_full(t_i)), Im(V_full(t_i)) | a_(n−m),…,a_n), with the table size proportional to 2^(2q+m+2) at a sampling rate of two samples per bit, where q is the ADC resolution. The complexities of the metric computation and the Viterbi decoding are the same as those of a DD MLSE with the same number of states, and are proportional to 2^(m+1) and 2^m respectively.
In practice, the full metric (Eq. (10)) can be approximated by assuming that the probability distributions of the real and imaginary signals are independent, giving a reduced metric (Eq. (11)). This simplification causes only a slight performance penalty when used with optimized system parameters, but significantly reduces the required lookup table size, and the time for lookup table setup and update, from 2^(2q+m+2) to 2^(q+m+3).
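The saving in lookup table size can be made concrete with a short calculation. One plausible reading of the exponents, stated here as an interpretation, is: two samples per bit, 2^(m+1) bit patterns, and 2^q quantization levels per dimension, with two dimensions in the joint case and two separate one-dimensional tables in the reduced case. The ADC resolution q = 5 is an illustrative assumption, not a value taken from the source.

```python
def full_metric_table_size(q, m):
    """Entries scale as 2^(2q + m + 2): joint real/imaginary histogram."""
    return 2 ** (2 * q + m + 2)


def reduced_metric_table_size(q, m):
    """Entries scale as 2^(q + m + 3): two separate marginal histograms."""
    return 2 ** (q + m + 3)


q = 5   # illustrative ADC resolution in bits (assumption)
for m in (2, 4):
    full = full_metric_table_size(q, m)
    reduced = reduced_metric_table_size(q, m)
    print(f"m={m}: full={full}, reduced={reduced}, ratio={full // reduced}")
```

The ratio between the two sizes is 2^(q−1) regardless of m, so the saving grows with the ADC resolution rather than with the channel memory.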
Lookup-table-based histogram channel estimation is precise and, to a certain extent, able to mitigate nonlinear impairments that distort the signal in a deterministic manner. However, the training sequence required to obtain the lookup table may be long. On the other hand, parametric channel estimation, where the lookup table is obtained by assuming a distribution for the received samples and calculating a few basic parameters of that distribution, can greatly improve the adaptation speed. By recovering the full optical field, p(Re(V_full(t_i)) | a_(n−m),…,a_n) and p(Im(V_full(t_i)) | a_(n−m),…,a_n) can each be approximated by a Gaussian distribution, specified by the mean and the variance of the real tributary (subscript r) or the imaginary tributary (subscript q) of the signal at sample i, all dependent on the logical data a_(n−m),…,a_n.
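A sketch of the parametric approach is shown below. It assumes independent Gaussian statistics for the real and imaginary tributaries, as in the reduced metric, and works in the log domain so that probabilities add instead of multiply. The function names and the tiny training sets are hypothetical and only serve to show the two steps, parameter estimation and metric evaluation.

```python
import math


def gaussian_log_pdf(x, mean, variance):
    """Logarithm of the Gaussian probability density at x."""
    return -0.5 * math.log(2 * math.pi * variance) - (x - mean) ** 2 / (2 * variance)


def estimate_params(observations):
    """Sample mean and variance of the real and imaginary parts (training step)."""
    n = len(observations)
    mean_r = sum(v.real for v in observations) / n
    mean_q = sum(v.imag for v in observations) / n
    var_r = sum((v.real - mean_r) ** 2 for v in observations) / n
    var_q = sum((v.imag - mean_q) ** 2 for v in observations) / n
    return mean_r, var_r, mean_q, var_q


def branch_metric(samples, params):
    """Log-domain reduced metric: sum of real and imaginary marginal log-likelihoods.

    samples: complex V_full values belonging to one bit (e.g. two per bit).
    params:  per-sample tuples (mean_r, var_r, mean_q, var_q) for one bit pattern.
    """
    total = 0.0
    for v, (mean_r, var_r, mean_q, var_q) in zip(samples, params):
        total += gaussian_log_pdf(v.real, mean_r, var_r)
        total += gaussian_log_pdf(v.imag, mean_q, var_q)
    return total


# Hypothetical training observations for two different bit patterns.
pattern_a = estimate_params([0.1 + 0.0j, 0.2 + 0.1j, 0.0 - 0.1j, 0.1 + 0.0j])
pattern_b = estimate_params([1.0 + 0.5j, 0.9 + 0.4j, 1.1 + 0.6j, 1.0 + 0.5j])
received = [0.95 + 0.45j]
best = max((pattern_a, pattern_b), key=lambda p: branch_metric(received, [p]))
```

Only four numbers per sample and per bit pattern need to be trained, which is why the parametric method adapts faster than filling a histogram. In a full receiver these branch metrics would feed a Viterbi decoder rather than a single `max`.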

Numerical analysis for 10 Gbit/s FFD-based OOK systems
Fig. 6 shows the simulation model implemented using Matlab. Continuous wave light was intensity modulated by a 10 Gbit/s OOK data train using a Mach-Zehnder modulator (MZM). The data train consisted of a 2^11−1 pseudo-random binary sequence (PRBS) repeated nine times (18,423 bits). Ten '0' bits and eleven '0' bits were added before and after this data train respectively to simplify the boundary conditions. The bits were raised-cosine shaped with a roll-off coefficient of 0.4 and had 40 samples per bit. The extinction ratio (ER) of the modulated OOK signal was set by adjusting the bias and the amplitude of the electrical OOK data. The signal was launched into the transmission link with 80 km SMF per span and -3 dBm signal power.
The SMF had CD of 16 ps/km/nm, a nonlinear coefficient of 1.2/km/W, and a loss of 0.2 dB/km. The split-step Fourier method was used to calculate the signal propagation in the fibers. At the end of each span, noise from Erbium-doped fiber amplifiers (EDFA) was modelled as complex additive white Gaussian noise with zero mean and a power spectral density of n_sp·hν·(G−1) for each polarization, where G and hν are the amplifier gain and the photon energy respectively. n_sp is the population inversion factor of the amplifiers and was set to give a 4 dB amplifier noise figure (NF). The noise of the optical preamplifier was also modelled as additive white Gaussian noise with random polarization. The launch power into the preamplifier was adjusted by a variable optical attenuator (VOA) to control the optical signal-to-noise ratio (OSNR). The pre-amplified signal was filtered by an 8.5 GHz Gaussian-shaped optical band-pass filter (OBPF), unless otherwise stated. The signal after the OBPF was then split into two paths to extract V_−(t) and V_+(t). The AMZI for the extraction of V_−(t) had a π/2 differential phase shift and a differential time delay (DTD) of either 10 ps or 30 ps. The responsivities of the balanced photodiodes and the direct photodiode were assumed to be 0.6 A/W and 0.9 A/W respectively, and the equivalent thermal noise spectral densities were assumed to be 100 pA/Hz^(1/2) and 18 pA/Hz^(1/2) respectively. These parameters match typical values of commercially available detectors. The optical power incident on the photodiodes was 0 dBm. After detection, the signals were electrically amplified, filtered by 15 GHz 4th-order Bessel electrical filters (EFs), and down-sampled to 50 GSamples/s to simulate the sampling effect of a real-time oscilloscope. V_+(t) was re-biased to allow for the AC coupling of the receiver and to enhance the robustness to thermal noise. The high-pass EF used to suppress the low-frequency amplification was Gaussian-shaped.
V_−(t) and V_+(t) were used to reconstruct the optical signal, which was subsequently compensated using frequency-domain equalization, FFD FFE and FFD MLSE. The simulation was iterated seven times with different random number seeds to give a total of 128,961 simulated bits. The performance was evaluated in terms of the required OSNR (0.1 nm resolution) to achieve a bit error rate (BER) of 5×10^−4 by direct error counting. 128,961 bits were sufficient to produce a confidence interval of [3.5×10^−4, 7×10^−4] for this BER with 99% certainty (Jeruchim, 1984).
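The bit count and the order of magnitude of the confidence interval can be cross-checked with a few lines of arithmetic. The sketch below uses a normal approximation to the binomial error count, which is a simplification and not the method of Jeruchim (1984), so it only approximates the interval quoted above.

```python
import math

prbs_length = 2 ** 11 - 1                 # 2047 bits
bits_per_run = prbs_length * 9            # PRBS repeated nine times
total_bits = bits_per_run * 7             # seven random-number seeds
target_ber = 5e-4

expected_errors = total_bits * target_ber
# Normal approximation to the binomial error count, 99% two-sided (z ~ 2.576).
spread = 2.576 * math.sqrt(expected_errors * (1 - target_ber))
ber_low = (expected_errors - spread) / total_bits
ber_high = (expected_errors + spread) / total_bits
print(total_bits, round(expected_errors, 1), ber_low, ber_high)
```

About 64 errors are expected at the target BER, and the approximate interval of roughly 3.4×10^−4 to 6.6×10^−4 is close to the quoted [3.5×10^−4, 7×10^−4].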

System design based on frequency-domain equalization
In this subsection, simulations are performed to verify the important design rules as described in Section 2. The results are based on frequency-domain equalization, but the developed guidelines also apply to FFD FFE and FFD MLSE.

Optimization of optical-field reconstruction
As discussed in Section 2, it is essential to optimize the system to ensure the quality of optical-field reconstruction. Fig. 7 shows the eye diagrams of the signal at a fiber length of 2160 km using frequency-domain equalization. In these figures, thermal noise and fiber nonlinearity were not included. For a larger ER (Fig. 7(a)), the received value of V_+(t) for a sequence of consecutive logical data '0's was so small that any optical noise led to a large estimation inaccuracy in V_f(t). This inaccuracy contained significant low-frequency content, which was further increased by the low-frequency amplification mechanism. By using a smaller ER, the value of V_+(t) for a sequence of consecutive logical data '0's was increased, reducing the estimation inaccuracy of V_f(t) and resulting in better compensation performance, as shown in Fig. 7(b). The high-pass EF in the phase estimation path further reduced the low-frequency components of V_f(t). As a result, the compensated signal after 2160 km shown in Fig. 7(c) has a significantly clearer eye than those in Fig. 7(a) and (b). To quantify the performance improvement, Fig. 8(a) depicts the required OSNR for these three cases. The figure shows that by using 12 dB ER and a 0.85 GHz high-pass EF, the OSNR transmission limit could be significantly extended, despite the back-to-back penalty arising from the reduced ER. At a system length of 2160 km, the required OSNR was around 13.7 dB. It should be noted that whilst the high-pass EF suppressed the impairment from low-frequency amplification, it also introduced distortion to the estimated frequency V_f(t). This distortion resulted in the rails of the eye diagrams in Fig. 7(c) being somewhat thicker than those in Fig. 7(b). Clearly, a trade-off exists between the impairment from low-frequency amplification and the distortion. At an ER of 12 dB and 2160 km, the optimized bandwidth of the EF was around 0.85 GHz, as shown in Fig. 8(b).
Fig. 9(a) shows the required OSNR without (circles) and with (triangles) fiber nonlinearity and the maximum achievable OSNR (squares) for 80 km SMF and -3 dBm signal launch power per span. The ER was 12 dB and a 0.85 GHz Gaussian-shaped high-pass EF was employed. The DTD of the AMZI and the bias of V_+(t) were assumed to be 10 ps and 0 V respectively, and photodiode thermal noise was neglected. The figure shows that including fiber nonlinearity in the transmission simulation resulted in an additional penalty of up to 1.8 dB for system lengths less than 2160 km. On the other hand, the maximum achievable OSNR degraded as the fiber length increased, and was 30.2 dB and 20.6 dB for 240 km and 2160 km respectively. At 2160 km, the maximum achievable OSNR was more than 5 dB greater than the required value. Fig. 9(a) shows the required OSNR when the thermal noise contribution of the receiver was neglected, representing the maximum achievable performance. However, as discussed in Section 2, receiver thermal noise can significantly influence the performance of the FFD EDC schemes. Fig. 9(b) shows the required OSNR as a function of system length without thermal noise (circles) and with thermal noise for various values of the key parameters (the applied DC bias offset, normalized to M, the average detected signal amplitude, and the AMZI DTD). The figure shows that thermal noise may limit the transmission distance to less than 240 km. By employing an AMZI with a larger DTD and biasing the detected intensity signal V_+(t), the tolerance to thermal noise was significantly increased, which was attributed to the improvement in the V_f(t) estimation.
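The quoted maximum achievable OSNR values can be cross-checked against the standard link-budget estimate for a chain of identical amplified spans. This estimate is not taken from the source: it assumes that every EDFA exactly compensates its span loss, that ASE from the in-line amplifiers dominates, a carrier wavelength of 1550 nm, and a 12.5 GHz (0.1 nm) reference bandwidth.

```python
import math

H_PLANCK = 6.62607015e-34      # J s
C_LIGHT = 299792458.0          # m/s


def max_osnr_db(length_km, span_km=80.0, launch_dbm=-3.0, loss_db_per_km=0.2,
                nf_db=4.0, wavelength_m=1550e-9, ref_bw_hz=12.5e9):
    """OSNR after N identical spans: P_launch - h*nu*B_ref - NF - span loss - 10log10(N)."""
    spans = round(length_km / span_km)
    photon_energy = H_PLANCK * C_LIGHT / wavelength_m
    ref_noise_dbm = 10 * math.log10(photon_energy * ref_bw_hz / 1e-3)
    span_loss_db = loss_db_per_km * span_km
    return launch_dbm - ref_noise_dbm - nf_db - span_loss_db - 10 * math.log10(spans)


for length in (240, 2160):
    print(length, "km:", round(max_osnr_db(length), 1), "dB")
```

With the simulation parameters given earlier (80 km spans, 0.2 dB/km loss, 4 dB NF, -3 dBm launch power), this estimate gives about 30.2 dB at 240 km and 20.6 dB at 2160 km, in line with the values read from Fig. 9(a).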

Impacts of optical band-pass filter bandwidth
The final parameter to be optimized is the OBPF bandwidth. Fig. 10 shows the required OSNR versus the OBPF bandwidth at 6 dB and 12 dB ER for 960 km. The V_+(t) bias was 0.1M and the AMZI DTD was 30 ps. A 0.85 GHz high-pass EF was used to suppress low-frequency amplification. The figure shows that when the ASE noise was not sufficiently suppressed (bandwidth > 0.25 nm), a system with 6 dB signal ER performed better than one with 12 dB ER. This matches a recent experimental demonstration and arises because a lower ER effectively reduces the noise amplification arising from the division by the total received power and the low-frequency amplification in phase estimation. However, if the ASE noise was sufficiently suppressed (bandwidth < 0.25 nm), the benefit of reducing the ER was reduced. Clearly, the optimal performance depended on a balance between mitigating the noise amplification and the penalty induced by a lower ER. The optimal filter bandwidth for both 6 dB and 12 dB ER was 0.07 nm (~8.5 GHz).
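The correspondence between bandwidths quoted in nm and in GHz follows from Δf = c·Δλ/λ². The short conversion below assumes a carrier wavelength of 1550 nm, which is not stated in this section.

```python
C_LIGHT = 299792458.0   # m/s


def bandwidth_nm_to_ghz(bandwidth_nm, wavelength_nm=1550.0):
    """Convert a small optical bandwidth in nm to GHz: df = c * dlambda / lambda^2."""
    return C_LIGHT * (bandwidth_nm * 1e-9) / (wavelength_nm * 1e-9) ** 2 / 1e9


optimum_ghz = bandwidth_nm_to_ghz(0.07)     # optimal OBPF bandwidth
crossover_ghz = bandwidth_nm_to_ghz(0.25)   # ER crossover point in Fig. 10
```

Under this assumption 0.07 nm corresponds to roughly 8.7 GHz, consistent with the approximate 8.5 GHz figure, and the 0.25 nm crossover corresponds to about 31 GHz.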

FFD feed-forward equalizer and comparison to frequency-domain equalization
The design rules developed above are also applicable to FFD FFE and FFD MLSE. In this subsection, the performance of adaptive FFD FFE is numerically investigated and compared to the static frequency-domain equalization. The system parameters are set to the optimal values obtained in Section 4.1, specifically 12 dB ER, 8.5 GHz OBPF bandwidth, 30 ps AMZI DTD, and 0.1M V_+(t) bias. Fiber nonlinearity and thermal noise are included.
Fig. 10. Required OSNR versus bandwidth of the OBPF at 6 dB ER (triangles) and 12 dB ER (circles) for 960 km. Fiber nonlinearity and thermal noise are included.
Fig. 11(a) shows the performance versus fiber length using frequency-domain equalization manually set to compensate 100% of the accumulated CD for each distance value (circles) and adaptive FFD FFE with the FFE coefficients initially set to compensate 1080 km CD (triangles). Solid and dashed lines represent the cases using 2 and 5 samples/bit ADCs. The performances of FFD FFE for five and two samples per bit were almost the same (<0.2 dB difference), so only the curve for two samples per bit was plotted. The memory length of the FFE was 32 bits (64 taps at 2 samples per bit). An OSNR penalty of more than 5 dB was observed at 2160 km for the frequency-domain equalization when using the reduced sampling rate. This penalty was due not only to the increased calculation inaccuracy during field reconstruction at a lower sampling rate but also to the aliasing effect of the ADCs, such that the distortion imposed by CD could not be fully compensated in the digital domain by a fixed filter with the exact inverse transfer function of the CD. In contrast, FFD FFE automatically searched for the optimal condition to minimize the distortion even with the aliasing effect. Consequently, it was more robust to the reduction of the sampling rate.
Note that because it can compensate ISI regardless of the source, FFD FFE also mitigated other distortions, such as the distortion induced by the high-pass EF used in the phase estimation path. To illustrate this, Fig. 11(b) shows the required OSNR versus the bandwidth of the high-pass EF at 2160 km for frequency-domain equalization (circles) and adaptive FFD FFE (triangles). The figure shows that the performance of FFE was degraded when the system was dominated by the low-frequency amplification (<0.5 GHz). Consequently, a high-pass EF with sufficient bandwidth was required. However, too wide a filter bandwidth would result in distortion. When using frequency-domain equalization, the filter bandwidth should be carefully optimized to balance the low-frequency amplification and the distortion. In contrast, FFD FFE was robust to such distortion, resulting in improved performance and a wider tolerance range.
In practice, unless precise clock recovery is performed, the common sampling phase of the two paths will drift throughout the eye. The misalignment of the sampling phase, t_0, can be viewed as a filter with a transfer function of exp(−jωt_0), where t_0 is unknown and might vary slowly with time. Adaptive filters such as FFE can track and mitigate such distortion. By minimizing the mean square value of the decision error, the coefficients of FFD FFE may be self-adjusted to construct a transfer function equal to the product of the inverse transfer functions of CD, sampling phase misalignment, and the remaining ISI effects. Fig. 12(a) shows the simulated performance versus the sampling phase misalignment using frequency-domain equalization (circles) and FFD FFE (triangles). FFD FFE used 2 samples/bit whilst the frequency-domain equalization used 5 samples/bit. The figure shows that at 2160 km, FFD FFE exhibited negligible penalty for sampling phases within [−50 ps, 50 ps], and was much more robust than the static frequency-domain equalization. The adaptation speed of FFD FFE is illustrated in Fig. 12(b), which shows the performance as a function of the training sequence length. The initial FFE coefficients were set to compensate 1080 km CD. The FFE coefficients converged rapidly from the initial values to the optimal values during the first 10,000 bits (corresponding to 1 µs at 10 Gbit/s), and became steady thereafter. This suggests the potential of FFD FFE for applications in transparent optical networks, where the reconfigurability of the add- and drop-nodes causes the transmission paths to vary rapidly.

FFD MLSE and comparison to DD MLSE, FFD FFE, and frequency-domain equalization
In Section 4.2, we verified that the well-known advantages of adaptive filters over fixed filters were clearly applicable to FFD EDC schemes, especially when the fixed filters were only designed to account for a restricted set of impairments. These advantages, including improved tolerance to a wide high-pass filter bandwidth and to sampling phase misalignment, also apply to MLSE. In this subsection, we discuss the performance of FFD MLSE and compare it with DD MLSE and other FFD-based EDC schemes, in particular the adaptive FFD FFE. Fig. 13 shows the required OSNR as a function of fiber length using conventional DD MLSE (circles), FFD MLSE without a high-pass filter (triangles), and FFD MLSE with a 1.25 GHz high-pass EF (squares). The full metric (Eq. (10)) was used and other system parameters were set to the optimal values obtained in Section 4.1. Fiber nonlinearity and thermal noise were included. It was found that FFD MLSE, without proper suppression of low-frequency amplification, performed worse than conventional DD MLSE (Bosco & Poggiolini, 2006) regardless of the memory length m. However, by optimizing the low-frequency response for the estimated frequency (squares), performance was significantly improved. This strongly suggests that systems based on FFD MLSE can offer greater reach than DD MLSE for both 4- and 16-state implementations. At an OSNR of 15 dB, the CD tolerance was enhanced from 270 km to 420 km, and from 400 km to 580 km, for m of 2 and 4 respectively, representing approximately 50% performance improvement. More importantly, for optical networks with fixed transmission reach, FFD can greatly reduce the MLSE complexity when compared to DD, e.g. from 16 states to 4 states to achieve 400 km. This improvement over DD MLSE has recently been verified experimentally, where 4- and 16-state FFD MLSE were demonstrated to support 372 km and 496 km of BT Ireland's field-installed SMF respectively.
As discussed in Section 3, the full metric (Eq. (10)) can be approximated by computing the two marginal probabilities instead of a single joint probability. Fig. 14 compares the required OSNR obtained using this reduced metric (11) (circles) and the full metric (10) (triangles) when the memory length m is (a) 2 and (b) 4. The figures clearly show that the high-pass filter was critical for the optimum operation of FFD MLSE using metric (11). This is because when the system was dominated by low-frequency amplification, a correlation in the noise statistics of the extracted real and imaginary components might be expected, breaking the independence assumption leading to metric (11). In contrast, optimization of the low-frequency response enabled metric (11) to exhibit similar compensation performance to metric (10), with the advantage of a significant reduction in complexity.
Having established that FFD MLSE based on the reduced metric (Eq. (11)) outperforms DD MLSE provided that an appropriate high-pass filter is employed in the phase estimation path, Fig. 15(a) compares the performance of two FFD-based adaptive compensation schemes, MLSE and FFE, using the optimized system parameters as discussed in Section 4.1.
It is clearly seen that 16-state (m=4) FFD MLSE performed better than FFD FFE with the same memory length, but supported a shorter compensation distance than FFD FFE with increased memory lengths of m=8 and 16. Increasing the memory length of FFD MLSE can overcome this limitation but would increase the complexity exponentially, hindering its application to longer-distance transmission. However, for DCF-free metro networks with distances of around several hundred kilometers, FFD MLSE is a more effective approach. The reason is threefold. Firstly, it requires low implementation complexity for small m values (≤4), which is achievable with modern microelectronic technologies. Secondly, it exhibits a better performance limit than FFD FFE with the same m. Finally, FFD MLSE has much better tolerance to the noise and the associated noise amplification mechanisms in full-field reconstruction. Fig. 15(b) shows that when the system parameters were not fully optimized, the performance of FFD FFE degraded severely and was poorer than that of 16-state FFD MLSE even when m was increased to 16. The curve for frequency-domain equalization (dotted line) is also shown and exhibited the worst performance. Consequently, FFD FFE and frequency-domain equalization place stringent limits on the design of system parameters, whereas FFD MLSE greatly relaxes them. This conclusion matches a recent experimental demonstration.
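The exponential growth in MLSE complexity, and the reach figures quoted earlier for an OSNR of 15 dB, can be checked with a few lines of arithmetic. The sketch below assumes binary signalling, so that a memory length m corresponds to 2^m Viterbi states (consistent with 4 states for m=2 and 16 states for m=4). The function names are hypothetical.

```python
def viterbi_states(m: int) -> int:
    """Number of Viterbi states for binary signalling with memory length m."""
    return 2 ** m

def reach_gain_percent(dd_km: float, ffd_km: float) -> float:
    """Relative increase in CD-limited reach of FFD MLSE over DD MLSE."""
    return 100.0 * (ffd_km - dd_km) / dd_km

# Reach values at 15 dB OSNR as quoted in the text: m -> (DD MLSE, FFD MLSE).
REACH_AT_15_DB_KM = {2: (270.0, 420.0), 4: (400.0, 580.0)}

if __name__ == "__main__":
    for m, (dd, ffd) in REACH_AT_15_DB_KM.items():
        gain = reach_gain_percent(dd, ffd)
        print(f"m={m}: {viterbi_states(m)} states, reach gain {gain:.1f}%")
```

The two gains bracket the approximately 50% improvement stated in the text. Each additional unit of memory doubles the state count, which is why extending MLSE to much longer distances quickly becomes impractical while the FFE tap count grows only roughly linearly.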

Discussion
Frequency-domain equalization is simple and cost-effective, but requires prior knowledge of the dispersion experienced during transmission. Although algorithms have been proposed for coefficient adaptation, this technique usually requires serial-to-parallel conversion into blocks for the (inverse) Fourier transform (see Fig. 5), which reduces the adaptation capability. On the other hand, FFD FFE improves the adaptation capability and can also equalize other linear impairments in addition to CD. Its complexity is approximately linearly proportional to the transmission distance and, for long-distance applications, is higher than that of the frequency-domain equalization method. Finally, the complexity of FFD MLSE increases exponentially with the transmission distance, hindering its application to long-distance transmission. However, for DCF-free metro networks with transmission reach <500 km, FFD MLSE is a more effective approach. For long-distance applications (>500 km), the combination of static frequency-domain equalization and adaptive FFD MLSE based on parametric channel estimation can balance complexity, performance, and adaptation speed well, and is investigated in the next section.
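To make the idea of parametric channel estimation concrete, the following sketch shows one common Gaussian form of it: a mean and a variance are estimated from a training sequence for every bit pattern (a_{n-m}, ..., a_n), and the branch metric is the negative logarithm of the corresponding Gaussian density. Whether this matches Eq. (12) term by term is an assumption. The toy real-valued channel, its taps `h`, and the noise level `sigma` are hypothetical and serve only to exercise the functions. In an FFD receiver a table of this kind would be kept for each extracted component.

```python
import numpy as np

def branch_index(bits, m):
    """Integer label of (a[n-m], ..., a[n]) for every n >= m, oldest bit first."""
    bits = np.asarray(bits, dtype=np.int64)
    idx = np.zeros(len(bits) - m, dtype=np.int64)
    for k in range(m + 1):
        idx = (idx << 1) | bits[k:len(bits) - m + k]
    return idx

def train_gaussian(samples, bits, m):
    """Estimate a mean and a variance per bit pattern from a training sequence."""
    idx = branch_index(bits, m)
    obs = np.asarray(samples, dtype=float)[m:]
    n_patterns = 2 ** (m + 1)
    mean = np.zeros(n_patterns)
    var = np.ones(n_patterns)
    for s in range(n_patterns):
        sel = obs[idx == s]
        if sel.size > 1:          # patterns never seen keep the default values
            mean[s] = sel.mean()
            var[s] = sel.var() + 1e-12
    return mean, var

def gaussian_metric(y, mean, var):
    """Negative log of a Gaussian pdf, evaluated for all patterns at once."""
    return 0.5 * np.log(2.0 * np.pi * var) + (y - mean) ** 2 / (2.0 * var)

def toy_channel(n_bits=20_000, h=(0.2, 0.5, 1.0), sigma=0.05, seed=1):
    """Hypothetical real-valued ISI channel used only to exercise the functions."""
    rng = np.random.default_rng(seed)
    m = len(h) - 1
    bits = rng.integers(0, 2, n_bits)
    y = np.zeros(n_bits)
    for k, hk in enumerate(h):
        y[m:] += hk * bits[k:n_bits - m + k]
    return y + sigma * rng.normal(size=n_bits), bits, m

if __name__ == "__main__":
    y, bits, m = toy_channel()
    mean, var = train_gaussian(y, bits, m)
    print(np.round(mean, 2))
```

Only two numbers per pattern need to be learned, so far fewer training samples are required than for a full histogram. This is the property that makes a parametric estimate attractive when fast adaptation is needed.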

10 Gbit/s OOK experiment for 0-900 km adaptive transmission
In this section, we experimentally demonstrate 10 Gbit/s OOK adaptive transmission over a wide range of distances from 0 to 900 km. The combination of static frequency-domain equalization and adaptive FFD MLSE with parametric channel estimation (Eq. (12)) was used in the experiment to balance performance, complexity, and adaptation speed. Fig. 16 shows the experimental setup. A 1550 nm signal from a distributed feedback laser was intensity modulated using an MZM, giving a 6 dB ER signal at 10 Gbit/s with 2^15-1 PRBS data. The OOK signal was transmitted over a re-circulating loop comprising 60 km of SMF with a signal launch power of -2.5 dBm per span. A 1 nm OBPF was used in the loop to suppress the ASE noise. At the receiver, the signal was detected with an optically pre-amplified receiver, and a VOA was used to vary the input power to the EDFA. The preamplifier was followed by an OBPF with a 3 dB bandwidth of 0.3 nm, a second EDFA, and another OBPF with a 3 dB bandwidth of 0.8 nm. The optical signal was then passed through a Kylia AMZI with 40 ps DTD and π/2 differential phase shift. The two outputs of the AMZI were detected by two 10 Gbit/s receivers. Both detected signals were simultaneously sampled by a real-time oscilloscope at 25 GSamples/s with 8-bit resolution. In off-line processing, an automatic algorithm was used to temporally align the signals from the two receiver chains, locate the position of the training sequence, and re-sample the signals. Note that, owing to the use of MLSE, the sampling phase was not strictly required to be at the eye centre. The received sequence was serial-to-parallel (S/P) converted into blocks with a block size of 256 bits and an 8-bit overlap between adjacent blocks as a guard interval. Frequency-domain equalization was implemented by block processing using the (inverse) fast Fourier transform. The subsequent FFD MLSE had 16 states, operated at 2 samples/bit, and used Gaussian-based channel training (see Eq. (12)).
432,000 signal bits were processed.

Fig. 17(a) shows BER versus OSNR for 0, 480, 720, and 900 km. In this figure, the parameters of the frequency-domain equalization were set to compensate the CD approximately fully, and the training time for the MLSE was 1 µs. Fig. 17(b) depicts the recovered eye diagrams after frequency-domain equalization for 900 km. The figure shows that the system operated well after 480 km and 720 km, with 3 dB and 4 dB OSNR penalty at a BER of 10^-3, respectively. At 900 km, the slope was reduced due to non-ideally suppressed noise amplification. However, the best achievable BER was 1.5×10^-4, well below the forward error correction limit. The dotted line represents a BER of 1×10^-3, used as the forward error correction limit.

Fig. 17 is based on the assumption that exact prior information on the fiber length has been obtained. In practice, this value may not be known and can also vary frequently over a wide range. Fig. 18(a) shows the performance when the frequency-domain equalization was preset to a fixed value and MLSE was used to adaptively trim the residual impairments for various transmission distances. The training time of the MLSE was 1 µs and the received optical power into the pre-amplifier was -28 dBm. Note that the received OSNR differed between transmission distances, with the 900 km case exhibiting the worst OSNR of 23 dB. The figure shows that a BER better than 10^-3 could be achieved for any measured distance up to 900 km when the pre-set value was between 500 km and 575 km. For pre-set values beyond 575 km, the performance for short distances (<150 km) would be degraded due to the finite MLSE compensation window (~550 km at 16 states, as shown in Fig. 13(b)). This figure also implies that the system was insensitive to the exact pre-set dispersion value, so a coarse estimate was sufficient. To illustrate the adaptation speed of the system, Fig. 18(b) shows the BER versus the training time for three different distances when the frequency-domain equalization was pre-set to compensate 550 km of CD. The figure shows that the performance converged rapidly during the first 200 ns for all distances. After 400 ns, the BER fell below 10^-3 even for the longest distance, demonstrating the potential of FFD EDC in frequently reconfigured optical networks.
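The static frequency-domain stage described in this section can be sketched as block-wise multiplication by the inverse of the fiber CD response. The code below is an illustrative sketch rather than the off-line processing used in the experiment. It assumes the standard all-pass CD model with phase π·λ²·D·L·f²/c, a particular sign convention, D = 16 ps/nm/km, 2 samples/bit (so 256-bit blocks become 512 samples and the 8-bit overlap becomes 16 samples), and an overlap that is split equally between the two ends of each block. All function and parameter names are hypothetical.

```python
import numpy as np

C_M_PER_S = 299_792_458.0

def cd_response(n_fft, fs_hz, length_km, d_ps_nm_km=16.0, wavelength_nm=1550.0):
    """All-pass frequency response of fiber CD (sign convention is an assumption)."""
    f = np.fft.fftfreq(n_fft, d=1.0 / fs_hz)
    d_si = d_ps_nm_km * 1e-6                 # ps/(nm km) -> s/m^2
    lam = wavelength_nm * 1e-9
    phase = np.pi * lam ** 2 * d_si * (length_km * 1e3) * f ** 2 / C_M_PER_S
    return np.exp(-1j * phase)

def fde_blockwise(x, fs_hz, preset_km, block=512, overlap=16):
    """Compensate a preset fiber length block by block, discarding the overlap."""
    x = np.asarray(x, dtype=complex)
    h_inv = np.conj(cd_response(block, fs_hz, preset_km))
    half = overlap // 2
    hop = block - overlap
    padded = np.concatenate([np.zeros(half, complex), x, np.zeros(block, complex)])
    out = []
    for start in range(0, len(x), hop):
        seg = padded[start:start + block]
        y = np.fft.ifft(np.fft.fft(seg) * h_inv)
        out.append(y[half:half + hop])       # keep only the block interior
    return np.concatenate(out)[:len(x)]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    field = rng.normal(size=3000) + 1j * rng.normal(size=3000)
    equalized = fde_blockwise(field, fs_hz=20e9, preset_km=550.0)
    print(equalized.shape)
```

Because the filter depends only on the preset length, it is computed once and reused for every block. Any mismatch between the preset and the actual distance, together with edge effects from the short overlap, is left for the adaptive MLSE stage to absorb.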

Full-field detection for 40 Gbit/s offset DQPSK
In addition to the amplitude-modulated OOK format, FFD can also be used with phase-modulated formats, which have been widely employed at 40 Gbit/s and beyond. In a conventional differential quadrature phase-shift keying (DQPSK) system, at least two AMZIs and two pairs of balanced photodiodes are required for incoherent detection (Liu & Wei, 2007). Furthermore, the near-zero intensity during a π phase shift between symbols limits the system performance unless complicated pre-distortion is used (Kikuchi & Sasaki, 2010). On the other hand, the offset DQPSK format has been proposed for optical communications (Wree et al., 2004) to eliminate the near-zero intensity between symbols, and this format exhibits the same spectral efficiency as conventional DQPSK. However, a conventional offset DQPSK system has degraded receiver sensitivity and CD tolerance (Wree et al., 2004), which hinders its use in practical applications. In this section, we show that FFD-based EDC can significantly improve the performance of the offset DQPSK system. The presented system uses a simpler pre-coder at the transmitter, and only one AMZI and one pair of photodiodes at the receiver, reducing the implementation cost when compared to conventional DQPSK. Consequently, it is promising for cost-sensitive 40 Gbit/s Ethernet or short metro networks.

Fig. 19. Configuration of FFD-based offset DQPSK

Fig. 19 illustrates the configuration of FFD-based offset DQPSK. The transmitted data, a_k, is demultiplexed into in-phase and quadrature tributaries, which are individually differentially encoded using exclusive OR (XOR). Note that this pre-coder uses only two XOR gates and is much simpler than the conventional DQPSK pre-coder, which typically requires a combination of >20 XOR, AND, and NOT logic gates. The encoded quadrature signal is delayed by T/2 with respect to the in-phase signal (inset in Fig. 19), where T is the symbol period.
Consequently, the phase may change every T/2, but each phase change can only be 0 or ±π/2. In addition, the possible zero intensity between symbols induced by an instantaneous π phase shift in conventional DQPSK is eliminated. At the receiver, the optical front end and full-field reconstruction for offset DQPSK are the same as those for the OOK format. However, an additional electrical-domain differential detection process is employed before the dispersion compensation stage, as depicted in Fig. 20. The performance of differential detection can be improved by exploiting the field differences between a symbol and its previous (L-1) symbols, where L>1, resulting in a better field reference. This method is conventionally implemented in the optical domain by using (L-1) (or 2(L-1)) AMZIs (Zhao & Chen, 2007). However, this implementation is complicated. By using FFD, multiple differential fields can be obtained in the electrical domain simply using delays and multiplications while only one optical AMZI is employed. In offset DQPSK, the phase may change every T/2, so two samples per symbol (or one sample per bit) are used. The multiple differential fields for the n-th bit, $I_i(t_n)$, may be estimated by $\mathrm{conj}(V_{full}(t_{n-i})) \cdot V_{full}(t_n)$, where i (= 1, ..., L-1) denotes the i-th branch of the differential field detection and conj(·) represents the complex conjugate. These differential samples are then fed into the MLSE. The metric of the MLSE, $PM(a_n)$, used by the Viterbi algorithm to estimate the most likely transmitted data sequence, is given by:

$$PM(a_n) = PM(a_{n-1}) - \sum_{i=1}^{L-1} \log p\big(\Re(I_i(t_n)),\, \Im(I_i(t_n)) \mid a_{n-m}, \ldots, a_n\big) \qquad (13)$$

where $\Re(I_i(t_n))$ (or $\Im(I_i(t_n))$) represents the real (or imaginary) part of the differential field $I_i(t_n)$, $p(\Re(I_i(t_n)), \Im(I_i(t_n)) \mid a_{n-m}, \ldots, a_n)$ is the joint probability of the differential field given the transmitted data $a_{n-m}, \ldots, a_n$, and m is the memory length. Eq. (13) shows that the size of the required lookup table for channel estimation and the complexity of metric computation scale approximately linearly with L.
On the other hand, Viterbi decoding is independent of L and is the same as that in conventional MLSE.

Fig. 20. Multiple-reference based differential detection and MLSE. D, 2D, and 3D represent one-, two-, and three-sample delays respectively.
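The pre-coder and the electrical-domain differential detection described above can be illustrated with an idealized back-to-back sketch. It assumes rectangular pulses, no noise or dispersion, two samples per symbol, and a reconstructed full field that equals the transmitted field. The function names are hypothetical, and a simple threshold on the first differential field stands in for the MLSE.

```python
import numpy as np

def offset_dqpsk_field(data):
    """Ideal, noise-free offset-DQPSK field at two samples per symbol."""
    data = np.asarray(data, dtype=np.int64)          # even length expected
    i_enc = np.bitwise_xor.accumulate(data[0::2])    # XOR pre-coder, in-phase
    q_enc = np.bitwise_xor.accumulate(data[1::2])    # XOR pre-coder, quadrature
    i_wave = np.repeat(2 * i_enc - 1, 2).astype(float)
    q_wave = np.repeat(2 * q_enc - 1, 2).astype(float)
    q_wave = np.concatenate(([q_wave[0]], q_wave[:-1]))  # delay Q by T/2
    return i_wave + 1j * q_wave

def differential_fields(v_full, L):
    """Row i-1 holds I_i(t_n) = conj(V(t_{n-i})) * V(t_n) for n = L-1, ..., N-1."""
    n = np.arange(L - 1, len(v_full))
    return np.stack([np.conj(v_full[n - i]) * v_full[n] for i in range(1, L)])

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    data = rng.integers(0, 2, 400)
    v = offset_dqpsk_field(data)
    steps = np.angle(v[1:] * np.conj(v[:-1]))
    print("phase steps (units of pi/2):", np.unique(np.round(steps / (np.pi / 2))))
    i1 = differential_fields(v, 2)[0]
    decided = (i1.real < 1).astype(int)   # a phase step of +-pi/2 marks a '1'
    print("bits recovered:", np.array_equal(decided[1:], data[2:]))
```

In this ideal case the intensity is constant and every phase step is 0 or ±π/2, so a single differential field is enough to recover the data. With noise and CD present, the additional rows returned for L>2 provide the extra references that the MLSE metric combines.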
Simulations implemented in Matlab were performed to verify the operating principle of this scheme. The analysis model was the same as in Fig. 19. Two uncorrelated 20 Gbit/s data trains, each using a 2^11-1 pseudo-random binary sequence repeated nine times, were differentially encoded individually. Each encoded data train generated an analogue electrical signal using raised-cosine shaped pulses with a roll-off coefficient of 0.4 and 40 samples per symbol. The response of the driving amplifier was 5th-order Bessel shaped with a 3 dB bandwidth of 20 GHz. The electrical signals were used to modulate continuous-wave light from a laser with 100 kHz linewidth. A piece of fiber with CD of 16 ps/km/nm was used to investigate the CD tolerance. At the receiver, the launch power into the preamplifier was adjusted to control the OSNR. The preamplifier was followed by an OBPF with optimized bandwidth. The AMZI had a differential phase shift of π/2 and a DTD of 10 ps, unless otherwise stated. The signal power into the photodiodes was 3 dBm and the noise spectral density of the photodiodes was 20 pA/Hz^1/2. After detection, the signals were amplified, filtered by a 30 GHz 4th-order Bessel EF, and processed as described above. The MLSE had two samples per symbol, 5-bit resolution, and 16 states (considering two (or four) adjacent symbols (or bits)). The number of differential measurements used for metric computation, L, was varied from two to four. The simulation was iterated ten times with different random number seeds to give a total of 184,230 simulated symbols. The performance was evaluated using the required OSNR to achieve a BER of 1×10^-3 by direct error counting.

Fig. 21(b). OSNR penalty versus the AMZI DTD using 16-state MLSE and L=4. The OSNR penalty is defined with respect to the required OSNR at the optimized AMZI DTD.

Fig. 21(a) shows the performance of offset DQPSK with and without MLSE. The OBPF bandwidth was optimized for the back-to-back case, and the optimal value when using MLSE (16.5 GHz) was smaller than that without MLSE (23.5 GHz). In common with other MLSE investigations, this was due to the capability of MLSE to compensate filtering-induced ISI, such that a narrow OBPF bandwidth could be used to mitigate the impact of the noise and the CD. The figure clearly shows the benefit of MLSE with a larger number of differential measurements L. When using 16-state MLSE and L=4, a transmission distance of around 50 km could be supported for a required OSNR of 18 dB (100 km total dispersion tolerance range). Fig. 21(b) illustrates the low sensitivity of the system to the precise AMZI delay. Smaller DTDs gave more precise estimation of V_f(t) and V_p(t), and consequently resulted in reduced OSNR penalties.
At 40 Gbit/s, less than 1 dB penalty was induced for an AMZI with DTD between 2.5 ps and 15 ps for both back-to-back and 30 km. Note that the DTD could not be reduced indefinitely due to the increased limit induced by thermal noise as discussed in Section 4.1.
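The simulation bookkeeping quoted in this section can be cross-checked with simple arithmetic. The sketch below uses only numbers stated in the text (PRBS length, repetitions, number of runs, tributary bit rate, CD coefficient, and AMZI DTD). The assumptions that each symbol carries one bit from each tributary and that errors are counted over all bits are interpretations, and the function name is hypothetical.

```python
def simulation_bookkeeping():
    """Derived quantities for the 40 Gbit/s offset-DQPSK simulation."""
    prbs_length = 2 ** 11 - 1          # bits per PRBS period in each tributary
    symbols = prbs_length * 9 * 10     # nine repetitions, ten random seeds
    bits = 2 * symbols                 # one bit per tributary per symbol
    symbol_period_ps = 1e12 / 20e9     # each tributary runs at 20 Gbit/s
    return {
        "symbols": symbols,
        "bits": bits,
        "expected_errors_at_1e-3": bits * 1e-3,
        "symbol_period_ps": symbol_period_ps,
        "half_symbol_ps": symbol_period_ps / 2,
        "dtd_over_T": 10.0 / symbol_period_ps,   # default AMZI DTD of 10 ps
        "dispersion_50km_ps_per_nm": 50 * 16,    # 16 ps/km/nm fiber
    }

if __name__ == "__main__":
    for name, value in simulation_bookkeeping().items():
        print(f"{name}: {value}")
```

The symbol count reproduces the 184,230 symbols stated above. Under these assumptions, a BER of 1×10^-3 corresponds to a few hundred counted errors, which supports the use of direct error counting at that BER.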

Conclusions
FFD EDC, by surpassing the limited performance of current DD EDC products (300 km at 10 Gbit/s) and avoiding the high implementation cost of coherent-detection EDC (intended for long-haul systems), is of particular value for applications in DCF-free transparent access/metro networks and Ethernet. For 10 Gbit/s metro networks with a transmission reach of 300-500 km, FFD MLSE is an effective approach and can exhibit a 50% performance improvement when compared to DD MLSE, or exponentially reduce the required number of states for a fixed transmission reach. It is also more robust to non-optimized system parameters than full-field detection based frequency-domain equalization and FFE, and thus relaxes the system specifications. For transmission reaches longer than 500 km, the combination of cost-effective static frequency-domain equalization and adaptive FFD MLSE with parametric channel estimation can balance performance, complexity, and adaptation speed. Adaptive transmission over 0-900 km with less than 400 ns adaptation time is achievable at 10 Gbit/s. For higher bit rate systems, FFD-based offset DQPSK offers a cost-effective solution for 40 Gbit/s Ethernet or short metro networks. When compared to conventional DQPSK with the same spectral efficiency, it uses a simpler pre-coder at the transmitter and only one AMZI and one pair of photodiodes at the receiver, while supporting 50 km SMF transmission without optical compensation at 40 Gbit/s.