Open access peer-reviewed chapter

Machine Learning Techniques to Mitigate Nonlinear Phase Noise in Moderate Baud Rate Optical Communication Systems

By Eduardo Avendaño Fernández, Ana María Cárdenas Soto, Neil Guerrero Gonzalez, Giovanni Serafino, Paolo Ghelfi and Antonella Bogoni

Submitted: April 21st 2019Reviewed: July 28th 2019Published: December 13th 2019

DOI: 10.5772/intechopen.88871

Downloaded: 338


Nonlinear phase noise (NLPN) is the most common impairment that degrades the performance of radio-over-fiber networks. The effect of NLPN in the constellation diagram consists of a shape distortion of symbols that increases the symbol error rate due to symbol overlapping when using a conventional demodulation grid. Symbol shape characterization was obtained experimentally at a moderate baud rate (250 MBd) for constellations impaired by phase noise due to a mismatch between the optical carrier and the transmitted radio frequency signal. Machine learning algorithms have become a powerful tool to perform monitoring and to identify and mitigate distortions introduced in both the electrical and optical domains. Clustering-based demodulation assisted with Voronoi contours enables the definition of non-Gaussian boundaries to provide flexible demodulation of 16-QAM and 4+12 PSK modulation formats. Phase-offset and in-phase and quadrature imbalance may be detected on the received constellation and compensated by applying thresholding boundaries obtained from impairment characterization through statistical analysis. Experimental results show increased tolerance to the optical signal-to-noise ratio (OSNR) obtained from clustering methods based on k-means and fuzzy c-means Gustafson-Kessel algorithms. Improvements of 3.2 dB for 16-QAM, and 1.4 dB for 4+12 PSK in the OSNR scale as a function of the bit error rate are obtained without requiring additional compensation algorithms.


  • nonlinear phase noise
  • clustering
  • Voronoi
  • decision boundary

1. Introduction

Novel transceiver architectures for the 5th Generation (5G) communication networks front- and backhaul must make use of the digital signal processing (DSP)-assisted transmission capabilities to optimize the trade-off among signal processing complexity, spectral efficiency and reach in the short access radio networks [1]. One key element in hybrid optical-wireless next-generation technologies is the capability to identifying the noise and distortions, their impact on the signal, and how the modulation format is adapted to that channel quality conditions [2]. For this reason, an important disadvantage of analog Radio-over-Fiber (A-RoF) systems using high-order modulation formats is its inherent propensity to transmission impairments risen from the electrical and optical domain. The immediate effect on the signal is a limitation in the dynamic range (DR) [3], in terms of reach as a function of the bit rate. In the radiofrequency (RF) communications context, oscillators are key devices of transmission chain whose main purpose is to deliver stable reference signals to perform frequency translation from baseband to passband and to provide in-phase (I)/quadrature (Q) components synchronization. As inherent in electronic devices, practical RF oscillators exhibit nonideal behavior and characteristics leading to random phase noise (PN) on the received signal. The spectral regrowth at the output of a nonlinear amplifier and IQ imbalance, combined with the effect of PN, are the major RF impairments that degrade the performance of communication systems in the electrical domain [4]. PN arises mainly by inherent passive and active components, which introduce thermal noise inside the circuitry of oscillators; therefore, due to transistor transient state’s, aspects, as a rule of thumb, the higher the frequency of an oscillator, the worst is its performance in terms of PN. For coherent systems, the impact of PN over data symbols may be observed in the constellation diagram, where the main effect is a random rotation of symbols, causing their distribution over rings with elongated shapes. Because of this shape distortion, some immediate effects are (i) increased detection of erroneous symbols, (ii) spectral broadening, which can lead to interusers interference, (iii) in-band signal injection of out-of-band emission, and (iv) loss of orthogonality among multi-carrier waveforms. These effects over signal imply tackling synchronization issues leveraging on dedicated blocks of the DSP chain at the receiver [5]. The main task of the synchronization stage is to match the frequency and phase of transmitter and receiver oscillators, in both the electrical and optical domains. This process includes time recovery and tracking of the local oscillator (LO) frequency mismatch, to compensate for impairments caused by PN.

In this scenario, a critical block in the DSP chain of optical communication systems is the carrier phase recovery (CPR) block, which is required before the symbol decision takes place. It has the role of compensating the PN, directly related to the frequency noise of the lasers employed as optical carriers in the system. The noise of lasers consist of short-term random fluctuations of the optical phase and other different output parameters. Besides, the baud rate, the transmission length, and, in general, the DR of optical communication systems are limited by noise issues. The key parameter related to the noise in lasers is the linewidth, defined for a monochromatic source as the width of its optical spectrum, i.e., the power spectral density (PSD) of the emitted electrical field, oscillating at optical frequencies, as a function of frequency, wavenumber, or wavelength. An ideal electronic or optical oscillator would be, strictly speaking, monochromatic, exhibiting a zero linewidth. In other words, all the energy it emits is completely concentrated on its nominal frequency. Real oscillators are not exactly monochromatic and have a finite, non-zero linewidth. Besides, before the first laser was experimentally demonstrated, Schawlow and Townes showed that a laser linewidth could not have values lower than a fundamental quantum limit [6]. Nowadays, the most widely employed lasers in optical communications are semiconductor lasers; Henry demonstrated that this kind of lasers exhibits a linewidth enhancement factor [7], concerning to the Schawlow-Townes quantum limit. The line broadening is caused by the coupling of phase and amplitude noise, due to the variations of carrier density (i.e., of the refractive index) in the laser cavity. Further analysis of these phenomena is out of the scope of this chapter, and the interested reader is referred to [6] and [7]. There exists a strong relationship between the laser linewidth (LW) and the temporal coherence, characterized by the coherence time or coherence length, i.e., the measure of temporal coherence expressed as the time over which the field correlation decays. The linewidth can be used for estimating the coherence time as a function of the optical spectral shape, and the bandwidth can be approximated just by the inverse of the coherence time for typical spectral shapes, such as the Lorentzian or the Gaussian shape. In this chapter, the proposed methods mitigate the PN introduced by the mismatch between the optical carrier and the transmitted RF signals that are not correlated within the coherence time [8] of the local oscillator for the specific baud rate (250 MBd) used in the experimental setup.

Due to electrical-to-optical conversion, optical fiber propagation, and optical-to-electrical conversion, the signal is distorted and impaired due to the impact of system devices in the conversion stages, combined with distortive effects in the optical fiber segment, such as chromatic dispersion (CD), the Kerr effect, and, to a lesser extent, polarization mode dispersion (PMD). The experimental setup was conceived to realize a Radio-over-Fiber (RoF) system with the particular characteristics explained above, including severe impairments in both the optical and the electrical domains. A typical issue that must be compensated for is the frequency mismatch between the transmitter frequency carrier and the LO used for down-conversion at the receiver, either direct or coherent detection. This mismatch is due to PN and produces fluctuations in the constellation point locations concerning to their ideal positions, causing the outer symbols to shift away from their natural position, leading to a broadening of constellation data symbol. In [9], the received constellation exhibits data symbol (or clusters) with quasi-elliptical (elongated) shape in the case of modulation formats with 16 data symbol points. Some effects were observed on cluster symbols like the imbalance gain, direct current (DC) offsets, time skews, and PN, creating an irregular distribution in data symbols, resulting in overlaps among adjacent neighbor clusters when the conventional rectangular grid is used as demodulation rule. This same distortion appeared in our experimental scenario, corresponding to 16-Quadrature-Amplitude Modulation (QAM) and 4 + 12 Phase Shift Keying (PSK) constellations. To overcome PN impairments, well-known post-equalization techniques have been proposed, such as (i) detection methods based on the symbol error probability (SEP) in which minimum Euclidean distance (ED) decisions are used to define decision regions (for example, the rectangular grid for 16-QAM); (ii) demodulation methods based on maximum likelihood criterion applied for decision-making rules under conditions of uncertainty [10]; (iii) machine learning techniques applied to clusters, to classify and demodulate data symbol points, and mainly to improve the error system tolerance against nonlinear effects [11, 12].

On the other hand, a novel method for clustering was applied using Voronoi diagrams [13], inspiring the proposal of generic methods that can be extended to coherent receivers, avoiding the carrier frequency offset (CFO) and carrier phase offset (CPO) equalization algorithms at DSP stage [14]. Besides, in optical systems using coherent receivers, it is observed that the received constellation, even after applying DSP algorithms, remains influenced by enhanced noise originated from the PN added by the system components, and frequency noise from the LO [15]. In [16], a guideline for a joint design using forward error correction (FEC) and coded modulation (CM) techniques show the trade-off among electronic computational complexity, optical transparent reach, and hardware constraints. CM schemes with signal shaping and rate adaptation capabilities are combined to adapt the optical data transmission architecture under different signal qualities.


2. Nonlinear phase noise (NLPN)

As demonstrated by Gordon and Mollenhauer [17], the linear phase noise associated with ASE noise from inline amplifiers interacts with the transmitted signals via the Kerr effect, generating a strong distortion over symbols, referred to as NLPN. In an optical transmission system, amplification is an essential operation to compensate for signal propagation losses. Commonly, erbium-doped fiber amplifiers (EDFAs) are employed as optical boosters in the transmitters, as well as in-line optical amplifiers and pre-amplifiers in the receivers. Their benefit in preserving the signal level among the receiver sensitivity comes with the undesired amplified spontaneous emission (ASE) noise. The main effect of NLPN consists of creating a nonlinear distortion over the shape of external symbols, which degrades the symbol error performance. To compensate for this effect, the demodulation process requires non-Gaussian decision boundaries that, in practice are not simple to realize.

In the considered scenario for the validation of the proposed method, NLPN was observed in both the 16-QAM and 4 + 12 PSK constellation diagrams. In our case, the NLPN effects are very strong, due to the low adopted baud rate, which translates to a long symbol duration, that must be compared to the coherence time tcbetween the local oscillator and the incoming signal from the transmitter (ideally, a baud rate of 250 MBd needs a tcaround 4 ns). Usually, this effect on symbols is due to higher linewidths of the laser [18], but, in our case, the linewidth used was 1 kHz. In Table 1 , a state-of-the-art synthesis and review of different methods to compensate impairments using the conventional equalization techniques through DSP techniques and the design and optimization of constellations tolerant to noises and distortions is introduced. This table presents columns with a brief explanation of the proposed method showing the metrics used to evaluate the performance, the modulation scheme, and the authors and year of publication.

MethodDescriptionMod. formatAuthors and year
Digital signal processing and equalization algorithmsDigital backpropagation (DBP) with coherent detection are proposed to mitigate dispersion effects and nonlinear impairments through noniterative asymmetric split-step Fourier method (SSFM). An optimal decision map and a heuristic for the step size and sampling rate requirements for RZ-QPSK were obtained by numerical simulation.RZ-QPSKEzra and Kahn [19]
PN tolerance of circular multilevel quadrature amplitude modulation (C-mQAM) constellations employing different carrier phase recovery (CPR) algorithms is studied and evaluated using a differential decoding and bit-mapping scheme. Carrier phase recovery achieves similar performance than the blind phase search (BPS) algorithm at lower computational complexity level, obtaining 3.8e-3 and 1e-2 BER levels for C-16QAM and C-64QAM modulation formats, respectively.16-QAM
Rodríguez Navarro et al. [20]
Laser PN and NLPN limit and degrade performance at high launch power level, the phase-modulated coherent optical communication systems. Authors propose the use of a Kalman filter to estimate, track, and mitigate both impairments in decision-directed mode filtering.QPSKJain and Krishnamurthy [21]
Two-stage extended Kalman filtering (EKF) technique for the joint tracking of frequency offset (FO), laser PN, fiber nonlinearity, and amplitude noise is numerically evaluated for polarization multiplexed 16-QAM at 28 GBd. Results show tolerance over SSMF in a span length of 3000 km, obtaining a BER of 2.4e-2.16-QAMPakala and Schmauss [22]
An effective k-nearest neighbor (kNN) detector to further improve the performance of the optical phase conjugate (OPC) system by mitigating non-Gaussian impairments caused by NLPN is numerically evaluated.QPSK, 16-QAMJiang et al. [23]
Constellation design and optimizationThe authors propose an optimal signal constellation design (OSCD) algorithm suitable for the phase noise channels using the cumulative log-likelihood (LLR) function as the optimization criterion.DPSKLiu et al. [24]
Joint coding and modulation format optimization using bit-interleaved coded modulation (BICM) mutual information shows that circular constellation 8-QAM is suitable for high-coding rates while rectangular 8-QAM provides the best performance for low-coding rates, and both schemes outperform start 8-QAM.8-QAM
Circular and star 8-QAM
Ríos Müller et al. [25]
Prototype nonconventional constellations generated using digital pre-distortion and demodulated at the receiver using clustering were validated experimentally. ONSR improvements of 1.2 dB for nonconventional ring 6-2-8 constellation and up to 2.9 dB using the same constellation demodulated with fuzzy c-means Gustafson-Kessel was obtained comparing to 16-QAM.16-QAM, 4 + 12 PSKFernández et al. [26]
A new geometric shaping method is proposed, leveraging autoencoder unsupervised machine learning (ML) to optimize the constellation design. The learned constellation mitigates nonlinear effects with gains up to 0.13 bit/4D when trained with a simplified fiber channel model.64-QAM
Jones et al. [27]

Table 1.

Nonlinear fiber impairment mitigation techniques.

For multilevel advanced modulation formats, especially for m-ary phase-shift keying (m-PSK) and m-ary quadrature amplitude modulation (m-QAM), the symbols in the constellation diagram suffer from linear amplitude noise and PN. In an ideal oscillator, the phase variance over a given time interval remains small, and the output signal is nearly periodic. However, as already mentioned, in practical oscillators, the phase random variation or phase jitter is a random variable associated with the PN random process, being the instantaneous deviation of phase from the ideal value caused by PN [28]. Either in homodyne or heterodyne detection, the beating of the received signal with a LO with the same frequency of the optical carrier enables the detection and demodulation process; however, since the phase is not stable in time, frequency will not be the same between LO and carrier, causing a lack of synchronization that translates into the degradation of the received signal. The constellation symbol will not be in its ideal place (for example, in QAM the ideal location of a symbol point i.e., 1 + i), and it will be rotated by an angle given by the phase difference between the phasor of the transmitted signal (i.e., symbol) and the phasor of the LO. If this phase difference is large, the symbols fall outside the right decision region, tracing an elongated shape in the constellation, and the symbol error performance is degraded. In long-haul scenarios, the ASE noise from inline high-gain amplifiers becomes the main source of the noise [29]. The NLPN can be compensated electronically by subtracting from the received phase a correction that is proportional to the received intensity. The optimal scaling factor is approximately equal to half of the ratio of the mean NLPN and the mean received intensity [30]. Thus, the phase jitter, i.e. the standard deviation of residual phase noise is halved, and the transmission distance is doubled. In [31], the minimization of the total noise variance is performed by using a power profile along with the link that is linear with distance.

3. Machine learning (ML) techniques

The main challenge of ML algorithms consists of mitigating nonlinearities, where several performance parameters and error estimation techniques have been applied to linear and nonlinear impairments (chromatic dispersion, polarization mode dispersion, Kerr effect, etc.) that distort the shape of symbol points in the constellation diagram. Under adverse conditions during propagation, complex highly unpredictable analytical models should be better implemented with flexible techniques that take advantage of feature extraction and exploit the knowledge of historical data to create direct input–output relations between monitored parameters and desired outputs. The immediate effect of these ML methods is to decrease the error rate by enhancing the distance transmission or increasing the optical signal to noise ratio (OSNR) tolerance against these linear and nonlinear effects. An additional key issue is that ML in optical communication systems can work where is not possible to obtain a closed-form expression that model the optical channel [32]. Parameter estimation or system monitoring must be performed before signal demodulation, but is not a trivial task because of linear and nonlinear transmission impairments, especially when the signal crosses through different channels like the hybrid optical-wireless considered here (the Radio-over-Fiber system) has not a defined mathematical model including channel condition variations and transient impairment contributions during operation. Therefore, it is appropriate to use versatile methods that learn from data to predict i.e., the conditional probability or maximum likelihood function or to use the classical compensation method where to have a constrained channel model is possible to apply training sequences to estimate and equalize the channel response and improve the system performance. In this sense, to apply different machine learning approaches is attractive to characterize, to train and learn a mapping function between the signal features in the optical-wireless scenario, to and define non-Gaussian boundaries in constellation diagram before signal demodulation.

Machine learning has been successfully applied to a wide variety of problems in the context of telecommunication networks, mainly in wireless networks, ranging from opportunistic spectrum access, to channel estimation and signal detection in orthogonal-frequency division multiplexing (OFDM) to multiple-input-multiple-output (MIMO) communications. Many techniques within the previous categories can be addressed at the physical layer of optical transmission for optical performance monitoring (OPM) [33, 34, 35, 36, 37, 38, 39, 40], modulation format recognition (MFR) [41], or nonlinearity mitigation. Ideally, OPM should be implemented just with a single photodiode and machine learning algorithms that can learn the mapping between the detected signal and optical fiber channel parameters and finally predict optical fiber channel parameters from energy constellation diagrams or power eye diagrams. Moreover, machine learning algorithms enable the MFR [39] of an incoming signal to perform accurate constellation demodulation, also, to apply signal processing and detection at the receiver.

In Table 2 , novel architectures and techniques under the artificial intelligence umbrella are reviewed, mainly from perspective of optical communication systems to mitigate nonlinear effects and perform signal demodulation. Machine learning techniques based on clustering or support vector machine (SVM) may assist the definition of nonlinear boundaries for demodulation, avoiding additional stages for phase estimation and correction, extraction of features from constellations of real transmission scenarios, demonstrating the potential of artificial intelligence algorithms in next-generation optical-wireless technologies. In [42], a description of the mathematical background of machine learning techniques from a signal processing perspective applied to nonlinear transmission systems, OPM, and cross-layer network optimizations for software-defined networks (SDN) is reviewed as an alternative, unique, and powerful set of tools to conventional compensation techniques in the wireless-optical fiber context. In this chapter, clustering techniques like k-means and fuzzy c-means are used to classify data symbols of the classical 16-QAM constellation and 4 + 12 PSK modulation formats. Here, we report on an experimental RoF testbed to demonstrate the impact of techniques based on statistical analysis and advanced clustering methods for the mitigation of NPLN-induced impairments. The definition of such RoF testbed includes a clustering-based detector that allows identifying the distortive effects over symbols in the constellation diagram of both channels, the electrical and the optical fiber channel that, together, introduce particular distortions including NLPN. The processing that applies Voronoi contours before signal demodulation, defines non-Gaussian boundaries and increases noise tolerance improving the system error performance.

MethodDescriptionMod. formatAuthors year
ClusteringBurst optical signal detection based on a modified k-means clustering algorithm is used to establish a threshold, and the effects of the unbalanced ratio of bits zero and one to improve detection performance in terms of SNR without preamble field for amplitude recovery. In addition, a data-aided feedforward symbol-timing recovery method based on a polynomial interpolation and maximum-likelihood estimation theory are developed.N.A.Zhao et al. [43]
Experimental demonstration of successful coherent detection and digital demodulation for 3 × WDM of 2.5 Gbit/s QPSK/8-PSK phase-modulated RoF optical link after transmission over a 78.8 and 40 km of field-deployed fiber, respectively. The iterative clustering approach based on the k-means algorithm was applied to perform phase recovery on QPSK signal successfully.QPSK, 8-PSKGonzález et al. [37, 44]
A novel phase offset compensation by dual-stage ring fragmentation and clustering for 16-QAM and 4 + 12 APSK constellation is demonstrated experimentally in RoF scenario. OSNR improvements up to 3.3 and 1.4 dB, for 16-QAM and 4 + 12 PSK, respectively, were obtained.16-QAM, 4 + 12 PSKFernández et al. [9]
Unsupervised k-means and Gaussian mixture model (GNN) algorithms are proposed in different sections of coherent DSP to perform pre-decisions and enhance the performance of multi-modulus (MMA) algorithm and carrier phase recovery (CPE). Experiments demonstrate more than 3 and 1 dB improvements in OSNR sensitivities for dual-polarization 64-QAM at BER thresholds of 4.5e-3 and 1.6e-2, respectively.64-QAMXu et al. [45]
A k-means clustering algorithm is proposed and experimentally demonstrated in visible light communication (VLC) systems to mitigate nonlinear distortion effects. Improvements up to 5.5 dB in Q-factor are obtained for 400 Mb/s Nyquist 4-PAM signal at BER threshold of 2.4e-2.4-PAMMa et al. [46]
Support vector machine (SVM)A SVM-based detector is proposed for coherent optical fiber amplitude phase shift keying (APSK) constellation to mitigate NLPN. Comparing with 16-QAM, the SVM detector increases the nonlinear system tolerance by 4.88 dB at a BER of 1e-3.16-APSKHan et al. [11]
A machine learning-based classifier, namely SVM, is introduced to create the nonlinear decision boundary in M-ary PSK-based coherent optical system to mitigate NLPN. The maximum transmission distance and launch power dynamic range (LPRD) tolerance are improved by 480 km and 3.3 dBm for 8-PSK.8-PSKWang et al. [35]
A proposal of a bit-based SVM as a non-parameter nonlinear mitigation approach in the millimeter-wave Radio-over-Fiber (mm-RoF) system for different modulation formats is demonstrated. Experimental results outperform the k-means algorithm and show improvements of 1.2 dB for 16-QAM, 1.3 dB for 64-QAM, 1.8 dB for 16-APSK, and 1.3 dB for 32-APSK at BER of 1e-3 with the SVM detector, respectively.QPSK, 16-QAMCui et al. [47]
Convolutional neural network
An intelligent constellation diagram analyzer is proposed to implement both modulation format recognition (MFR) and OSNR estimation by using a CNN-based deep learning technique. The experimental results showed that the OSNR estimation errors for all the signals were less than 0.7 dB and the accuracy of MFR was 100%, proving the feasibility of the proposed scheme.m-PSK
Wang et al. [48]
Maximization of capacity over deployed links require operation regime estimation based on precise understanding of transmission conditions through linear and nonlinear SNR from the received signal. The extraction of NLPN and second-order statistical moments by a neural network is trained to estimate SNR from extensive realistic fiber transmissions. Measured performance of 0.04 and 0.2 dB of standard error for the linear and nonlinear SNRs, respectively, are obtained.16-QAM
Vaquero Caballero [49]
Supervised machine learningA machine learning detector based on kNNalgorithm overcomes non-Gaussian symmetric noise coming from laser PN and NLPN in the zero-dispersion link and dispersion-managed link (DML). The method increases the linewidth tolerance by 180 kHz, and the nonlinear tolerance for dispersion neglected, managed, and unmanaged links by 1.7, 1 and 1.4 dBm, respectively.16-QAMWang et al. [12]
A nondata (ND)-aided k-nearest neighbor (kNN) equalization technique can provide efficient nonlinear compensation at a low computational cost and zero data redundancy in coherent optical communication systems. 0.5 dB of BER improvement in 800 km standard single-mode Fiber (SSMF) using 16-QAM and 2 dB of BER in 80 km of SSMF for 64-QAM.16-QAM
Zhang et al. [50]

Table 2.

Machine learning approaches used in optical fiber communications.

4. Oscillator phase noise model

The general model of a noisy signal in the time domain is represented by


where ε(t) is the amplitude modulation (AM) noise, and ϕ(t) is the phase modulation (PM) noise, also known as phase jitter. The normalized carrier power spectrum without noise is given by


The power spectrum of Eq. (1) is shown in Figure 1 with and without noise. The PN causes the instantaneous frequency to “jitter” around the carrier frequency, fc. This leads to the noise sidebands on either side of fc as shown in the picture. An important relation between the coherence time and the PN must be considered here. The coherence time is the time over which a propagating wave may be considered correlated (coherent). In simple words, it is the interval within which the wave phase is, on average, predictable. Therefore, the coherence time can be expressed as

Figure 1.

Power spectrum density for (a) an ideal noise-free oscillator at frequency f0 and (b) a real oscillator with noise bands and non-zero linewidth.


where Δvis the spectral width (Hz) of the source or, in other words, the range within which the frequency “jitters.”

The PN is a random process that is a function of time, but, commonly, its monolateral power spectrum density is considered and represented versus the offset frequency from the carrier, as depicted as an example in Figure 2 . The phase jitter Δφcan be calculated from the power of the PN random process Sϕ(f)by the following formula:

Figure 2.

Example of a phase noise spectrum profile.


usually expressed in radians. The lower and upper integral bound fland fucan be arbitrarily set depending on the way the phase jitter is calculated, as it will be explained in the following. It is worth noticing that the value of the integral corresponds to the PN process variance, and besides, the process to be non-zero only in the range flto fu.

Another important random variable for the analysis of the stability of an oscillator is the time jitter Δτ, which can be calculated from the phase jitter


where fcis the carrier frequency. The timing jitter, usually expressed in seconds (or fractions of seconds), represents the mean variation in time of the wave period, which can change since the wave frequency fluctuates.

For this reason, it can also be expressed in parts per million (ppm) of the period. The timing jitter is divided by the carrier frequency. The integral lower bound of Eq. (4) is fl, i.e., the closest offset frequency from considered to calculate the jitter. Therefore, the inverse of the lower bound frequency 1/flcan be considered as a good approximation, the coherence time tc, because it represents the longest time over which that curve of PN ensures that particular value of jitter. To illustrate the relation between the coherence time tc and the PN, consider Figure 2 . As an example, for a baud rate of 250 MBd (equivalent to 1 Gb/s in 16-QAM), the needed coherence time corresponds is tc ≈ 4 ns, while for 10 GBd it is tc ≈ 100 ps because it is the duration of one symbol. In other words, the required signal stability should be kept for at least one symbol duration.

PN manifests as the jitter on a signal zero crossings. The variance of such (random) zero-crossing jitter is related to the variance of the phase noise itself [51]. One way to deal with the PN effect on the BER consists in measuring the parameter fsym/fc. For large values of this ratio, the noise effect on BER becomes smaller as fsym/fc increases [52]. For that range, one symbol time is short compared with the phase fluctuation period caused by the PN. Hence, the phase fluctuation in one symbol time delayed signal is virtually the same as the fluctuation in the signal before the onset of the delay. Conversely, in small fsym/fc range, the PN effect on BER becomes smaller as fsym/fc decreases. This is because the PN integration value over the receiver noise-equivalent bandwidth becomes smaller due to the filter at the receiver.

As already mentioned, a way to introduce PN in a system to measure its effect is by using larger laser linewidth on the light source, despite this is not this case because the experimental setup that will be reported here used a laser LW of 1 kHz. Therefore, from the laser’s LW point of view, the non-spectral purity (non-monochromatic) of the lasers employed for signal modulation/demodulation in coherent optical systems generally translates into PN impairment in the detected signal. The normalized output of a transmitting laser can be modeled as


wtxand ϕPNare the central emitting frequency and PN of the laser. The PN can be modeled as a Wiener process


With ΔϕPNbeing an independent and identically distributed (i.i.d.) random variable with a mean μ = 0 and variance σΔϕPN2=2πΔfTsΔvis the laser LW, and Tsis the symbol period. When transmission uses higher m-ary modulation formats, the effect of the laser LW must be as low as possible to avoid PN and other impairments arise from combined effects introduced from devices in the system.

Clock signals are required in almost every integrated circuit and some desired characteristics related to their quality must be considered. Digital signal processing is used to perform conversions between analog and digital signals and to process and transmit data at higher data rates and also higher resolutions. The clock signal quality can be described by timing jitter or PN measurements. Therefore, the typical clock PN measurement explores the power density spectrum (PSD) of a clock signal through the above defined time (or period) jitter as the often used jitter measurement parameter.

Through the Fourier series expansion, it can be shown that a square-wave clock signal has the same jitter behavior as its base harmonic sinusoid signal. For this case, a clock sinusoid signal with PN [51] can be written by


And the period jitter being given by


The signal in Eq. (9) is a phase-modulated sinewave by the PN, ϕ(t). Since the PN is usually much smaller than π/2, Eq. (8) can be approximated as


From an operational point of view, therefore, it is possible to set to set a method measuring the PN process observing that the signal sclk(t) mixed with a cos(2πfct) and filtered by a low-pass filter in the frequency domain allows to obtain the PN spectrum Sϕ(f) and its logarithmic version L(f), related by the following formula:


The PN spectrum L(f) defined as the attenuation in dBc/Hz from the peak value Sclk(f) at the clock frequency, fo, to a value of Sclk(f) at fm, as shown in Figure 1b . The mathematical expression that represents the PN spectrum is given by


Phase noise testing characterizes the spectral purity of a LO by calculating the ratio of the desired energy being delivered by the oscillator at the specified output frequency to the amount of undesired energy being delivered at adjacent frequencies. It is a measure of noise power per unit bandwidth appearing at a given offset from the carrier frequency [53]. From Eq. (12), the mean square (MS) of ϕ(t) is equivalent to the PN variance, since it is a zero-mean process, and it can be calculated by


that, finally, shows the relation between the timing jitter Δτand the PN spectrum, L(f), as


The real integral bounds are set to be fl(that is not 0, but with an offset from the carrier) and fu(usually corresponding to the frequency where the PN spectrum enters the noise floor of measurement instrument). An approximated linear piece-wise function using a log scale for the L(f) spectrum can be used to estimate the Δτas presented in [51].

5. Clustering algorithms

According to state of the art, clustering algorithms are presented in this section to understand how we apply the algorithms to identify the change of shape and distortions and to classify each data symbol affected by PN from statistical analysis. Also, like each symbol is surrounded by a decision region, we use the Voronoi diagram to estimate the contour of each symbol, and hence, the last section describes how to apply Voronoi algorithm to obtain flexible thresholding decision boundaries.

5.1 k-means clustering algorithm

k-means is a well-known clustering algorithm applied to perform phase recovery and demodulation processes [54] in a phase-modulated RoF system. It is one unsupervised learning algorithm, whose start point is a predefined and fixed grid on IQ plane. For the modulation formats 16-QAM and 4 + 12 PSK to be evaluated, the partition corresponds to 16 clusters, where each observation is assigned to one of the k clusters defined by centroids according to the nearest mean. The algorithm aims to minimize an objective function known as squared error J and works as follows: for a cluster centroid vi = {v1,…,vc} of the data-set {x1,…,xN}, consisting of N observations, the algorithm [55] first groups randomly selected centroids, which are used as the beginning points for every cluster, and then performs iterative calculations to optimize the positions of the centroids, according to


A data-set of N observations assigned according to this algorithm (7), create different clusters with irregular shapes. Figure 3 shows the general way in which the k-means algorithm works when no vector of centroids is predefined. The first step ( Figure 3a ) is to place centroids vat random locations or referenced locations, i.e., the ideal location of symbols according to the m-ary level of the modulation format used. Then, in the second step (also as in Figure 3a ), each point from xkis assigned to the nearest centroid of the cluster i, by applying some distance metric, e.g., Euclidean distance (ED) between sample xkand cluster centroid vi. After this, the third step ( Figure 3b ) consists of updating each cluster by averaging all the points assigned to cluster iin the previous step. Finally, at the fourth step (also in Figure 3b ), all the points are assigned to the closest cluster, and the centroids are updated until convergence.

Figure 3.

k-means algorithm principle operation. (a) Cluster prototype and data symbol assignation. (b) Centroid location updating until convergence.

5.2 Fuzzy c-means (FCM)

The FCM is an algorithm for clustering that allows to observations belonging to one, two, or more clusters with some membership degree [55]. FCM is based on the minimization of the following objective function


where kis the number of clusters, N is the total number of observations. The squared norm represents the sum of squares of the distances of each data point to its assigned centroid vector ck, for a set of binary indicator variables μik∈{0,1}, known as the degree of membership of the data points xnassociated to the cluster centroids ck. The centroid of the cluster prototype is calculated as


The goal is to find an assignment value of membership for each data point to the cluster where belongs (it means, the belonging degree per symbol for each cluster in the constellation diagram), as well as, the weighted sum of distances of each data point to its closest vector ||ck − vi||2 that minimizes J. In (8) and (9), m > 1 is a real number that governs the influence of individual fuzzy sets in fuzzy partition, and besides, when mgoes toward infinity, all clusters tend to converge to the centroids of the data-set X.

5.3 Fuzzy c-means Gustafson-Kessel (FCM-GK)

The Gustafson-Kessel algorithm [55, 56] associates each sample point with its centroid and its covariance. The main feature of this clustering algorithm is the local adaptation of the distance matrix to identify clusters of different geometrical shapes in one data-set. Each cluster has its own norm-inducing matrix Ai, which yields the following inner product norm:


where viis the mean for that points over cluster i(viis referred as the cluster prototype), and the matrices Aiare used as optimization variables in the c-means functional, besides, allowing each cluster to adapt the distance norm to the local topological structure of the data. The objective functional Jmof GK algorithm is defined as


However, the objective function cannot be directly minimized concerning Ai, because it is linear in Ai, and then to obtain a feasible solution, it is necessary to constrain the determinant of Ai. This procedure allows the matrix Aito vary with its determinant fixed corresponding to the optimization of the cluster’s shape while its volume remains constant:


where ρiis fixed for each cluster. Using the Lagrange multiplier method, the following expression for Aiis obtained:


Being Fithe fuzzy covariance matrix of the ithcluster defined by:


The substitution of Eqs. (13) and (14) in Eq. (10) gives a generalized squared Mahalanobis distance norm between xkand the cluster mean vi, where the covariance is weighted by the membership degrees in the partition matrix U = [μik]. Besides, whereas the original FCM algorithm makes the implicit assumption that the clusters are spherical, the Gustafson-Kessel algorithm is not subject to this constraint and can identify ellipsoidal clusters, due to a size restriction on the covariance matrix whose determinant must be the unit.

In Figure 4 , a graphical description of the FCM-GK algorithm is shown. The general setting for our clustering-based demodulation (for simplicity, we only use two clusters) is applied as follows: (i) the FCM-GK (in general, also k-means and FCM procedures are similar) algorithm requires as inputs, the data, and, the number of clusters vi(predefined value for 16-QAM modulation format). The data symbols correspond to bidimensional (2-D) vectors x(k) = [xi(k) xq(k)], where xiand xqare the in-phase and quadrature (IQ) component samples projected on complex plane. From this data-set, the weighted exponent m, the tolerance criteria ε ≤ 0.001, and the partition matrix U(randomly) are all initialized; (ii) then centroid prototypes are estimated as indicated in (8) and, as shown in Figure 4a ; (iii) after that, membership values are obtained for each data xkwith respect to the closest centroid (see Figure 4b ), calculating the distance norm (k-means and FCM uses Euclidean norm, and, FCM-GK uses Mahalanobis distance norm, but constrained to fixed determinant of A equal to 1); (iv) the matrix fuzzy partition U is updated (see Figure 4c ); and, finally, (v) the criterion for termination is calculated as ||U(l) − U(l − 1)|| < ε, if convergence is achieved, the algorithm stops, if not, returns to step iii. Besides, we obtain Voronoi contours [57] to define the nonlinear decision boundaries without considering the shape or orientation of the symbol within the constellation diagram. This effect is relevant especially for the symbols in the corners, which are the most severely impacted by PN, as it is shown in Figure 4d , and explained as follow.

Figure 4.

Fuzzy c-means GK step-by-step algorithm. (a) Centroid of cluster initialization at predefined locations and membership value estimation, (b) the association of each data point to the clusters, (c) distance norm computation and fuzzification stage, and as a final step, (d) Voronoi contour to define the non-Gaussian boundaries for each cluster distorted by PN.

5.4 Voronoi contours for flexible thresholding

The Voronoi diagram by contours has been used to assist the clustering-based demodulation stage when symbol distortion is severe [11]. An approximation to its definition may be introduced as the proximity regions to a set of objects in a bidimensional plane. As an example, Figure 5 shows a Voronoi diagram for a 16-QAM and 4 + 12 PSK constellation defining the boundaries of the data symbols. This technique adds flexibility to the demodulation process because non-Gaussian boundaries caused by PN are not trivial to estimate and require nonlinear equalizers highly time-consuming and computationally expensive. In the 4 + 12 PSK modulation scheme, due to the lower ED between symbols in the external ring, the overlapping on the symbol clusters is evident, and as a direct consequence, the symbol error rate is increased at the demodulation stage. Then, an arbitrary symbol point is assigned to the site (single point—centroid—of a region inside a Voronoi partition that meets certain properties [58]) with the minimal Euclidean distance. The set of all symbol points with the same assigned site builds the Voronoi region (VR). In our case, for both modulation schemes, we have 16 sites leading to more Voronoi edges (VEs), these VEs are defined by the points which hold the same ED for two sites. But, in general, there are more than two sites requiring at least one Voronoi vertices (VV), which represents a symbol point with the same minimum Euclidean distance toward the 16 sites. In this way, the Voronoi diagram resulting as can be seen in Figure 5 on the right, defining the contours properly for all the sites corresponding to the 16 symbol points of the constellation.

Figure 5.

Clustering using FCM Gustafson-Kessel with Voronoi contour for 16-QAM (up) and 4 + 12 PSK (down) modulation formats.


6. Experimental setup

Figure 6 shows the schematic diagram for the RoF system setup as single channel and single polarization. A pseudorandom binary sequence (PRBS) with length 218 at a baud rate of 250 MBd was generated to deliver 1 Gb/s in a single channel and single polarization case. The pulse shaping was implemented using a finite impulse response (FIR) root-raised-cosine (RRC) filter with eight taps and a commonly used roll-off factor α = 0.2. After that, the modulation process was performed encoding 4 bits per symbol for both 16-QAM and 4 + 12 PSK. This signal was digitized through a Fujitsu LEIA board with a sampling frequency of 64 GSa/s with an embedded up-conversion stage that translated the baseband signal to a RF carrier at 6 GHz. The RF signal was fed into the Mach-Zehnder modulator (MZM) RF inputs to modulate a continuous wave (CW) laser. The obtained optical signal was propagated over a spool of fiber with length 78.8 km. The employed optical source was a distributed feedback (DFB) laser (LW of 100 kHz) emitting at 1550 nm, which was amplified by an EDFA before the electro-optical MZM. To test the channel performance against noise, ASE noise was generated over a bandwidth of 3.5 nm centered around the CW wavelength and injected in the channel before the receiver. The received signal was pre-amplified and filtered by an optical bandpass filter (OBPF) with a bandwidth of 3.5 nm and photodetected by a Nortel PP-10G photoreceiver with an electrical bandwidth of 13 GHz. A variable optical attenuator (VOA) was adjusted to set the power at the input of the photodetector at −5 dBm. Then, the electrical signal was entered into a digital oscilloscope with a sampling rate of 25 GSa/s, and the captured data were stored for offline processing. Homodyne detection was performed at the receiver, and the frequency difference at the transmitter between the LO and the optical carrier was not compensated. This mismatch introduces PN distortion on data symbols, mainly on outer symbols like the one shown in Figure 5 . After the data acquisition, the down-conversion stage translated the RF signal to baseband; and then, the matched filtering with the same prototype characteristics of the transmitter side was applied. Besides, in the demodulation stage, the conventional rectangular grid demodulations, i.e., k-means, FCM, and FCM-GK clustering-based methods are evaluated varying the OSNR from 16 to 36 dB, as a function of bit error rate (BER). The FCM-GK demodulation process is aided with the Voronoi diagrams to estimate the non-Gaussian boundary per symbol using different step-size per contour line.

Figure 6.

RoF experimental setup.

7. Analysis of the results

Observing the 16-QAM constellation of Figure 5 , the shape of symbols mainly in the closest to corners is quasi-ellipsoidal: this non-Gaussian shape is due to the hybrid channel nonlinearities jointly combined with NLPN, conversely, the inner clusters of symbols follows a circular behavior. It is important to highlight that the main distortions introduced by the PN are proportional to the noise variance (power) of each symbol, being this effect consistent with the higher amplitude of the external symbols as shown in Figure 7 . In this figure, three rings can be clearly identified. Here, a ring is defined as the circumference along which symbols are distributed, and it is associated with an energy level on a circumference where symbols are distributed. Considering this non-Gaussian shape, and taking advantage of the clustering technique, it was possible to perform an individual symbol characterization through statistical analysis. In Figure 8 , we show a 16-QAM constellation, where fragmentation per ring and symbol has been applied. We perform an individual analysis of each cluster by estimating the mean, standard deviation, variance, histogram, and contour. For example, for symbol “1001” (Gray coding), we obtain a mean of −0.9530 and a variance of 0.1986; this inner symbol grows with a circular shape like observed in an additive white Gaussian Noise (AWGN) channel. But for the symbol at the corner “0000,” the shape is ellipsoidal, and the normal distribution has a higher variance, i.e. dispersion, and variance of the symbols. With these features, we can choose the appropriate clustering algorithm to mitigate the PN impact on specific symbols, i.e., by applying FCM-GK for detecting clusters with ellipsoidal shape, and k-means or conventional demodulation method based on a rectangular fixed grid in inner symbols, where the distortion is minor. As observed in the histograms of Figure 8 , the symbol toward close to the center of the constellation diagram (i.e., the symbol 1111) has a Gaussian shape with lower standard deviation (σ = 0.1986) than the symbol on the corner. In the same way, the symbol 0000 spreads the Gaussian bell on the top and has a higher standard deviation value of 0.3073. This analysis allows to conclude that lower variance (or, equivalently, lower standard deviation) values correspond to more compact clusters that have circular shape, comparable with the cluster that would be obtained when only AWGN is present.

Figure 7.

PN impact over symbols distributed over rings for a 16-QAM constellation.

Figure 8.

Independent statistical analysis on each symbol of constellation diagram.

The Voronoi contour shows a more defined Gaussian shape for the inner symbol and an elongated non-Gaussian shape for the external symbol. This statistical analysis is performed for the different OSNR levels measured in the RoF experimental setup, and the combined effect of data symbol clustering with Voronoi contour enhances the demodulation stage, either using a direct detection [26] or coherent receiver. Besides, by observing the constellation characterization, a residual phase offset was detected on external symbols of constellations, then using the clustering fragmentation a the dual-stage phase offset compensation (DS-PS-RF) proposed method in [9], an additional improvement of BER can be obtained when conventional demodulation is applied with rectangular decision boundaries.

In Figure 9 , we plot the BER performance vs. OSNR for 16-QAM at the error-free condition after the forward error correction (FEC) encoding that sets the error-free threshold at 2 × 10−2. The line with blue color and circle marker corresponds to the conventional demodulation method; the error performance for FCM is the curve in orange color with a diamond marker. The other curve, with a square red marker, corresponds to the k-means algorithm, and, finally, with the asterisk green marker, the FCM-GK method is plotted. k-means and FCM-GK perform a similar gain of 2.1 dB in the OSNR scale comparing with conventional demodulation of 16-QAM; however, beyond 26 dB, the gain margin for FCM-GK over k-means holds around 1 dB for the rest of the curve up to 36 dB. Besides, for FCM-GK assisted with Voronoi contour, the OSNR achieves a gain of 3.2 dB.

Figure 9.

BER performance for 16-QAM at 1 Gb/s after the 78.8 km span.

The constellation inset shows the data symbol distribution for FCM-GK obtaining an error vector magnitude (EVM) of 20.1% for 16-QAM. On the other hand, in Figure 10 , the BER performance results are presented for 4 + 12 PSK modulation format. An improvement in the OSNR scale of 1.4 dB for FCM-GK compared with the conventional demodulation method is observed in Figure 10 , besides in the inset, an EVM of 20.9% for the 4 + 12 PSK is obtained at 22 dB of OSNR. The curve for the FCM algorithm shows the worst performance compared with the other methods. These effects are possibly due to distortions on the shape of the clusters, which are ellipsoidal and have higher overlaps due to lower Euclidean distance among clusters. Similar performance curves are shown for k-means and conventional demodulation algorithms.

Figure 10.

BER performance for 4 + 12 PSK at 1 Gb/s after the 78.8 km span.

The decision boundaries obtained from Voronoi contours for each cluster define non-Gaussian regions; then, when the inner symbols grow circularly (following a Gaussian shape), the conventional fixed grid or k-means demodulation techniques must be used but, in contrast, when outer symbols take ellipsoidal shape, the FCM-GK provides higher tolerance to distortions and improves the BER and OSNR. Also, it is evident that to optimize the geometric shaping is a non-trivial task because the Euclidean distance must be maximized by performing a nonconventional distribution of symbols, besides limiting the symbol energy level to avoid power penalties concerning conventional demodulation formats. However, to have an idea of the non-Gaussian behavior due to NLPN, Figure 11a and b and represent the 3D behavior of the received data symbols showing how the distortions impact and change the shape of the symbols. Besides, a residual phase offset of centroids mainly over external clusters shows that ideal centroids are shifted to newer locations over the plane as demonstrated experimentally in [34]. Additionally, it is possible to improve error performance using a second stage to estimate amplitude changes and phase offset to compensate these residual shifts observed in the constellation. The a priori initialization of centroids with a fixed number of clusters avoids introducing a higher computational time of the evaluated algorithms. Besides, computational complexity in comparison with linear and nonlinear equalizers is an issue to be further explored in a future work. Additional specific experimental studies should be conducted to validate with more data the EVM/SNR estimation, and to define the impact of key aspects such as the shape of the clusters, or the performance for different distance norms, and the evaluation of optimized algorithms derived from conventional k-means and FCM.

Figure 11.

(a) Data symbols under AWGN, and (b) data symbols under non-Gaussian distortion.

Finally, the lower time-computing because of centroid initialization compensates the time spent in the calculation of inverse matrices required during the execution of Gustafson-Kessel algorithm. However, further studies on different optimization techniques must be conducted, e.g. for smaller OSNR values and using nonconventional modulation formats, implementing them in real-time scenarios, for a deeper analysis of the processing time issue.

8. Conclusions

We proposed and experimentally evaluated a constellation-based tool for demodulation, aided by Voronoi thresholding, clustering fragmentation, and signal demodulation, validated with 16-QAM and 4 + 12 PSK constellations, distorted especially with nonlinear phase noise. Fuzzy c-means Gustafson-Kessel was evaluated as an alternative to performing the demodulation-based clustering, taking advantage of the improved identification of clustering with non-Gaussian shapes enabled by the FCM-GK. The non-Gaussian boundaries for symbol demodulation were estimated using contour Voronoi diagrams, obtaining improvements of 3.2 dB for 16-QAM and 1.4 dB for 4 + 12 PSK using the fuzzy c-means Gustafson-Kessel algorithm. We compared the error performance for different OSNRs as a function of the BER at a threshold level of 2 × 10−2. Our fragmented constellation in combination with Voronoi thresholding method mitigates distortive effects over data symbols, performing localized transmission impairments characterization by applying the individual analysis of each cluster of symbols or subconstellations (quadrant in the constellation diagram), without requiring frequency and phase carrier recovery algorithms. Besides, this method can be extended to the coherent receiver avoiding the time-consuming post-compensation algorithms to equalize the received signals. However, if during the characterization stage the distortions of the received constellation have Gaussian behavior, employing clustering-based demodulation using k-means or the conventional demodulation algorithm with fixed grid provides similar error performance as demonstrated, but the trade-off is focused on the computational effort. But if non-Gaussian shapes are identified over the constellation, then a better approach to improve the system performance is to use the FCM-GK method aided with a Voronoi diagram. Finally, a new proposal for estimating the EVM reduces the number of symbols required to calculate the error metric; this reduces the computational time because only external symbols are required to approach the result.


This work has been supported by the Grant 617 of Colciencias and the Universidad de Antioquia. A special thanks to the Universidad Pedagógica y Tecnológica de Colombia (UPTC) for their financial support to encourage my doctoral studies by agreement 041. The reported activity has resulted from experimental setups conducted at the Photonics and Microwave Laboratory of Scuola Sant’Anna at Pisa, Italy, and are supported by the EU ITN project FiWin5G No. 642355. Project PREVENTION with the contribution of the Italian Ministry of Foreign Affairs, General Directorate for the Country Promotion.

© 2019 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution 3.0 License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

How to cite and reference

Link to this chapter Copy to clipboard

Cite this chapter Copy to clipboard

Eduardo Avendaño Fernández, Ana María Cárdenas Soto, Neil Guerrero Gonzalez, Giovanni Serafino, Paolo Ghelfi and Antonella Bogoni (December 13th 2019). Machine Learning Techniques to Mitigate Nonlinear Phase Noise in Moderate Baud Rate Optical Communication Systems, Intelligent System and Computing, Yang (Cindy) Yi, IntechOpen, DOI: 10.5772/intechopen.88871. Available from:

chapter statistics

338total chapter downloads

More statistics for editors and authors

Login to your personal dashboard for more detailed statistics on your publications.

Access personal reporting

Related Content

This Book

Next chapter

Deep Learning Based Prediction of Transfer Probability of Shared Bikes Data

By Wenwen Tu

Related Book

First chapter

Information Management and Video Analytics: the Future of Intelligent Video Surveillance

By Bennie Coetzer, Jaco van der Merwe and Bradley Josephs

We are IntechOpen, the world's leading publisher of Open Access books. Built by scientists, for scientists. Our readership spans scientists, professors, researchers, librarians, and students, as well as business professionals. We share our knowledge and peer-reveiwed research papers with libraries, scientific and engineering societies, and also work with corporate R&D departments and government entities.

More About Us