## 1. Introduction

Optical fiber nonlinearity is, in many cases, the main factor limiting the transmission distance of coherent optical communications [1, 2, 3]. The issue is especially critical for low-dispersion fibers, since the waveforms keep their original shape over a longer distance and the nonlinear effects are enhanced. In addition, in dense wavelength division multiplexing (DWDM) systems, multiple wavelengths travel at nearly the same speed, so there is less opportunity for the nonlinear effects to average out. Typical examples are legacy submarine cable systems [4].

At the same time, more service providers prefer flexible networks where adaptive transceivers can operate multiple data rates, modulation formats, and forward error correction (FEC) overheads for the efficient use of network capacity [5, 6, 7, 8]. For example, it has been shown that the mean loss in throughput per transceiver is nearly proportional to the granularity of the data rate [8]. In order to accommodate a wide range of channel conditions, many modulation formats with various spectral efficiencies have been studied extensively [1, 9, 10, 11, 12, 13, 14, 15, 16, 17].

To mitigate the penalty from nonlinear effects, efforts have been made to design modulation formats that are intrinsically tolerant to optical fiber nonlinearity [4, 18, 19, 20]. In other words, the modulation format is constructed such that it is less susceptible to fiber nonlinearity. In Section 2, we discuss the so-called X-constellation, which is an eight-dimensional (8D) code and has higher nonlinearity tolerance than the dual-polarization (DP) binary phase shift keying (BPSK) format at the same spectral efficiency of 2 bits/4D symbol. This format significantly reduces cross polarization modulation (XPolM) and increases the transmission distance. Note that we use bits/4D symbol as the unit of spectral efficiency throughout this chapter.

Another method for reducing the impact of fiber nonlinearity is 4D constant modulus modulation, in which the combined power of the x- and y-polarizations is constant at each time slot. This very effectively suppresses self-phase modulation (SPM) and cross-phase modulation (XPM). We discuss two earlier 4D constant modulus modulation formats [19, 20] in Section 3. The more recently proposed 4D-2A8PSK is another example of 4D constant modulus modulation. It has been widely recognized that DP-Star-8QAM for 6 bits/symbol does not perform very well, and many formats have been investigated [21, 22, 23, 24, 25] for this spectral efficiency. In particular, 4D-2A8PSK with a spectral efficiency of 6 bits/symbol has been shown to have better linear and nonlinear performance than many other formats because of its large Euclidean distance, 4D constant modulus characteristics (i.e., constant power in each time slot), and Gray labeling [26, 27, 28]. By relating this to the block coding approach described in the context of high-dimensional modulation [14], 5, 6, and 7 bits/symbol modulation formats were described as a family of block-coded 4D-2A8PSK in a unified form in [29, 30]. In Section 4, we first describe the 4D-2A8PSK modulation format family for 5–7 bits/symbol spectral efficiencies. Transmission simulations in a nonlinear dispersion-managed (DM) link, as well as in a dispersion-uncompensated link, confirm that this family of modulation formats exhibits excellent linear and nonlinear transmission performance. An analysis of separated nonlinear components demonstrates that the reduction of SPM and XPM is the main cause of the improvement obtained with the 4D constant modulus formats. We then review some items relevant to practical implementation in a digital signal processor (DSP), which need careful consideration since the constellation of the 2A8PSK family is different from that of the widely used QAM-based formats. Experimental verification, an extension to time-domain hybrid (TDH) modulation to seamlessly cover 4–8 bits/symbol spectral efficiency, and the combination of a Grassmann code with 4D-2A8PSK for a spectral efficiency of 3.5 bits/symbol will also be reviewed.

## 2. X-constellation

For coherent optical communications, block codes with 8–24 dimensions have been proposed to achieve coding gain compared to the conventional 2D modulation formats [14, 31, 32, 33, 34, 35]. For example, an eight-bit block code (four information bits and four parity bits) achieved almost 3 dB asymptotic gain. However, these codes are not specifically designed for nonlinearity tolerance. Shiner et al. [4] used an 8D code to achieve the coding gain and also arranged the code such that the degree of polarization (DOP) over a symbol (two time slots) becomes zero, as shown in **Table 1**. This significantly reduces the polarization disturbance that a channel imposes on other channels, as well as the polarization disturbance it receives from them. The authors call this modulation format “X-constellation,” since the constellations of Slot-A and Slot-B are cross-polarized.
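The zero-DOP property of cross-polarized slots can be checked numerically. The following is a minimal sketch, assuming one common sign convention for the Stokes parameters. The two Jones vectors are illustrative orthogonal states and are not taken from Table 1.

```python
import math

def stokes(x: complex, y: complex) -> tuple[float, float, float, float]:
    """Stokes parameters (S0, S1, S2, S3) of a Jones vector (x, y)."""
    s0 = abs(x) ** 2 + abs(y) ** 2
    s1 = abs(x) ** 2 - abs(y) ** 2
    s2 = 2 * (x * y.conjugate()).real
    s3 = -2 * (x * y.conjugate()).imag  # sign convention is an assumption
    return s0, s1, s2, s3

def degree_of_polarization(slots) -> float:
    """DOP of the Stokes vector averaged over the given time slots."""
    vectors = [stokes(x, y) for x, y in slots]
    mean = [sum(v[i] for v in vectors) / len(vectors) for i in range(4)]
    return math.sqrt(mean[1] ** 2 + mean[2] ** 2 + mean[3] ** 2) / mean[0]

a = 1 / math.sqrt(2)
slot_a = (complex(a, 0), complex(a, 0))   # illustrative polarization state
slot_b = (complex(a, 0), complex(-a, 0))  # orthogonal (cross-polarized) state

dop_cross = degree_of_polarization([slot_a, slot_b])  # two cross-polarized slots
dop_same = degree_of_polarization([slot_a, slot_a])   # same state in both slots
```

Orthogonal Jones vectors map to antipodal points on the Poincaré sphere, so their Stokes vectors cancel when averaged over the two slots.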

The authors conducted a transmission experiment over a 5000 km dispersion-managed, high-density wavelength division multiplexing (WDM) link and compared the Q-factor of DP-BPSK and X-constellation. **Figure 1** shows that X-constellation achieved a 2 dB improvement in the Q-factor, demonstrating its benefit.

## 3. 4D constant modulus formats

DP-QPSK is a very robust modulation format for nonlinear transmission, and one of the main reasons is that it is 2D constant modulus, in that the power of each polarization is constant. The 2D constant modulus property can also be achieved in DP-8PSK, DP-16PSK, etc.; however, their Euclidean distance is smaller than that of DP-Star-8QAM and DP-16QAM, and their overall transmission characteristics are not as good. Instead, 4D constant modulus modulation formats (i.e., the combined power of the X- and Y-polarizations is constant) were proposed. One of the 4D constant modulus formats is 8PolSK-QPSK [19], in which eight polarization states in the Stokes space representation carry four different absolute phases, as shown in **Figure 2**. This gives 32 code words, or a spectral efficiency of 5 bits/symbol. Compared to DP-Star-8QAM, 8PolSK-QPSK showed significantly reduced SPM and XPM.

Another example of 4D constant modulus format is POL-QAM 6–4, where six polarization states carry four different absolute phases [20]. This gives 24 code words.
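The code-word counts above translate directly into spectral efficiency. As a small worked example, assuming equiprobable code words so that the spectral efficiency is the base-2 logarithm of the code-word count (the function name is illustrative):

```python
import math

def bits_per_symbol(num_code_words: int) -> float:
    """Spectral efficiency in bits per 4D symbol for equiprobable code words."""
    return math.log2(num_code_words)

se_8polsk_qpsk = bits_per_symbol(8 * 4)  # 8 polarization states x 4 phases = 32
se_pol_qam_6_4 = bits_per_symbol(6 * 4)  # 6 polarization states x 4 phases = 24
```

Under this assumption, 8PolSK-QPSK carries 5 bits/symbol, while the 24 code words of POL-QAM 6–4 correspond to about 4.58 bits/symbol.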

In the next section, we review a family of 4D-2A8PSK modulation formats, which are also 4D constant modulus formats, covering multiple spectral efficiencies and having large coding gain.

## 4. 4D-2A8PSK

### 4.1. Generalized mutual information (GMI)

As a first step, we give an overview of a metric in order to compare the modulation formats under the most relevant condition. Pre-FEC bit error ratio (BER) has traditionally been used to predict post-FEC BER performance of hard decision (HD) FEC systems. However, pre-FEC BER is not directly applicable to modern long distance fiber-optic communications using soft decision (SD) FEC based on bit-interleaved coded modulation (BICM). As an alternative performance metric more suitable for SD-FEC systems, the BICM limit, called generalized mutual information (GMI), was introduced to the optical communications research community for comparing different modulation formats [36, 37]. This metric has been used to compare several modulation formats [23, 38]. The normalized GMI (i.e., GMI per bit) can be described from the log-likelihood ratio (LLR) outputs of the demodulator at the receiver as follows [39, 40, 41]:

where

The relationship between the Q-factor calculated from pre-FEC BER and the normalized GMI of four different modulation formats (DP-QPSK, DP-Star-8QAM, 6b4D-2A8PSK, and DP-16QAM) is shown in **Figure 3**. We will give a detailed explanation of the 6b4D-2A8PSK modulation format in Section 4.4. Here, the Q-factor is defined by

which is a classical measure to calculate the required signal-to-noise ratio (SNR) to achieve the BER for binary-input additive white Gaussian noise (AWGN) channels. Here, **Figure 3** shows that the same pre-FEC BER (Q-factor) does not necessarily give the same BICM limit among various formats, especially at lower code rate regions. When the normalized GMI is 0.85, the Q-factor lies between 4.77 (BER = 4.16 × 10^{−2}) and 4.86 dB (BER = 4.01 × 10^{−2}), corresponding to the typical
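The two metrics of this section can be sketched in a few lines. The sketch assumes the commonly used sample estimator of the normalized GMI from LLRs, with the convention that a positive LLR favors bit 0, and the standard BER-to-Q relation for binary-input AWGN channels. Both function names are illustrative, and the formulas may differ in notation from the chapter's own equations.

```python
import math
from statistics import NormalDist

def normalized_gmi(bits, llrs):
    """Estimate GMI per bit as 1 - mean(log2(1 + exp(-s * L))).

    s is +1 for bit 0 and -1 for bit 1 (assumed LLR sign convention).
    """
    total = 0.0
    for b, llr in zip(bits, llrs):
        t = (1 - 2 * b) * llr
        # Numerically stable evaluation of log(1 + exp(-t)).
        total += max(-t, 0.0) + math.log1p(math.exp(-abs(t)))
    return 1.0 - total / (len(bits) * math.log(2))

def q_factor_db(ber):
    """Q-factor in dB, with Q = sqrt(2) * erfcinv(2 * BER) = -Phi^{-1}(BER)."""
    return 20 * math.log10(-NormalDist().inv_cdf(ber))

q_low = q_factor_db(4.16e-2)   # about 4.77 dB
q_high = q_factor_db(4.01e-2)  # about 4.86 dB
```

With this definition, the two BER values quoted in the text map to the Q-factors of 4.77 and 4.86 dB stated there.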

### 4.2. Generic 2A8PSK

The generic constellation of 4D-2A8PSK [26, 27, 28, 29] is shown in **Figure 4**. It is similar to 8PSK, but with two different amplitudes represented by the radii. If the two polarizations are chosen independently, 2^{8} = 256 combinations (i.e., 8 bits per 4D symbol) are possible. With a condition that X- and Y-polarizations have complementary radius, that is, if

The mapping rule of 4D-2A8PSK is also included in **Figure 4** [44]. Let
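The complementary-radius condition described in this section can be illustrated by enumerating the constellation. The sketch below rests on assumptions: the ring ratio is taken as inner radius over outer radius, the 8PSK phases start at zero, and the total power is normalized to one. Bit labeling is not modeled.

```python
import cmath
import math

def two_a_8psk_points(ring_ratio):
    """All 4D points (x, y) in which X and Y use complementary ring radii."""
    # Normalize so that r1^2 + r2^2 = 1 with r1 / r2 = ring_ratio (assumed definition).
    r2 = 1 / math.sqrt(1 + ring_ratio ** 2)
    r1 = ring_ratio * r2
    phases = [cmath.exp(1j * math.pi * k / 4) for k in range(8)]
    points = []
    for rx, ry in ((r1, r2), (r2, r1)):  # complementary radii only
        for px in phases:
            for py in phases:
                points.append((rx * px, ry * py))
    return points

points = two_a_8psk_points(0.6)
powers = [abs(x) ** 2 + abs(y) ** 2 for x, y in points]
```

The enumeration yields 2 × 8 × 8 = 128 points (7 bits per 4D symbol), and every point has the same combined power, which is the 4D constant modulus property.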

### 4.3. 5b4D-2A8PSK

For the spectral efficiency of 5 bits/symbol, 32SP-2A8PSK (5b4D-2A8PSK) can be expressed by a linear code with five information bits. In total, 2^{10} = 1024 linear codes are possible, and we chose the combination that gives the lowest required SNR for the target GMI of 0.85, through Monte-Carlo simulations in AWGN.

In order to maintain a 4D constant modulus property, for each code word, the X-polarization ring size is always complementary to the Y-polarization ring size. Negating another parity bit

where

### 4.4. 6b4D-2A8PSK

In 64SP-2A8PSK (6b4D-2A8PSK), which has 6 bits/symbol spectral efficiency,

### 4.5. 7b4D-2A8PSK

For the spectral efficiency of 7 bits/symbol, 128SP-2A8PSK (7b4D-2A8PSK) can be constructed simply as follows. In this code,

### 4.6. Other modulation formats for comparison

In order to evaluate the performance of 5b4D-2A8PSK, we consider three other modulation formats having 5 bits/symbol spectral efficiency, that is, 8PolSK-QPSK [19], 32SP-16QAM [11], and time-domain hybrid (TDH) modulation. 8PolSK-QPSK [19] was briefly explained in Section 3. 32SP-16QAM is a 4D set-partitioned modulation format derived from DP-16QAM. To generate 32 code words, the parity rule shown in [11] is used. TDH modulation using a 1:1 mixture of DP-QPSK and 6b4D-2A8PSK to achieve an average of 5 bits/symbol spectral efficiency is also included.

For comparison with 6b4D-2A8PSK, three other modulation formats of 6 bits/symbol spectral efficiency were evaluated; specifically, DP-8PSK, DP-Star-8QAM, and DP-Circular-8QAM [23]. DP-8PSK and DP-Star-8QAM are conventional modulation formats. DP-Circular-8QAM has one center point and seven circular constellation points, and has larger Euclidean distance than DP-8PSK [23].

To compare with 7b4D-2A8PSK, two modulation formats of 7 bits/symbol spectral efficiency are evaluated. 128SP-16QAM is a 4D modulation format based on DP-16QAM, where 128 code words are generated using the parity rule described in [11]. We also included TDH modulation using 1:1 mixture of 6b4D-2A8PSK and DP-16QAM. Furthermore, we included DP-16QAM; however, to compare for the same data rate, we used the Baud rate of (7/8) × 34 GBd.
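The symbol-rate scaling used for the equal-data-rate comparison is simple arithmetic. A short sketch follows (the function name is illustrative, and raw bit rates before FEC overhead are assumed):

```python
def equal_rate_baud(ref_baud_gbd, ref_bits, new_bits):
    """Symbol rate that carries the same raw bit rate with new_bits per symbol."""
    return ref_baud_gbd * ref_bits / new_bits

raw_rate_gbps = 34.0 * 7                     # 7 bits/symbol at 34 GBd
dp16qam_baud = equal_rate_baud(34.0, 7, 8)   # DP-16QAM carries 8 bits/symbol
```

Here DP-16QAM needs only 29.75 GBd to carry the same 238 Gb/s raw rate as a 7 bits/symbol format at 34 GBd.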

### 4.7. Nonlinear transmission simulations

#### 4.7.1. Simulation procedure

Nonlinear transmission simulations were conducted over a 2000 km DM link at 34 GBd per channel to evaluate how each modulation format performs under high fiber nonlinearity. At the transmitter, pulses were filtered by a root-raised-cosine (RRC) filter with a roll-off factor of 10%. Eleven DWDM channels of the same modulation format were combined with 37.5 GHz spacing without any optical filtering. The link consists of 25 spans of 80 km nonzero dispersion shifted fiber (NZDSF), in which loss is compensated by Erbium-doped fiber amplifiers (EDFAs). The performance of each modulation format can be quantified by a span loss budget, which is defined as [45]

where

The parameters for NZDSF were

An ideal homodyne coherent receiver was used, with an RRC filter with a roll-off factor of 10%, followed by sampling at twice the symbol rate. For adaptive equalization, we used a time-domain data-aided least-mean-square equalizer utilizing the transmitted data directly as the training sequences for simplicity. A discussion on a more realistic equalizer will be given in Section 4.9. We did not use carrier phase estimation (CPE) in Sections 4.7 and 4.8.

All the optical noise due to the EDFAs is loaded just before the receiver. The calculated required OSNR at the target GMI is used to obtain the span loss budget as in (9). We used an EDFA noise figure of 5 dB to calculate the span loss budget.
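As a rough illustration of how a required OSNR maps to a span loss budget, the sketch below assumes the widely used link-budget approximation OSNR ≈ 58 + P − L − NF − 10 log10(N), which holds for a 0.1 nm reference bandwidth near 1550 nm, solved for the span loss L. This approximation and the function name are assumptions and may differ in detail from the definition in [45]. The input values are chosen only for illustration.

```python
import math

def span_loss_budget_db(launch_dbm, required_osnr_db, n_spans, nf_db=5.0):
    """Tolerable span loss (dB) under the assumed 58 dB OSNR approximation."""
    # 58 dB: approximate constant for a 0.1 nm reference bandwidth near 1550 nm.
    return 58.0 + launch_dbm - required_osnr_db - nf_db - 10 * math.log10(n_spans)

# Illustrative inputs: -4 dBm launch power, 15.2 dB required OSNR, 25 spans.
budget = span_loss_budget_db(launch_dbm=-4.0, required_osnr_db=15.2, n_spans=25)
```

Under this approximation, each 1 dB reduction in required OSNR, or 1 dB increase in usable launch power, adds 1 dB to the span loss budget.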

#### 4.7.2. 5 bits/symbol modulation formats

Four 5 bits/symbol formats are compared in **Figure 5**. In this case, we use the ring ratio of

As the launch power increases, the span loss budget for 32SP-16QAM decreases rapidly because of the large power variations between time slots. On the other hand, 8PolSK-QPSK [19] has a 0.65 dB worse required OSNR in the linear regime, while its saturation characteristics are very similar to those of 5b4D-2A8PSK owing to its constant power. TDH modulation with a 1:1 mixture of DP-QPSK and 6b4D-2A8PSK has the 4D constant modulus property within each time slot. However, we used an optimized power allocation for TDH modulation (i.e., 6b4D-2A8PSK has 2.7 dB higher power than DP-QPSK), so there is a power variation between time slots, which generates some nonlinear penalty.

Overall, the maximum span loss budget of 5b4D-2A8PSK is higher than that of the TDH modulation by 0.5 dB, that of 8PolSK-QPSK by 0.9 dB, and that of 32SP-16QAM by 1.8 dB.

#### 4.7.3. 6 bits/symbol modulation formats

Four 6 bits/symbol modulation formats are compared as in **Figure 6**. The optimal ring ratio is

#### 4.7.4. 7 bits/symbol modulation formats

**Figure 7** shows a performance comparison among three 7 bits/symbol formats at 34 GBd and DP-16QAM at the same data rate (

#### 4.7.5. Summary for the dispersion managed link results

The peak span loss budget for the DM link is summarized in **Figure 8**. The circles connected by the dashed lines include DP-QPSK, 5b4D-, 6b4D-, 7b4D-2A8PSK, and DP-16QAM, all at 34 GBd. Squares are taken from TDH modulation formats, and triangles are from other (conventional) modulation formats in **Figures 5**–**7**. This shows that the 4D-2A8PSK family fills the gap between DP-QPSK and DP-16QAM almost linearly (in the dB scale), and each one offers a good improvement from the conventional modulation formats at the same spectral efficiency.

#### 4.7.6. 5 bits/symbol under dispersion uncompensated link

To evaluate the transmission characteristics of the modulation formats under reduced nonlinearity, representing terrestrial cases, we also simulated a link with 50 spans of 80 km standard single-mode fiber (SSMF) without inline dispersion compensation or dispersion pre-compensation. SSMF parameters are **Figure 9**, as an example. The overall differences among the modulation formats are smaller than in the DM-NZDSF link. 5b4D-2A8PSK shows the highest performance, with a peak span loss budget exceeding those of the TDH of DP-QPSK and 6b4D-2A8PSK, 8PolSK-QPSK, and 32SP-16QAM by 0.2, 1.0, and 0.8 dB, respectively. TDH and 32SP-16QAM did not suffer as much in the dispersion-uncompensated link as they did in the DM case. The reason is the weaker nonlinear distortion in uncompensated SSMF links compared to DM-NZDSF links.

### 4.8. Separated nonlinearity

To better understand why 4D constant modulus modulation performs better, additional simulations were conducted with separated nonlinear components, using the method proposed in [46]. With this method, the nonlinear transmission performance under the individual effects of SPM, XPM, and XPolM can be evaluated.

**Figure 10** shows the simulated Q-factor as a function of OSNR for 6b4D-2A8PSK and DP-Star-8QAM in the DM link, where the simulation parameters are kept the same as in Section 4.7.1. Here, we use the recently proposed Q-factor definition based on GMI and not on pre-FEC BER as follows [44]:

where


The Q-factor defined above based on GMI in Eq. (10) is a generalization of the conventional Q-factor based on BER in Eq. (2). With this new Q-factor, the effective SNR needed to achieve the same post-FEC BER performance with SD-FEC systems can be evaluated. Even though both definitions give identical Q values in binary-input AWGN channels, the generalized Q-factor can predict the SNR gain more accurately than the conventional BER-based Q-factor for BICM systems using SD-FEC coding and/or high-order, high-dimensional modulation.

The curves marked with “AWGN” in **Figure 10** indicate the case in which the nonlinear effects are fully ignored, and the curves depicted with “SPM,” “XPM,” and “XPolM” show that these nonlinear components are individually added. The curve with “SPM + XPM + XPolM” shows the situation when all of these nonlinear effects are taken into account. The launch power is set to be −4 dBm, giving the peak span loss budget for DP-Star-8QAM. At this launch power, OSNR of 15.2 dB gives a normalized GMI of 0.85.

**Figure 11** re-plots the simulated Q for each separated nonlinear effect at an OSNR of 15.2 dB. Q under the linear condition (AWGN) for 6b4D-2A8PSK is higher than that for DP-Star-8QAM by 0.4 dB. The contributions from SPM and XPM in 6b4D-2A8PSK are much smaller than those in DP-Star-8QAM. This confirms that the 4D-2A8PSK family is robust against XPM and SPM nonlinearity. However, the contribution of XPolM is similar in 6b4D-2A8PSK and DP-Star-8QAM. This is because the power of each individual polarization in 6b4D-2A8PSK fluctuates from symbol to symbol, even though the combined power of both polarizations is constant. This is consistent with a report in which another 4D constant modulus modulation format, 8PolSK, shows a significant reduction in SPM and XPM, but not necessarily in XPolM [19].

### 4.9. DSP algorithm

#### 4.9.1. Adaptive equalizer

To understand the fundamental benefit of the 4D-2A8PSK family, we used an idealized data-aided least-mean-square equalizer up to this point. In this section, we address the performance impact when more realistic equalizers [47, 48] are used, taking practical implementations into account.

We first consider a conventional radius-directed equalizer (RDE) [48] for 6b4D-2A8PSK, where the decision on the ring radii is performed at each polarization separately. In this case, we observe 0.12 and 0.10 dB degradation in the span loss budget, in comparison to the idealized least-mean-square equalizer at a launch power of −10 and −4 dBm, respectively.

We then take advantage of the 4D constant modulus property, by using the relative power of two polarizations for soft decision of the ring radii. For soft decision information, we use a heuristic sigmoid function
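The difference between the per-polarization and the joint ring decision can be sketched as follows. This is a hard-decision simplification for illustration only: the soft sigmoid weighting used in this section is not reproduced, the radii and sample values are arbitrary, and the function names are hypothetical.

```python
def ring_decision_per_pol(sample, r_inner, r_outer):
    """Conventional decision: pick the ring closest to |sample|."""
    return r_inner if abs(sample) < (r_inner + r_outer) / 2 else r_outer

def ring_decision_joint(x, y, r_inner, r_outer):
    """Joint hard decision: the stronger polarization is assigned the outer ring."""
    if abs(x) >= abs(y):
        return r_outer, r_inner
    return r_inner, r_outer

def rde_error(sample, radius):
    """Radius-directed error term R^2 - |sample|^2 for the chosen ring."""
    return radius ** 2 - abs(sample) ** 2

r_in, r_out = 0.5, 0.8          # illustrative radii
x, y = 0.62 + 0j, 0.52 + 0j     # noisy samples; x was sent on the outer ring

separate = (ring_decision_per_pol(x, r_in, r_out),
            ring_decision_per_pol(y, r_in, r_out))
joint = ring_decision_joint(x, y, r_in, r_out)
```

In this example, the separate decisions place both polarizations on the inner ring, which no valid 4D-2A8PSK code word allows, whereas the joint decision always returns a complementary pair of radii.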

#### 4.9.2. LLR computation

For SD-FEC, it is necessary to calculate the log-likelihood ratio (LLR) with moderate circuit complexity. A fast-decoding algorithm and LLR computation for high-order set-partitioned 4D-QAM formats [49] has been extended to 6b4D-2A8PSK using two lookup tables [44]. The schematic of the soft-demapping circuit is shown in **Figure 12**. The scheme also exploits the asymmetry between the radial and the axial LLR, and offline processing of the experimental data showed only a small power penalty [44]. In addition, it uses an LLR calculation method that is robust against residual phase noise [50].

### 4.10. Experiment

We have also conducted a transmission experiment comparing 6b4D-2A8PSK and DP-Star-8QAM [44]. The signals were either 6b4D-2A8PSK or DP-Star-8QAM modulated at 32 GBd and filtered with a root-raised-cosine filter with a roll-off factor of 0.15. Seventy channels were spaced at 50 GHz. The transmission line was 1260 km long, with an average span length of 70 km. Chromatic dispersion was managed inline by a mixture of nonzero dispersion shifted fiber (NZDSF) having negative local CD of −3 ps/nm and standard single-mode fiber (SSMF). On the receiver side, the signal stored by 64 GS/s analog-to-digital converters (ADCs) was processed offline. The processing included CD compensation, adaptive equalization with the constant modulus algorithm for initial convergence and radius-directed equalization afterward, carrier recovery (CR) with a multipilot algorithm [47] having a window size of 63, pilot-aided phase-slip recovery, and the proposed soft-demapping as described in Section 4.9.

**Figures 13** and **14** show the experimental results. **Figure 14(a)** shows Q from GMI as a function of launched power. In the case of ideal soft-demapping (only 16-level quantization for SD-FEC decoding was applied), we observed a 0.6 dB improvement at maximum Q with 4D-2A8PSK compared to DP-Star-8QAM. The proposed technique had a performance degradation of 0.15 and 0.06 dB for 4D-2A8PSK and DP-Star-8QAM, respectively, compared to the ideal LLR. The overall performance gain of 0.5 dB was still significant in the highly nonlinear transmission. **Figure 14(b)** shows the required OSNR, which was calculated by loading noise at the receiver DSP to emulate an OSNR decrease. The target normalized GMI was set to 0.92, which was close to the 20.5% SD-FEC limit [53]. The proposed soft-demapping worked even at such low OSNR conditions, and 4D-2A8PSK outperformed DP-Star-8QAM as the launched power increased.

### 4.11. Time domain hybrid modulation

TDH modulation has been studied considerably to cover a wide range of channel conditions, owing to its flexibility in choosing a nearly arbitrary spectral efficiency [12, 51, 52]. As the constituent modulation formats, we use DP-QPSK (4 bits/symbol) and DP-16QAM (8 bits/symbol) in conjunction with 5b4D, 6b4D, and 7b4D-2A8PSK to widen the range of TDH [54]. For comparison, we also use TDH modulation based on conventional modulation formats, that is, DP-QPSK, 32SP-QAM, DP-Star-8QAM (S8QAM), 128SP-QAM, and DP-16QAM. The benefit of the 4D-2A8PSK family is the 4D constant modulus property. In other words, there is no compromise in choosing the power ratio (the ratio between the powers of the two modulation formats). On the other hand, conventional formats experience power fluctuations, which force a compromise in the power ratio [29, 54].

We simulated transmission performance over the same link condition as described in Section 4.7. For the 5b4D, 6b4D, and 7b4D-2A8PSK formats, we chose ring ratios of 0.60, 0.65, and 0.59, respectively, for the best nonlinear performance. For all the TDH modulation, we use a 1:1 ratio with alternating formats; however, in actual systems, any arbitrary ratio can be used. The important parameter for TDH is the power ratio, that is, how much power is allocated to each time slot. We optimize the power ratio for the best nonlinear performance.
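The two TDH parameters discussed here, the slot ratio and the power ratio, can be made concrete with a short sketch. The function names are illustrative, and the 2.7 dB value is the power offset quoted in Section 4.7.2 for the mixture of DP-QPSK and 6b4D-2A8PSK. Per-slot powers are normalized to unit average power, which is an assumption.

```python
def tdh_average_se(se_a, se_b, slots_a=1, slots_b=1):
    """Average spectral efficiency of a TDH frame mixing two formats."""
    return (slots_a * se_a + slots_b * se_b) / (slots_a + slots_b)

def tdh_slot_powers(power_ratio_db, slots_a=1, slots_b=1):
    """Per-slot powers (p_a, p_b) with unit average power.

    Format B is power_ratio_db above format A.
    """
    ratio = 10 ** (power_ratio_db / 10)
    p_a = (slots_a + slots_b) / (slots_a + slots_b * ratio)
    return p_a, p_a * ratio

se_5b = tdh_average_se(4, 6)        # DP-QPSK + 6b4D-2A8PSK, 1:1
se_4p5b = tdh_average_se(4, 5)      # DP-QPSK + 5b4D-2A8PSK, 1:1
p_qpsk, p_6b = tdh_slot_powers(2.7)
```

A 1:1 frame gives the half-integer spectral efficiencies used in this section, and other slot ratios interpolate between the two constituent formats.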

**Figure 15** shows the calculated span loss budget for 4–6 bits/symbol modulation formats, including the 2A8PSK-based and the conventional TDH modulation. DP-QPSK and 6b4D-2A8PSK data are also included as a reference. **Figure 16** shows the span loss budget for 6.5–8 bits/symbol modulation formats. From these figures, we can see that the TDH modulation based on 2A8PSK has much better nonlinear performance than that based on the conventional modulation formats, due to their constant modulus property.

The peak span loss budget for various spectral efficiencies is shown in **Figure 17**. Here, 4.5, 5.5, 6.5, and 7.5 bits/symbol TDH based on 2A8PSK used DP-QPSK, 5b4D-2A8PSK, 6b4D-2A8PSK, 7b4D-2A8PSK, and DP-16QAM. TDH based on the conventional formats used DP-QPSK, 32SP-QAM, S8QAM, 128SP-QAM, and DP-16QAM. We observed 1.3, 1.6, 1.6, and 0.6 dB increase in peak span loss budget, when TDH used 2A8PSK, at 4.5, 5.5, 6.5, and 7.5 bits/symbol, respectively. This shows the versatility of the 4D-2A8PSK family.

### 4.12. 3.5 bits/symbol modulation format

Grassmann code [4, 55] is known to be robust against state of polarization (SOP) rotation including cross polarization modulation (XPolM) as described in Section 2. We investigated a Grassmann code-based 7-bit 8D code [56], whose schematic is shown in **Figure 18**. 2-ary amplitude QPSK (2AQPSK) and 2A8PSK are used for the first and the second time slots, respectively.

We compared 7b8D-2A8PSK (3.5 bits/symbol), PS-QPSK (**Figure 19**. The 7b8D-Grassmann format exhibits almost the same span loss budget as PS-QPSK, which has a higher Baud rate and a broader spectrum. On the other hand, the 7b8D-Grassmann format shows a much larger span loss budget than DP-QPSK, although the latter has a narrower spectrum. Therefore, depending on the application, the 7b8D-Grassmann format may be an alternative to PS-QPSK and DP-QPSK.

## 5. Conclusion

We reviewed nonlinearity-tolerant modulation formats, including the recently proposed 5, 6, and 7 bits/symbol 4D modulation format family based on 2A8PSK. A series of transmission simulations shows that this 2A8PSK family has better nonlinear performance than the conventional modulation formats at each corresponding spectral efficiency, especially for dispersion-managed links, which are known to have high fiber nonlinearity. The simulations also show that the primary benefits of the 4D constant modulus property come from reduced SPM and XPM. Since the modulation formats in the 4D-2A8PSK family differ only in the parity bits, they can be realized with very similar hardware across the spectral efficiencies between DP-QPSK and DP-16QAM. Furthermore, this modulation format family can serve as the components of time-domain hybrid modulation, where almost arbitrary spectral efficiency can be realized between 4 and 8 bits/symbol when combined with DP-QPSK and DP-16QAM.