Nonlinearity-Tolerant Modulation Formats for Coherent Optical Communications

Fiber nonlinearity is the main factor limiting the transmission distance of coherent optical communications. We overview several modulation formats that are intrinsically tolerant to fiber nonlinearity. In particular, we explain in detail our recently proposed family of 4D modulation formats based on 2-ary amplitude 8-ary phase-shift keying (2A8PSK), which covers spectral efficiencies of 5, 6, and 7 bits/4D symbol. These coded modulation formats fill the spectral-efficiency gap between DP-QPSK and DP-16QAM and show excellent performance in both the linear and nonlinear regimes. Since these modulation formats share the same constellation and differ only in their parity bit expressions, digital signal processing (DSP) can accommodate all of them with minimum additional complexity. Nonlinear transmission simulations indicate that these modulation formats outperform the conventional formats at each spectral efficiency. We also review DSP algorithms and experimental results, as well as the application to time-domain hybrid modulation for 4-8 bits/4D symbol. Furthermore, we give an overview of an eight-dimensional 2A8PSK-based modulation format based on a Grassmann code. All these results indicate that the 4D-2A8PSK family shows great promise for excellent linear and nonlinear performance at spectral efficiencies between 3.5 and 8 bits/4D symbol.


Introduction
Optical fiber nonlinearity is, in many cases, the main factor limiting the transmission distance of coherent optical communications [1][2][3]. The issue is especially critical for low-dispersion fiber, since the waveforms retain their original shape for a longer time and the nonlinear effect is enhanced. In addition, in dense wavelength division multiplexing (DWDM) systems, multiple wavelengths travel at nearly the same speed, so there is less opportunity to average out the nonlinear effects. Typical examples are legacy submarine cable systems [4].
At the same time, more service providers prefer flexible networks where adaptive transceivers can operate at multiple data rates, modulation formats, and forward error correction (FEC) overheads for the efficient use of network capacity [5][6][7][8]. For example, it has been shown that the mean loss in throughput per transceiver is nearly proportional to the granularity of the data rate [8]. In order to accommodate a wide range of channel conditions, many modulation formats with various spectral efficiencies have been studied extensively [1][9][10][11][12][13][14][15][16][17].
In order to mitigate or minimize the penalty from the nonlinear effects, efforts have been made to design modulation formats that are intrinsically tolerant to optical fiber nonlinearity [4][18][19][20]. In other words, the modulation format is constructed such that it is less susceptible to fiber nonlinearity. In Section 2, we discuss the so-called X-constellation, which is an eight-dimensional (8D) code and has higher nonlinearity tolerance than the dual-polarization (DP) binary phase-shift keying (BPSK) format of the same spectral efficiency of 2 bits/4D symbol. This format significantly reduces cross polarization modulation (XPolM) and increases the transmission distance. Note that we use bits/4D symbol as the unit of spectral efficiency throughout this chapter.
Another method for reducing the impact of fiber nonlinearity is 4D constant modulus modulation, in which the combined power of the x- and y-polarizations is constant at each time slot. This very effectively suppresses self-phase modulation (SPM) and cross-phase modulation (XPM). We discuss two earlier 4D constant modulus modulation formats [19,20] in Section 3. The more recently proposed 4D-2A8PSK is another example of 4D constant modulus modulation. It has been widely recognized that DP-Star-8QAM for 6 bits/symbol does not perform very well, and many formats have been investigated [21][22][23][24][25] for this spectral efficiency. In particular, 4D-2A8PSK with a spectral efficiency of 6 bits/symbol has been shown to have better linear and nonlinear performance than many other formats because of its large Euclidean distance, 4D constant modulus characteristics (i.e., constant power in each time slot), and Gray labeling [26][27][28]. By relating this to the block coding approach described in the context of high-dimensional modulation [14], 5, 6, and 7 bits/symbol modulation formats were described as a family of block-coded 4D-2A8PSK in a unified form in [29,30]. In Section 4, we first describe the 4D-2A8PSK modulation format family for 5-7 bits/symbol spectral efficiencies. Transmission simulations in a nonlinear dispersion-managed (DM) link, as well as in a dispersion-uncompensated link, confirm that the proposed family of modulation formats exhibits excellent linear and nonlinear transmission performance. An analysis of the separated nonlinear components demonstrates that the reduction of SPM and XPM is the main cause of the improvement offered by the 4D constant modulus formats. We then review some items relevant to practical implementation in a digital signal processor (DSP), which need careful consideration since the constellation of the 2A8PSK family is different from that of the widely used QAM-based formats.
Experimental verification, the extension to time-domain hybrid (TDH) modulation to seamlessly cover 4-8 bits/symbol spectral efficiency, and the combination of a Grassmann code and 4D-2A8PSK for the spectral efficiency of 3.5 bits/symbol will also be reviewed.

X-constellation
For coherent optical communications, block codes with 8-24 dimensions have been proposed to achieve coding gain compared to the conventional 2D modulation formats [14][31][32][33][34][35]. For example, an eight-bit block code (four information bits and four parity bits) achieved almost 3 dB asymptotic gain. However, these codes are not specifically designed for nonlinearity-tolerant modulation. Shiner et al. [4] used an 8D code to achieve the coding gain and also arranged the code such that the degree of polarization (DOP) over a symbol (two time slots) becomes zero, as shown in Table 1. This significantly reduces the polarization changes imposed on other channels, as well as the polarization effects received from other channels. The authors call this modulation format "X-constellation," since the constellations of Slot-A and Slot-B are cross-polarized.
The authors conducted a transmission experiment over a 5000 km dispersion-managed, high-density wavelength division multiplexing (WDM) link and compared the Q-factors of DP-BPSK and X-constellation. Figure 1 shows that X-constellation improved the Q-factor by 2 dB, demonstrating its benefit.
Table 1. The optical field Jones vectors for the two consecutive time slots (Slot-A and Slot-B) that define the eight-dimensional X-constellation symbols, and their corresponding binary symbol labels [4].
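The zero-DOP property of cross-polarized slot pairs can be checked numerically. The following is a minimal sketch, not the construction of [4]: the Jones vector used for Slot-A is an arbitrary illustrative choice (the entries of Table 1 are not reproduced here), the helper names are hypothetical, and the sign convention for the third Stokes parameter is an assumption that does not affect the result.

```python
import math

def stokes(ex, ey):
    """Stokes parameters (S1, S2, S3) of a Jones vector (ex, ey)."""
    c = ex * ey.conjugate()
    return (abs(ex) ** 2 - abs(ey) ** 2, 2 * c.real, -2 * c.imag)

def cross_polarized(ex, ey):
    """Return a Jones vector orthogonal to (ex, ey) with the same power."""
    return (-ey.conjugate(), ex.conjugate())

def degree_of_polarization(slots):
    """DOP of the field averaged over a list of (ex, ey) time slots."""
    s0 = sum(abs(ex) ** 2 + abs(ey) ** 2 for ex, ey in slots)
    s = [sum(v) for v in zip(*(stokes(ex, ey) for ex, ey in slots))]
    return math.sqrt(sum(v * v for v in s)) / s0

# Illustrative Slot-A field (assumption) and its cross-polarized Slot-B.
slot_a = (1 + 0j, 1j)
slot_b = cross_polarized(*slot_a)

dop_cross = degree_of_polarization([slot_a, slot_b])  # cross-polarized pair
dop_same = degree_of_polarization([slot_a, slot_a])   # same state repeated
```

Because orthogonal polarization states map to antipodal points on the Poincaré sphere, their Stokes vectors cancel over the two slots, whereas repeating the same state keeps the field fully polarized.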

4D constant modulus formats
DP-QPSK is a very robust modulation format for nonlinear transmission, and one of the main reasons is that it is 2D constant modulus, in that the power of each polarization is constant. The 2D constant modulus property can also be achieved in DP-8PSK, DP-16PSK, etc.; however, their Euclidean distance is smaller than that of DP-Star-8QAM and DP-16QAM, and their overall transmission characteristics are not as good. Instead, 4D constant modulus modulation formats (i.e., formats in which the combined power of the X- and Y-polarizations is constant) were proposed. One of the 4D constant modulus formats is 8PolSK-QPSK [19], in which eight polarization states in the Stokes space representation carry four different absolute phases, as shown in Figure 2. This gives 32 code words, or a spectral efficiency of 5 bits/symbol. Compared to DP-Star-8QAM, 8PolSK-QPSK showed significantly reduced SPM and XPM.
Another example of a 4D constant modulus format is POL-QAM 6-4, where six polarization states carry four different absolute phases [20]. This gives 24 code words, that is, $\log_2 24 \approx 4.58$ bits/symbol.
In the next section, we review a family of 4D-2A8PSK modulation formats, which are also 4D constant modulus formats, covering multiple spectral efficiencies and having large coding gain.

Generalized mutual information (GMI)
As a first step, we give an overview of a metric for comparing the modulation formats under the most relevant conditions. Pre-FEC bit error ratio (BER) has traditionally been used to predict the post-FEC BER performance of hard-decision (HD) FEC systems. However, pre-FEC BER is not directly applicable to modern long-distance fiber-optic communications using soft-decision (SD) FEC based on bit-interleaved coded modulation (BICM). As an alternative performance metric more suitable for SD-FEC systems, the BICM limit, called generalized mutual information (GMI), was introduced to the optical communications research community for comparing different modulation formats [36,37]. This metric has been used to compare several modulation formats [23,38]. The normalized GMI (i.e., GMI per bit) can be described from the log-likelihood ratio (LLR) outputs of the demodulator at the receiver as follows [39][40][41]:
$$\mathrm{GMI} = 1 - \mathbb{E}\left[\log_2\left(1 + \exp\left(-(-1)^{b} L\right)\right)\right], \tag{1}$$
where $b$, $L$, and $\mathbb{E}[\cdot]$ denote the transmitted bit $b \in \{0, 1\}$, the corresponding LLR, and an expectation (i.e., ensemble average over all LLR outputs $L$ and transmitted bits $b$), respectively. We define "normalized" GMI as the mutual information per modulation (information) bit, not per modulation symbol. The normalized GMI therefore sets the upper limit of the possible code rate of SD-FEC coding for BICM systems, and multiplying the normalized GMI by the number of bits per symbol gives the achievable throughput per symbol.
The relationship between the Q-factor calculated from pre-FEC BER and the normalized GMI of four different modulation formats (DP-QPSK, DP-Star-8QAM, 6b4D-2A8PSK, and DP-16QAM) is shown in Figure 3. We will give a detailed explanation of the 6b4D-2A8PSK modulation format in Section 4.4. Here, the Q-factor is defined by
$$Q = 20 \log_{10}\left(\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER})\right), \tag{2}$$
which is a classical measure of the signal-to-noise ratio (SNR) required to achieve the BER for binary-input additive white Gaussian noise (AWGN) channels. Here, $\mathrm{erfc}^{-1}(\cdot)$ is the inverse complementary error function. Figure 3 shows that the same pre-FEC BER (Q-factor) does not necessarily give the same BICM limit among various formats, especially at lower code rate regions. When the normalized GMI is 0.85, the Q-factor lies between 4.77 dB (BER = $4.16 \times 10^{-2}$) and 4.86 dB (BER = $4.01 \times 10^{-2}$), corresponding to the typical Q-factor (BER) threshold of state-of-the-art SD-FEC having a code rate of 0.8 [42,43]. Accordingly, we will use 0.85 as the target of the normalized GMI throughout this chapter.
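As an illustration of the two metrics discussed above, the following minimal sketch estimates the normalized GMI from a set of LLRs and converts a pre-FEC BER into a Q-factor in dB. It assumes the LLR sign convention $L = \ln[\Pr(b=0)/\Pr(b=1)]$, so that the per-bit penalty is $\log_2(1 + \exp(-(-1)^b L))$; the function names are hypothetical. The identity $\sqrt{2}\,\mathrm{erfc}^{-1}(2\,\mathrm{BER}) = -\Phi^{-1}(\mathrm{BER})$, with $\Phi$ the standard normal distribution function, lets the Q-factor be computed with the Python standard library only.

```python
import math
from statistics import NormalDist

def normalized_gmi(bits, llrs):
    """Monte-Carlo estimate of 1 - E[log2(1 + exp(-(-1)^b * L))].

    Assumes L = ln(P(b=0) / P(b=1)); bits and llrs are equal-length lists.
    """
    total = 0.0
    for b, llr in zip(bits, llrs):
        x = -llr if b == 0 else llr  # x = -(-1)^b * L
        # Numerically stable evaluation of ln(1 + exp(x)).
        total += max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return 1.0 - total / (len(bits) * math.log(2.0))

def q_factor_db_from_ber(ber):
    """Q = 20 log10(sqrt(2) * erfcinv(2 * BER)), via the Gaussian quantile."""
    q_linear = -NormalDist().inv_cdf(ber)  # equals sqrt(2) * erfcinv(2 * BER)
    return 20.0 * math.log10(q_linear)

# BER values quoted in the text for a normalized GMI of 0.85.
q_low = q_factor_db_from_ber(4.16e-2)   # about 4.77 dB
q_high = q_factor_db_from_ber(4.01e-2)  # about 4.86 dB
```

Uninformative LLRs ($L = 0$) give a normalized GMI of 0, and confident, correct LLRs give a value approaching 1, which matches the interpretation of the normalized GMI as an upper limit on the SD-FEC code rate.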

Generic 2A8PSK
The generic constellation of 4D-2A8PSK [26][27][28][29] is shown in Figure 4. It is similar to 8PSK, but with two different amplitudes represented by the radii $r_1$ and $r_2$ (suppose $r_1 \le r_2$ without loss of generality). By combining the two polarizations (i.e., 4D space), $2^8 = 256$ combinations (i.e., 8 bits per 4D symbol) are possible. We impose the condition that the X- and Y-polarizations have complementary radii, that is, if $r_1$ is used for the X-polarization, then $r_2$ must be chosen for the Y-polarization, and vice versa. The resulting set-partitioned (SP) 4D codes have the 4D constant modulus property, which leads to excellent nonlinear transmission performance. We define $r_1/r_2$ ($\le 1$) as the ring ratio. When the ring ratio is equal to 1, the modulation format reduces to regular DP-8PSK. The mapping rule of 4D-2A8PSK is also included in Figure 4 [44]. Let $B[0], \ldots, B[7]$ denote the eight modulation bits, where $B[0]$-$B[2]$ and $B[3]$-$B[5]$ are the Gray-mapped 8PSK bits for the X- and Y-polarizations, respectively, and $B[6]$ and $B[7]$ determine the amplitude in each polarization. By selecting the optimum 32, 64, and 128 point constellations out of the 256 combinations, we can construct 32SP-, 64SP-, and 128SP-2A8PSK for the spectral efficiencies of 5, 6, and 7 bits/symbol, respectively. We also call these 5b4D-, 6b4D-, and 7b4D-2A8PSK for convenience.
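The complementary-radius condition can be illustrated with a small mapper. This is a minimal sketch under stated assumptions: the Gray labelling of the 8PSK phases, the choice that amplitude bit 0 selects the inner ring, the power normalization $r_1^2 + r_2^2 = 2$, and the default ring ratio are illustrative choices and are not taken from Figure 4; all names are hypothetical.

```python
import cmath
import math
from itertools import product

# Hypothetical Gray labelling: list index = phase position, value = 3-bit label.
GRAY_8PSK = [0b000, 0b001, 0b011, 0b010, 0b110, 0b111, 0b101, 0b100]

def ring_radii(ring_ratio):
    """Return (r1, r2) with r1 / r2 = ring_ratio and r1^2 + r2^2 = 2."""
    r2 = math.sqrt(2.0 / (1.0 + ring_ratio ** 2))
    return ring_ratio * r2, r2

def map_2a8psk(bits, ring_ratio=0.65):
    """Map the modulation bits B[0..7] to the complex pair (x, y)."""
    radii = ring_radii(ring_ratio)

    def point(phase_bits, amp_bit):
        label = phase_bits[0] << 2 | phase_bits[1] << 1 | phase_bits[2]
        k = GRAY_8PSK.index(label)
        return radii[amp_bit] * cmath.exp(1j * math.pi / 4 * k)

    return point(bits[0:3], bits[6]), point(bits[3:6], bits[7])

def total_power(bits, ring_ratio=0.65):
    x, y = map_2a8psk(bits, ring_ratio)
    return abs(x) ** 2 + abs(y) ** 2

all_words = list(product((0, 1), repeat=8))
# Complementary radii: the two amplitude bits B[6] and B[7] must differ.
constant_modulus_words = [w for w in all_words if w[6] != w[7]]
```

Of the 256 combinations, the 128 words with $B[6] \ne B[7]$ all have the same total power, and the 32, 64, and 128 point formats are subsets of this set.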

5b4D-2A8PSK
For the spectral efficiency of 5 bits/symbol, 32SP-2A8PSK (5b4D-2A8PSK) can be expressed by a linear code with five information bits $B[0]$-$B[4]$ and three parity bits $B[5]$-$B[7]$. Since $B[5]$ and $B[6]$ can each be represented independently as a linear combination of the five information bits, $2^{10} = 1024$ is the total number of possible linear codes to be designed. We chose the combination that gives the lowest required SNR for the target GMI of 0.85, through Monte-Carlo simulations in AWGN.
In order to maintain the 4D constant modulus property, the X-polarization ring size is always complementary to the Y-polarization ring size for each code word. This is achieved by setting $B[7]$ to the negation of the parity bit $B[6]$. As a whole, the parity-check equations for 5b4D-2A8PSK consist of the optimized linear expressions for $B[5]$ and $B[6]$ together with $B[7] = \overline{B[6]}$, where $\oplus$ and $\overline{[\cdot]}$ denote the modulo-2 addition and negation, respectively.

6b4D-2A8PSK
In 64SP-2A8PSK (6b4D-2A8PSK), which has a spectral efficiency of 6 bits/symbol, $B[6]$ is the parity bit of a single-parity-check code protecting all the information bits and can be expressed as the exclusive-or (XOR) of all the information bits $B[0]$-$B[5]$. The other parity bit $B[7]$ is the negation of $B[6]$, as in 5b4D-2A8PSK. The optimum code for the target GMI of 0.85 is
$$B[6] = B[0] \oplus B[1] \oplus B[2] \oplus B[3] \oplus B[4] \oplus B[5], \tag{6}$$
$$B[7] = \overline{B[6]}. \tag{7}$$

7b4D-2A8PSK
For the spectral efficiency of 7 bits/symbol, 128SP-2A8PSK (7b4D-2A8PSK) can be constructed simply as follows. In this code, $B[0]$-$B[6]$ are the information bits, while there is only one parity bit, $B[7]$. In order to realize a 4D constant modulus format, just like 5b4D- and 6b4D-2A8PSK, we can express the single parity bit $B[7]$ as:
$$B[7] = \overline{B[6]}. \tag{8}$$
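The parity rules for 6b4D- and 7b4D-2A8PSK described above (single parity check for $B[6]$ in the 6-bit case, and $B[7]$ as the negation of $B[6]$ in both cases) can be sketched as follows. The function names are hypothetical, and the 5b4D code is omitted because its optimized expressions for $B[5]$ and $B[6]$ are not given explicitly here.

```python
from itertools import product

def encode_6b(info):
    """Map 6 information bits to the 8 modulation bits B[0..7]."""
    assert len(info) == 6
    b6 = 0
    for bit in info:
        b6 ^= bit                  # B[6] = XOR of all information bits
    return (*info, b6, 1 - b6)     # B[7] = NOT B[6]

def encode_7b(info):
    """Map 7 information bits to the 8 modulation bits B[0..7]."""
    assert len(info) == 7
    return (*info, 1 - info[6])    # B[7] = NOT B[6]

codebook_6b = {encode_6b(i) for i in product((0, 1), repeat=6)}
codebook_7b = {encode_7b(i) for i in product((0, 1), repeat=7)}
```

Both codebooks contain only words with $B[6] \ne B[7]$, so every code word keeps the complementary ring assignment, and the 64 words of the 6-bit code form a subset of the 128 words of the 7-bit code. This is one way to see why the formats can share most of the transmitter and receiver processing.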

Other modulation formats for comparison
In order to evaluate the performance of 5b4D-2A8PSK, we consider three other modulation formats having 5 bits/symbol spectral efficiency, that is, 8PolSK-QPSK [19], 32SP-16QAM [11], and time-domain hybrid (TDH) modulation. 8PolSK-QPSK [19] was briefly explained in Section 3. 32SP-16QAM is a 4D set-partitioned modulation format derived from DP-16QAM. To generate 32 code words, the parity rule shown in [11] is used. TDH modulation using a 1:1 mixture of DP-QPSK and 6b4D-2A8PSK to achieve an average of 5 bits/symbol spectral efficiency is also included.
To compare with 7b4D-2A8PSK, two modulation formats of 7 bits/symbol spectral efficiency are evaluated. 128SP-16QAM is a 4D modulation format based on DP-16QAM, where 128 code words are generated using the parity rule described in [11]. We also included TDH modulation using a 1:1 mixture of 6b4D-2A8PSK and DP-16QAM. Furthermore, we included DP-16QAM; however, to compare at the same data rate, we used a Baud rate of (7/8) × 34 GBd = 29.75 GBd.

Simulation procedure
Nonlinear transmission simulations were conducted over a 2000 km DM link at a rate of 34 GBd per channel to evaluate the effect of the modulation format under high fiber nonlinearity. At the transmitter, pulses were filtered by a root-raised-cosine (RRC) filter with a roll-off factor of 10%. Eleven DWDM channels of the same modulation format were combined with 37.5 GHz spacing without using any optical filtering. The link consists of 25 spans of 80 km nonzero dispersion shifted fiber (NZDSF), in which loss is compensated by Erbium-doped fiber amplifiers (EDFAs). The performance of each modulation format can be quantified by a span loss budget, which is defined as [45]
$$\text{Span Loss Budget} = 58 + P - \mathrm{ROSNR} - 10 \log_{10}(N) - \mathrm{NF}, \tag{9}$$
where $P$ is the launch power per channel expressed in dBm, ROSNR is the required OSNR to achieve the target GMI in dB, $N$ is the number of spans, and NF is the noise figure of the EDFAs in dB.
The parameters for NZDSF were $\gamma = 1.6$ /W/km, $D = 3.9$ ps/nm/km, and $\alpha = 0.2$ dB/km. Other fiber effects such as polarization mode dispersion (PMD) and dispersion slope were not included. At the end of each span, 90% of the chromatic dispersion was compensated by a lumped linear dispersion compensator. Dispersion pre-compensation was applied at the transmitter side using 50% of the residual dispersion of the full link. The rest of the dispersion was compensated just before the receiver.
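The dispersion map implied by these numbers can be worked out with simple arithmetic. The sketch below uses the span length, span count, dispersion coefficient, and compensation fractions stated above; treating compensation as negative accumulated dispersion and the variable names are illustrative assumptions.

```python
SPAN_KM = 80.0            # span length
N_SPANS = 25              # number of spans (2000 km link)
D_PS_NM_KM = 3.9          # NZDSF dispersion coefficient
INLINE_COMP = 0.90        # fraction compensated at the end of each span
PRE_COMP_FRACTION = 0.50  # fraction of the link residual pre-compensated

def dispersion_map():
    """Return the dispersion budget in ps/nm and the accumulated-dispersion trace."""
    per_span = D_PS_NM_KM * SPAN_KM
    residual_per_span = (1.0 - INLINE_COMP) * per_span
    link_residual = N_SPANS * residual_per_span
    pre_comp = -PRE_COMP_FRACTION * link_residual   # applied at the transmitter
    post_comp = -(link_residual + pre_comp)         # applied before the receiver
    acc = pre_comp
    trace = [acc]
    for _ in range(N_SPANS):
        acc += residual_per_span                    # net dispersion after each span
        trace.append(acc)
    return per_span, residual_per_span, link_residual, pre_comp, post_comp, trace

per_span, residual_per_span, link_residual, pre_comp, post_comp, trace = dispersion_map()
```

With these values, each span accumulates 312 ps/nm, of which 31.2 ps/nm remains after inline compensation, so the residual over the full link is 780 ps/nm. Half of it (390 ps/nm) is pre-compensated at the transmitter and the remaining 390 ps/nm is compensated before the receiver, which makes the accumulated dispersion symmetric around zero along the link.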
An ideal homodyne coherent receiver was used, with an RRC filter with a roll-off factor of 10%, followed by sampling at twice the symbol rate. For adaptive equalization, we used a time-domain data-aided least-mean-square equalizer utilizing the transmitted data directly as the training sequences for simplicity. A discussion on a more realistic equalizer will be given in Section 4.9. We did not use carrier phase estimation (CPE) in Sections 4.7 and 4.8.
All the optical noise due to the EDFAs is loaded just before the receiver. The calculated required OSNR at the target GMI is used to obtain the span loss budget as in Eq. (9). We used an EDFA noise figure of 5 dB to calculate the span loss budget.
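The span loss budget is a direct rearrangement of the commonly used OSNR approximation for a chain of identical amplified spans, OSNR ≈ 58 + P - (span loss) - NF - 10 log10(N) in a 0.1 nm reference bandwidth near 1550 nm, solved for the span loss at OSNR = ROSNR. A minimal sketch follows; the function name is hypothetical, and the input values in the example are illustrative rather than results reported in this chapter.

```python
import math

def span_loss_budget_db(launch_power_dbm, rosnr_db, n_spans, nf_db):
    """Span loss budget in dB: 58 + P - ROSNR - 10 log10(N) - NF."""
    return 58.0 + launch_power_dbm - rosnr_db - 10.0 * math.log10(n_spans) - nf_db

# Illustrative inputs: -4 dBm launch power, 15.2 dB required OSNR,
# 25 spans, and a 5 dB EDFA noise figure.
example_budget_db = span_loss_budget_db(-4.0, 15.2, 25, 5.0)
```

For these inputs the budget evaluates to about 19.8 dB. Every 1 dB reduction of the required OSNR at a given launch power increases the budget by 1 dB, which is why the metric captures both the linear sensitivity and the nonlinear tolerance of a format.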

5 bits/symbol modulation formats
Four 5 bits/symbol formats are compared in Figure 5. In this case, we use a ring ratio of $r_1/r_2 = 0.61$, which is optimal for 5b4D-2A8PSK in terms of maximizing the span loss budget. Note that the ring ratio is not a sensitive parameter; in fact, between 0.56 and 0.66, the peak span loss budget changed by only 0.03 dB.
As the launch power increases, the span loss budget for 32SP-16QAM decreases rapidly because of large power variations between time slots. On the other hand, 8PolSK-QPSK [19] requires 0.65 dB higher OSNR in the linear case, while its saturation characteristics are very similar to those of 5b4D-2A8PSK due to its constant power. TDH modulation with a 1:1 mixture of DP-QPSK and 6b4D-2A8PSK has the 4D constant modulus property within each time slot. However, we used an optimized power allocation for TDH modulation (i.e., 6b4D-2A8PSK has 2.7 dB higher power than DP-QPSK), so there is a power variation between time slots, which generates some penalty due to the nonlinearity.
Overall, the maximum span loss budget of 5b4D-2A8PSK is higher than that of the TDH modulation by 0.5 dB, that of 8PolSK-QPSK by 0.9 dB, and that of 32SP-16QAM by 1.8 dB.

Summary for the dispersion managed link results
The peak span loss budget for the DM link is summarized in Figure 8. The circles connected by the dashed lines include DP-QPSK, 5b4D-, 6b4D-, 7b4D-2A8PSK, and DP-16QAM, all at 34 GBd. Squares are taken from TDH modulation formats, and triangles are from other (conventional) modulation formats in Figures 5-7. This shows that the 4D-2A8PSK family fills the gap between DP-QPSK and DP-16QAM almost linearly (in the dB scale), and each one offers a good improvement from the conventional modulation formats at the same spectral efficiency.

5 bits/symbol under dispersion uncompensated link
To evaluate the transmission characteristics of various modulation formats under reduced nonlinearity, representing terrestrial cases, we also simulated a link with 50 spans of 80 km standard single-mode fiber (SSMF) without inline dispersion compensation or dispersion pre-compensation. The SSMF parameters are $\gamma = 1.2$ /W/km, $D = 17$ ps/nm/km, and $\alpha = 0.2$ dB/km. We used the same target GMI of 0.85. The span loss budgets of the four modulation formats for the spectral efficiency of 5 bits/symbol are shown in Figure 9, as an example. The overall differences among the modulation formats are smaller than in the case of the DM-NZDSF link. 5b4D-2A8PSK shows the highest performance, with a peak span loss budget exceeding those of TDH (DP-QPSK and 6b4D-2A8PSK), 8PolSK-QPSK, and 32SP-16QAM by 0.2, 1.0, and 0.8 dB, respectively. TDH and 32SP-16QAM did not suffer as much in the dispersion uncompensated link as they did in the DM case. The reason is the weaker nonlinear distortion in uncompensated SSMF links compared to DM-NZDSF links.

Separated nonlinearity
To better understand why 4D constant modulus modulation performs better, additional simulations were conducted with separated nonlinear components, using the method proposed in [46]. With this method, the nonlinear transmission performance under the individual nonlinear effects of SPM, XPM, and XPolM can be evaluated. Figure 10 shows the simulated Q-factor as a function of OSNR for 6b4D-2A8PSK and DP-Star-8QAM in the DM link, where the simulation parameters are kept the same as in Section 4.7.1.
Here, we use the recently proposed Q-factor definition based on GMI, and not on pre-FEC BER, as follows [44]:
$$Q = 20 \log_{10}\left(\frac{J^{-1}(\mathrm{GMI})}{2}\right), \tag{10}$$
where $J^{-1}(\cdot)$ is the inverse J function, widely used in extrinsic information transfer chart analysis [39]. The inverse J function is well approximated in closed form. The Q-factor defined above based on GMI in Eq. (10) is a generalized extension of the conventional Q-factor based on BER in Eq. (2). With this new Q-factor, the effective SNR needed to achieve the same post-FEC BER performance with SD-FEC systems can be evaluated. Even though both definitions provide identical Q performance in binary-input AWGN channels, the generalized Q-factor can predict SNR gain more accurately than the conventional BER-based Q-factor for BICM systems using SD-FEC coding and/or high-order high-dimensional modulation.
Figure 9. Span loss budget of four 5 bits/symbol modulation formats as a function of launch power for the dispersion unmanaged link [30].
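The GMI-based Q-factor can be evaluated numerically as sketched below. Two points are assumptions rather than statements from this chapter: the conversion $Q = 20\log_{10}(J^{-1}(\mathrm{GMI})/2)$, which follows from requiring agreement with the BER-based Q-factor for binary-input AWGN (where the LLR standard deviation equals twice the linear Q), and the closed-form fit of the J function with constants $H_1 = 0.3073$, $H_2 = 0.8935$, $H_3 = 1.1064$, which is a curve fit widely used in the EXIT-chart literature and may differ from the approximation intended in the text.

```python
import math

# Curve-fit constants of a commonly used J-function approximation (assumption).
H1, H2, H3 = 0.3073, 0.8935, 1.1064

def j_function(sigma):
    """Approximate mutual information for an LLR standard deviation sigma."""
    return (1.0 - 2.0 ** (-H1 * sigma ** (2.0 * H2))) ** H3

def j_inverse(mi):
    """Approximate inverse of j_function for 0 < mi < 1."""
    return (-(1.0 / H1) * math.log2(1.0 - mi ** (1.0 / H3))) ** (1.0 / (2.0 * H2))

def q_factor_db_from_gmi(normalized_gmi):
    """Generalized Q-factor in dB from the normalized GMI."""
    return 20.0 * math.log10(j_inverse(normalized_gmi) / 2.0)

q_at_target = q_factor_db_from_gmi(0.85)
```

For the target normalized GMI of 0.85, this sketch gives a Q-factor of roughly 4.84 dB, which falls inside the 4.77-4.86 dB range quoted earlier for the BER-based Q-factor at the same GMI.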
The curves marked with "AWGN" in Figure 10 indicate the case in which the nonlinear effects are fully ignored, and the curves marked with "SPM," "XPM," and "XPolM" show the cases in which these nonlinear components are individually added. The curve marked with "SPM + XPM + XPolM" shows the situation in which all of these nonlinear effects are taken into account. The launch power is set to -4 dBm, which gives the peak span loss budget for DP-Star-8QAM. At this launch power, an OSNR of 15.2 dB gives a normalized GMI of 0.85. Figure 11 is a re-plot of the simulated Q versus the separated nonlinear effects when the OSNR is 15.2 dB. Q under the linear condition (AWGN) for 6b4D-2A8PSK is higher than that for DP-Star-8QAM by 0.4 dB. The contributions from SPM and XPM in 6b4D-2A8PSK are much smaller than those in DP-Star-8QAM. This confirms that the 4D-2A8PSK family is robust against XPM and SPM nonlinearity. However, the contribution of XPolM is similar in 6b4D-2A8PSK and DP-Star-8QAM. This is because the individual polarization power in 6b4D-2A8PSK fluctuates from symbol to symbol even though the combined power of both polarizations is constant. This is consistent with a report in which another 4D constant modulus modulation format, 8PolSK, shows a significant reduction in SPM and XPM, but not necessarily in XPolM [19].

Adaptive equalizer
To understand the fundamental benefit of the 4D-2A8PSK family, we used an idealized data-directed least-mean-square equalizer up to this point. In this section, we address the performance impact when more realistic equalizers [47,48] are used, taking practical implementations into account.
We first consider a conventional radius-directed equalizer (RDE) [48] for 6b4D-2A8PSK, where the decision on the ring radii is performed for each polarization separately. In this case, we observe 0.12 and 0.10 dB degradation in the span loss budget, in comparison to the idealized least-mean-square equalizer, at launch powers of -10 and -4 dBm, respectively.
We then take advantage of the 4D constant modulus property by using the relative power of the two polarizations for a soft decision on the ring radii. For the soft decision information, we use a heuristic sigmoid function $S(x) = 1/\left(1 + e^{-x/a}\right)$, where $a$ is a softness parameter and $x$ is the relative power of the two polarizations. In this manner, we can recover 0.07 dB of the degradation of the conventional RDE. The overall degradation due to the realistic adaptive equalizer compared to the ideal one is no worse than 0.05 dB.
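The soft ring decision can be sketched as follows. Only the sigmoid $S(x) = 1/(1 + e^{-x/a})$ comes from the text; defining the relative power as the difference between the X- and Y-polarization powers, blending the two ring radii with the sigmoid output, and the default softness value are illustrative assumptions, and the function names are hypothetical.

```python
import math

def soft_ring_decision(power_x, power_y, softness):
    """S(x) = 1 / (1 + exp(-x / a)), with x = power_x - power_y (assumption)."""
    z = (power_x - power_y) / softness
    if z >= 0.0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)             # avoids overflow for large negative z
    return ez / (1.0 + ez)

def soft_target_radii(power_x, power_y, r1, r2, softness=0.1):
    """Blend the inner (r1) and outer (r2) radii into soft RDE targets."""
    s = soft_ring_decision(power_x, power_y, softness)
    target_x = s * r2 + (1.0 - s) * r1   # X is likely on the outer ring if s -> 1
    target_y = s * r1 + (1.0 - s) * r2   # Y then takes the complementary ring
    return target_x, target_y
```

Because the two rings are complementary, a single comparison of the two polarization powers decides both radii jointly. When the powers are nearly equal, the sigmoid returns a value close to 0.5 and the targets move toward the mean radius, which limits the impact of a wrong hard decision on the equalizer update.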

LLR computation
For SD-FEC, it is necessary to calculate the log-likelihood ratio (LLR) with moderate circuit complexity. A fast-decoding algorithm and LLR computation for high-order set-partitioned 4D-QAM formats [49] has been extended to 6b4D-2A8PSK using two lookup tables [44]. The schematic of the soft-demapping circuit is shown in Figure 12. The circuit also exploits the asymmetry between the radial and the axial LLR, and offline processing of the experimental data showed only a small power penalty [44]. It also uses an LLR calculation method that is robust against residual phase noise [50].

Experiment
We have also conducted a transmission experiment comparing 6b4D-2A8PSK and DP-Star-8QAM [44]. The signals were either 6b4D-2A8PSK or DP-Star-8QAM modulated at 32 GBd and filtered with a root-raised-cosine filter with a roll-off factor of 0.15. Seventy channels were spaced at 50 GHz. The transmission line was 1260 km long, with an average span length of 70 km. Chromatic dispersion (CD) was managed inline by a mixture of nonzero dispersion shifted fiber (NZDSF), having a negative local CD of -3 ps/nm, and standard single-mode fiber (SSMF). At the receiver side, the signal captured by 64 GS/s analog-to-digital converters (ADCs) was processed offline. The processing included CD compensation, adaptive equalization with the constant modulus algorithm for initial convergence and radius-directed equalization afterward, carrier recovery (CR) with a multipilot algorithm [47] having a window size of 63, pilot-aided phase-slip recovery, and the proposed soft-demapping as described in Section 4.9.
The overall performance gain of 0.5 dB was still significant in the highly nonlinear transmissions. Figure 14(b) shows the required OSNR, which was calculated by loading noise at the receiver DSP to emulate an OSNR decrease. The target normalized GMI was set to 0.92, which was close to the 20.5% SD-FEC limit [53]. The proposed soft-demapping worked even at such low OSNR conditions, and 4D-2A8PSK outperformed DP-Star-8QAM as the launch power increased.

Time domain hybrid modulation
TDH modulation has been studied considerably to cover a wide range of channel conditions, due to its flexibility in choosing a nearly arbitrary spectral efficiency [12,51,52]. As the constituent modulation formats, we use DP-QPSK (4 bits/symbol) and DP-16QAM (8 bits/symbol) in conjunction with 5b4D-, 6b4D-, and 7b4D-2A8PSK to widen the range of TDH [54]. For comparison, we also use TDH modulation based on conventional modulation formats, that is, DP-QPSK, 32SP-QAM, DP-Star-8QAM (S8QAM), 128SP-QAM, and DP-16QAM. The benefit of the 4D-2A8PSK family is the 4D constant modulus property. In other words, there is no compromise in choosing the power ratio (the ratio between the two modulation formats). On the other hand, conventional formats experience power fluctuations, causing a compromise in the power ratio [29,54].
Figure 13. Experimental setup [44].
Figure 14. Experimental result of (a) Q from GMI and (b) required OSNR for two types of LLR calculation: Ideal (dotted line) and the proposed in Figure 12 (solid line) [44].
We simulated the transmission performance over the same link condition as described in Section 4.7. For the 5b4D-, 6b4D-, and 7b4D-2A8PSK formats, we chose ring ratios of 0.60, 0.65, and 0.59, respectively, for the best nonlinear performance. For all the TDH modulation, we use a 1:1 ratio with alternating formats; however, in actual systems, any arbitrary ratio can be used. The important parameter for TDH is the power ratio, that is, how much power is allocated to each time slot. We optimize the power ratio for the best nonlinear performance. Figure 15 shows the calculated span loss budget for 4-6 bits/symbol modulation formats, including the 2A8PSK-based and the conventional TDH modulation. DP-QPSK and 6b4D-2A8PSK data are also included as a reference. Figure 16 shows the span loss budget for 6.5-8 bits/symbol modulation formats. From these figures, we can see that the TDH modulation based on 2A8PSK has much better nonlinear performance than that based on the conventional modulation formats, due to the constant modulus property.
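The average spectral efficiency of a TDH format and the per-slot powers implied by a given power ratio follow from simple arithmetic. The sketch below normalizes the average power over the slot pattern to one, which is an assumption, and uses hypothetical function names; the 2.7 dB example corresponds to the power allocation quoted for the 1:1 mixture of DP-QPSK and 6b4D-2A8PSK.

```python
import math

def tdh_spectral_efficiency(bits_a, bits_b, slots_a=1, slots_b=1):
    """Average bits per 4D symbol for a slots_a:slots_b mixture of two formats."""
    return (bits_a * slots_a + bits_b * slots_b) / (slots_a + slots_b)

def tdh_slot_powers(power_ratio_db, slots_a=1, slots_b=1):
    """Linear per-slot powers (format A, format B) with unit average power.

    Format B carries power_ratio_db more power per slot than format A.
    """
    ratio = 10.0 ** (power_ratio_db / 10.0)
    p_a = (slots_a + slots_b) / (slots_a + slots_b * ratio)
    return p_a, p_a * ratio

# 1:1 mixture of DP-QPSK (4 bits) and 6b4D-2A8PSK (6 bits), 2.7 dB power ratio.
se_5b = tdh_spectral_efficiency(4, 6)
p_qpsk, p_6b = tdh_slot_powers(2.7)

# 1:1 mixtures of neighbouring formats between 4 and 8 bits/symbol.
half_steps = [tdh_spectral_efficiency(n, n + 1) for n in range(4, 8)]
```

With a 2.7 dB power ratio, the two slot powers are about 0.70 and 1.30 times the average, so the slot-to-slot power variation remains even though each constituent format is constant modulus. Alternating neighbouring formats in a 1:1 pattern yields 4.5, 5.5, 6.5, and 7.5 bits/symbol, and other slot ratios give finer steps.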

3.5 bits/symbol modulation format
A Grassmann code [4,55] is known to be robust against state of polarization (SOP) rotation, including cross polarization modulation (XPolM), as described in Section 2. We investigated a Grassmann code-based 7-bit 8D code [56], whose schematic is shown in Figure 18. 2-ary amplitude QPSK (2AQPSK) and 2A8PSK are used for the first and the second time slots, respectively. Let $x_1$, $y_1$ and $x_2$, $y_2$ be the x- and y-polarization components of the first and second time slots, respectively, and let $b_0$-$b_6$ be the information bits. In a manner similar to that described in Section 4.7, for the 2AQPSK part, $b_0$ and $b_1$ are used for the angle of $x_1$, and $b_2$ and $b_3$ are used for the angle of $y_1$. For $x_2$, $b_4$-$b_6$ are used for the angle representation of 2A8PSK. All use Gray coding for the angle. The radius of $x_1$ is expressed as $\mathrm{XOR}(b_4, b_5, b_6)$, where "0" means the larger radius and "1" means the smaller radius. The radius of $y_1$ is expressed as the negation $\overline{\mathrm{XOR}(b_4, b_5, b_6)}$, so that it is complementary to that of $x_1$. The ratio of the radii is called the ring ratio; both 2AQPSK and 2A8PSK share the same ring ratio of 0.70, which was optimized for the nonlinear performance. The radius of $x_2$ is expressed as $\mathrm{XOR}(b_0, b_1, \ldots, b_6)$. Finally, $y_2$ is calculated from the Grassmannian condition $x_1 y_1^* + x_2 y_2^* = 0$. This guarantees the 4D constant modulus condition for both time slots.
Figure 17. Span loss budget for TDH modulation based on the 4D-2A8PSK formats, and that on the conventional modulation formats [54].
We compared 7b8D-2A8PSK (3.5 bits/symbol), PS-QPSK (3 bits/symbol), and DP-QPSK (4 bits/symbol) at the same data rate. We also chose the channel spacing to be 1.15 times the Baud rate. The simulation procedures and parameters are nearly identical to those described in Section 4.7.1, except that we used nine channels. The simulated results are shown in Figure 19. The 7b8D-Grassmann format exhibits almost the same span loss budget as PS-QPSK, which has a higher Baud rate and a broader spectrum.
On the other hand, the 7b8D-Grassmann format shows a much larger span loss budget than DP-QPSK, although the latter has a narrower spectrum. Therefore, depending on the application, the 7b8D-Grassmann format may be an alternative to PS-QPSK and DP-QPSK.
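The Grassmannian condition $x_1 y_1^* + x_2 y_2^* = 0$ can be solved for the last component as $y_2 = -x_1^* y_1 / x_2^*$, so that $|y_2| = |x_1||y_1|/|x_2|$. If $x_1$ and $y_1$ lie on complementary rings, $|y_2|$ is then automatically the ring not used by $x_2$. The following minimal sketch checks this for all 128 code words. The Gray labelling, the phase offsets, the power normalization, and the use of complementary radii for $x_1$ and $y_1$ are illustrative assumptions, and the function names are hypothetical.

```python
import cmath
import math
from itertools import product

RING_RATIO = 0.70
R_LARGE = math.sqrt(2.0 / (1.0 + RING_RATIO ** 2))  # normalization: r1^2 + r2^2 = 2
R_SMALL = RING_RATIO * R_LARGE

def radius(bit):
    """'0' selects the larger radius and '1' the smaller radius."""
    return R_SMALL if bit else R_LARGE

def gray_index(bits):
    """Convert Gray-coded bits (MSB first) to a phase index."""
    idx, acc = 0, 0
    for b in bits:
        acc ^= b
        idx = (idx << 1) | acc
    return idx

def map_7b8d(b):
    """Map 7 information bits to (x1, y1, x2, y2)."""
    p = b[4] ^ b[5] ^ b[6]
    x1 = radius(p) * cmath.exp(1j * (math.pi / 4 + math.pi / 2 * gray_index(b[0:2])))
    y1 = radius(1 - p) * cmath.exp(1j * (math.pi / 4 + math.pi / 2 * gray_index(b[2:4])))
    q = 0
    for bit in b:
        q ^= bit                                    # XOR of all information bits
    x2 = radius(q) * cmath.exp(1j * math.pi / 4 * gray_index(b[4:7]))
    y2 = -x1.conjugate() * y1 / x2.conjugate()      # Grassmannian condition
    return x1, y1, x2, y2

codewords = [map_7b8d(b) for b in product((0, 1), repeat=7)]
```

In this sketch both time slots carry the same total power for every code word, and the Jones vectors of the two slots are orthogonal, which is the property that links the format to the cross-polarized slot pairs of Section 2.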

Conclusion
We reviewed nonlinearity-tolerant modulation formats, including the recently proposed 5, 6, and 7 bits/symbol 4D modulation format family based on 2A8PSK. A series of transmission simulation results show that this 2A8PSK family has better nonlinear performance than the conventional modulation formats at each corresponding spectral efficiency, especially for dispersion-managed links, which are known to have high fiber nonlinearity. It was also determined that the primary benefits of the 4D constant modulus property come from the reduced effects of SPM and XPM. Since the modulation formats in the 4D-2A8PSK family differ only in the parity bits, they can be realized with very similar hardware over the different spectral efficiencies between DP-QPSK and DP-16QAM. Furthermore, this modulation format family can serve as the components of time-domain hybrid modulation, where almost arbitrary spectral efficiency can be realized between 4 and 8 bits/symbol when combined with DP-QPSK and DP-16QAM.
Figure 19. Span loss budget of three modulation formats for the same data rate, as a function of launch power for the target normalized GMI = 0.85.