Parasitic Capacitances on Scaling Lateral Nanowire

The gate-all-around silicon nanowire transistor (GAA-NW) has emerged as one of the most promising candidates for advanced-node integrated circuits (ICs). Because the GAA transistor offers stronger gate control, better scalability and improved transport properties, it is considered a potential alternative for scaling beyond FinFET. Recent publications have widely explored the basic features and scalability of nanowires, focusing primarily on intrinsic device characteristics. Although the GAA-NW has superior gate control compared to other architectures, the device is surrounded by a large vertical gate metal line and S/D contact metal lines, and these metal lines form a strong parasitic capacitance. When scaling to sub-7 nm node dimensions, these capacitances strongly influence overall device performance. This chapter discusses the effects of the various parasitic capacitances on scaling the device dimensions and on performance at high-frequency operation. A TCAD-based compact model was used to study the impact of scaling the GAA-NW's dimensions from a power, performance and area (PPA) perspective.


Introduction
Since the beginning of solid-state technology, the continuous reduction of transistor size has delivered smaller and faster electronics in every new generation. Over the last few decades, Moore's law [1] has been sustained by several prime movers, such as mobility boosted by strained silicon [2], gate leakage reduced by the high-k/metal gate [3] and, most importantly, the replacement of the planar MOSFET by the FinFET to achieve better leakage control [4]. Since then, the major manufacturers have adopted FinFET technology for their 16/14 nm and 10 nm nodes [5][6][7]. However, when scaling below the 10 nm node, short channel effects (SCE) such as subthreshold leakage again rise significantly and become a major concern for the FinFET architecture. According to the International Technology Roadmap for Semiconductors (ITRS 2015), the contacted gate pitch (CGP) of a 7 nm node transistor would be ~42 nm, leaving a gate length of less than 15 nm [8]. Although the FinFET gate wraps around the channel, much stronger gate control is required at these short dimensions. Therefore, the gate-all-around (GAA) architecture has emerged as an alternative to FinFET [9]. The gate in a nanowire (NW) fully surrounds the silicon channel, providing stronger control and thereby suppressing unwanted leakage. However, the reduced effective width (W_eff) of a NW significantly lowers its current-driving capacity. The drive current can be increased by stacking multiple wires per fin, but a taller fin increases the parasitic capacitances, which may limit the benefit of scaling [10]. Although numerous studies have analyzed the intrinsic and parasitic capacitances [11][12][13][14], an extensive analysis is still required to model the major parasitic components of the GAA-NW.
Thus, this chapter presents a capacitance model of a GAA-NW transistor and evaluates its overall scaling performance in ring oscillator circuits.

GAA-NW device
To continue Moore's law, transistor sizes are scaled down to the 7 nm node (N7) and 5 nm node (N5) specifications [15,16]. Contacted gate pitches (CGP) of 42 and 32 nm were used for the N7 and N5 devices, respectively. Gate lengths (L_g) of 14 and 10 nm were considered for the N7 and N5 specifications, together with wire diameters of 7 nm (D7) and 5 nm (D5) [10]. The channel material was Si for the n-channel GAA-NW and Si50Ge50 for the p-channel GAA-NW. Epitaxially shaped source (S) and drain (D) regions were used as contacts. The S/D regions were doped to an active concentration of 3 × 10^20 /cm^3 for both n-channel and p-channel devices, with a specific contact resistivity of 5 × 10^-9 Ω·cm^2. The gate dielectric consisted of a 0.5 nm oxide layer and a 1.5 nm high-k (HfO2) layer. A 5 nm thick gate spacer (relative permittivity, ε_r = 4.4) was applied to both the gate-to-source and gate-to-drain regions. A midpoint work function value for the metal gate was used in both devices during the initial simulation. With a fixed NW pitch (NWP) of 14 nm and a 10 nm offset, the two-stacked NW device has a fin height of 31 nm for D7 and 29 nm for D5. Other specifications and setup for each device were taken from [10,15]. The simulated compact NW is shown in Figure 1(a), and a cross-sectional view showing the internal details is shown in Figure 1(b).
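The fin heights quoted in this chapter are consistent with a simple stacking rule: fin height = (number of wires − 1) × pitch + wire diameter + offset. This rule is inferred from the quoted numbers and is not stated explicitly in the text, so the sketch below should be read as an assumption; the function name and its arguments are hypothetical.

```python
def fin_height_nm(n_wires, diameter_nm, pitch_nm, offset_nm=10.0):
    """Fin height of a vertical NW stack (assumed geometry, inferred from the quoted values).

    n_wires     : number of stacked nanowires per fin
    diameter_nm : wire diameter (7 for D7, 5 for D5)
    pitch_nm    : centre-to-centre vertical pitch between neighbouring wires
    offset_nm   : fixed extra height (the 10 nm offset quoted in the text)
    """
    if n_wires < 1:
        raise ValueError("a fin needs at least one wire")
    return (n_wires - 1) * pitch_nm + diameter_nm + offset_nm


print(fin_height_nm(2, 7, 14))  # 31.0  (two-stacked D7)
print(fin_height_nm(2, 5, 14))  # 29.0  (two-stacked D5)
print(fin_height_nm(3, 7, 14))  # 45.0  (three-stacked D7, quoted later in the chapter)
```

Under this assumed rule, each added wire raises the fin by one pitch, which is why a tighter pitch is the main lever for limiting fin height in taller stacks.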

Simulation methodology
To estimate various parasitic capacitances in a GAA-NW and their impacts on overall scaling performances at a higher frequency, a TCAD-based compact model study was performed. Figure 2 shows a flow diagram used to analyze the DC and AC performances of the scaled GAA-NW transistor.
A two-stacked GAA-NW transistor was implemented using the Sprocess module of the Sentaurus TCAD tool [17]. Starting from the continuous channel of a FinFET, selected regions were etched away to form a round NW channel [18]. The FinFET with a 7 nm fin width was thus transformed into a GAA-NW with a 7 nm diameter. The diameter and the gate length of the NW were then scaled down further to 5 and 10 nm, respectively [10]. All device simulation parameters were taken from the default 7 nm FinFET model [17].

Subthreshold current estimation
Electrical performance was estimated separately for the off-state and on-state conditions. The Sdevice [19] simulation used Shockley-Read-Hall (SRH), Auger and band-to-band tunneling (BTBT) recombination, bandgap narrowing, anisotropic density gradient, interface charge, a mobility model with multivalley correction, thin inversion layer correction with high-k dielectric and quantum correction for the inversion layer to obtain properties such as the subthreshold slope (SS) and drain-induced barrier lowering (DIBL). These intrinsic properties (SS and DIBL) for different gate lengths, wire diameters and vertical pitches were then fitted into the BSIM-CMG model [20]. The fitted model thus provided the subthreshold characteristics of a NW for circuit-level simulations.
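As an illustration of the two figures of merit extracted here, the sketch below computes SS and DIBL from a few I-V points using their usual textbook definitions. The function names and all numerical values are hypothetical and are not taken from the simulations in this chapter.

```python
import math


def subthreshold_slope_mv_per_dec(vg1, id1, vg2, id2):
    """SS in mV/decade from two points (V_g, I_d) on the subthreshold branch."""
    decades = math.log10(id2 / id1)  # number of current decades between the points
    return 1000.0 * (vg2 - vg1) / decades


def dibl_mv_per_v(vth_lin, vth_sat, vd_lin, vd_sat):
    """DIBL in mV/V: threshold-voltage reduction per volt of added drain bias."""
    return 1000.0 * (vth_lin - vth_sat) / (vd_sat - vd_lin)


# Hypothetical data: one decade of current over 70 mV of gate voltage,
# and a 20 mV threshold shift between V_d = 0.05 V and V_d = 0.65 V.
ss = subthreshold_slope_mv_per_dec(0.10, 1e-10, 0.17, 1e-9)
dibl = dibl_mv_per_v(0.30, 0.28, 0.05, 0.65)
print(f"SS = {ss:.1f} mV/decade, DIBL = {dibl:.1f} mV/V")
```

In practice SS is usually taken as the minimum slope over the whole subthreshold region rather than from two points; the two-point form is used here only to keep the example short.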

The drive current estimation
To estimate the on-state drive current, ballistic flow was considered along with drift-diffusion currents. Because the channel length in the N7 and N5 devices is scaled down to the range of the carrier's mean free path [21], the total drive current in the NW was considered to be quasi-ballistic [21]. The scattering-free ideal ballistic current is defined by Eq. (1), where q is the electronic charge, V_inj is the carrier's injection velocity and N_inv is the number of inversion charges. Sband [22] simulation provided these pure ballistic current characteristics for the different wire diameters and applied voltages. The quasi-ballistic current is represented by Eq. (2): it is the product of the pure ballistic current and the ballistic ratio (BR), which is the ratio of the actual saturation current to the ideal ballistic current.
I_Bal = q × N_inv × V_inj (1)
I_quasi-ball = BR × I_Bal (2)
The ballistic current is independent of gate length, but the BR depends strongly on the gate length, wire diameter and channel stress [21]. The BR for the D7 NW was assumed to be similar to that of a FinFET with a 7 nm fin width, although the NW might have a lower BR than the FinFET because BR depends strongly on the body configuration of the device; this possible small error in BR was further screened by electrostatics and access resistance [10]. The BR for the D5 device (Figure 3) was extrapolated, as in [10], because it is expected to be significantly lower than that of the D7 device. The variations of BR with applied channel stress and channel length reduction are reported in [23][24][25][26].
Next, for both the D5 and D7 devices, the Sband-simulated carrier injection velocity (V_inj) and number of inversion charges (N_inv) were multiplied by the electronic charge (q) to obtain I_Bal [10].
The variation of the pure ballistic current (I_Bal) with gate voltage is plotted in Figure 5(a), and the I_quasi-ball obtained after multiplying by BR is shown in Figure 5(b). These quasi-ballistic characteristics were then fitted into the BSIM-CMG model by fitting the mobility and carrier velocity equations, as in [10]. The intrinsic capacitances, interface traps and other device properties of each NW were also fitted into the BSIM-CMG model.
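The two-step current estimate described above (ballistic current from q, N_inv and V_inj, then scaling by BR) can be sketched as follows. The sketch assumes that N_inv is an inversion charge density per unit channel length, so that the product is a current; the chapter does not give its units. The function names and the numerical values are hypothetical.

```python
Q_ELECTRON = 1.602176634e-19  # elementary charge in coulombs


def ballistic_current(n_inv_per_cm, v_inj_cm_per_s):
    """Ideal ballistic current in amperes: q * N_inv * V_inj.

    n_inv_per_cm   : inversion carriers per cm of channel (assumed line density)
    v_inj_cm_per_s : carrier injection velocity in cm/s
    """
    return Q_ELECTRON * n_inv_per_cm * v_inj_cm_per_s


def quasi_ballistic_current(i_bal, ballistic_ratio):
    """Quasi-ballistic current: the ballistic current scaled by the ballistic ratio."""
    if not 0.0 <= ballistic_ratio <= 1.0:
        raise ValueError("ballistic ratio must lie between 0 and 1")
    return ballistic_ratio * i_bal


# Hypothetical inputs, not values from the Sband simulations.
i_bal = ballistic_current(1e7, 1.2e7)
i_qb = quasi_ballistic_current(i_bal, 0.6)
print(f"I_Bal = {i_bal * 1e6:.2f} uA, I_quasi-ball = {i_qb * 1e6:.2f} uA")
```

Because BR is at most one by definition, the quasi-ballistic current can never exceed the ballistic limit, which the range check makes explicit.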
The final drive current was obtained after including the front-end-of-line/middle-of-line (FEOL/MOL) R and C parasitics in the BSIM-CMG model file [20]. SPICE simulation was then performed with this fitted model to obtain the high-frequency properties of a 15-stage ring oscillator (RO) inverter circuit [27]. The ring oscillator setup is discussed in Section 6. All simulations were performed at a fixed saturation drive voltage of 0.65 V and a targeted off-state current (I_off) of 3.5 nA per fin.

Parasitic capacitance estimation
In this section, the various parasitic capacitances associated with the metal gate of the NW transistors are discussed. We start with a basic fin channel architecture, proceed to the formation of a vertical nanosheet and finally arrive at a GAA-NW with a higher vertical pitch. Vertical stacking of GAA-NWs is essential for higher drive currents, but the area of the wrapping metal lines also grows as the fin height increases, so stronger parasitic capacitances are expected.

Vertical pitch variation (fin to NW)
The transition from a fin channel to a GAA channel enables much stronger gate control. The bottom region of the channel fin was released and covered with the gate dielectric layers, as shown in Figure 6(I), turning the channel into a gate-all-around vertical nanosheet (NS). Next, the single nanosheet was gradually transformed into two isolated NWs. Subsequently, the vertical pitch between the two NWs was increased from 7 to 19 nm (Figure 6(III)-(IX)). The transition from a single NS to multiple NWs improves the subthreshold slope (SS) because of the stronger control in the all-around architecture. The variations of SS for all these structures are plotted in Figure 7; see also Figure 7(b). It can thus be concluded that, up to a certain fin height, NW stacking with a higher vertical pitch improves SS compared to a continuous fin channel; however, the SS benefit may be limited by the reduced active channel area and the increased parasitic capacitances.

The capacitance model
To analyze the overall parasitic capacitances as the fin height increases, a schematic model depicting the gate and source contact lines of a NW is shown in Figure 8. This simplified model represents half of the NW architecture and is used to calculate the major capacitances associated with the gate-to-source sidewall only. The gate-to-drain sidewall capacitances can be calculated similarly by mirroring the model onto the gate and drain contact sidewalls. The model comprises four major capacitances: Cp1, Cp2, Cp3 and Cp4. Cp1 represents the wire fringe capacitance between the gate sidewall and all the nanowire surfaces. The spacer between the source contact and the gate contact line was taken to be 5 nm thick (T_spacer); this narrow spacing forms a strong parallel plate capacitance between gate and source. Cp2 therefore represents the major parasitic capacitance between the source sidewall and the gate sidewall, and it is expected to depend strongly on the NW's vertical pitch. Cp3 is the fringe capacitance between the top gate surface and the source contact sidewall, and Cp4 is the overlap (around 2 nm) capacitance between the nanowire and the gate metal line. As seen previously, multiple stacking with higher vertical pitches is essential to achieve better electrostatic control together with higher drive performance. However, both of these requirements increase the fin height and hence Cp2. These two conflicting requirements need to be carefully balanced with respect to both process variation and electrical performance. Among these capacitances, Cp2 contributes most to the overall device performance; thus, a detailed calculation methodology for Cp2 is presented here.

Parallel plate capacitance (Cp2)
The values of Cp2 were calculated with the conventional parallel plate capacitance formula (εA/d), where d and ε are the spacer width and permittivity, respectively. The active area (A), however, was calculated using a dedicated construction. A is the average of AREAsource and AREAgate, where AREAsource is the cross-marked source sidewall region after excluding the wire regions, as shown in Figure 9, and AREAgate is the cross-marked gate sidewall region after excluding the wire regions and drawing a new perimeter set by the spacer width (T_spacer = 5 nm). This mainly excludes the contribution of the continuous electric flux shared by the two parallel plates through the connected NWs. The calculation was carried out for two wire diameters (7 nm = D7 and 5 nm = D5) at different vertical pitches. The values of all the capacitances are calculated and presented in the next section.
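A minimal sketch of this Cp2 calculation is given below, assuming the two sidewall areas have already been extracted from the layout. The spacer width (5 nm) and relative permittivity (4.4) are the values quoted for the gate spacer earlier in the chapter; the function name and the example areas are hypothetical.

```python
EPS0 = 8.8541878128e-12  # vacuum permittivity in F/m


def cp2_farads(area_source_nm2, area_gate_nm2, t_spacer_nm=5.0, eps_r=4.4):
    """Parallel-plate estimate of Cp2 = eps * A / d.

    A is the average of the source-sidewall and gate-sidewall areas
    (both with the wire regions already excluded), and d is the spacer width.
    """
    area_m2 = 0.5 * (area_source_nm2 + area_gate_nm2) * 1e-18  # nm^2 -> m^2
    d_m = t_spacer_nm * 1e-9                                   # nm -> m
    return EPS0 * eps_r * area_m2 / d_m


# Hypothetical sidewall areas in nm^2, for illustration only.
c = cp2_farads(600.0, 500.0)
print(f"Cp2 = {c * 1e18:.2f} aF")  # Cp2 = 4.29 aF
```

Since the estimate is linear in A, any increase in sidewall area caused by a taller fin translates directly into a proportional increase in Cp2.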

The capacitance values
The comprehensive model formulas presented in [11,12] are used to calculate all the parasitic capacitances. Taking the initial values from [12], the capacitances Cp1, Cp2, Cp3 and Cp4 are plotted for all the architectures presented in Figure 6. Among these capacitances, Cp2 is the most significant parasitic and depends strongly on fin height (vertical pitch). Cp2 for the structure in Figure 6(IX) is 22% higher than for the structure in Figure 6(I).

Ring oscillator simulation
To benchmark the power, performance and area gain of the 7 nm diameter (D7) and 5 nm diameter (D5) GAA-NWs, ring oscillator-level circuit simulations were performed [10]. Using an inverter-based ring oscillator (RO), the RC delay, active power consumption and leakage power loss were calculated for each device. The RO with 15 inverter stages was simulated using the SPICE simulation setup [27]. Every inverter stage had a fan-out of three and was loaded with an optimum back-end-of-line (BEOL) load [28]. The BEOL load in each inverter stage was the median interconnect wire length for the critical path (50 CGP long), including all parasitics [28]. A minimum-delay optimization technique for the critical path was then used to optimize the benefits of scaling the D5 NW with various device configurations, such as tighter vertical pitches and reduced gate lengths. The schematic of the RO chain setup is shown in Figure 12(a).
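For orientation, the relation between per-stage delay and oscillation frequency in a ring oscillator can be sketched with the textbook expression f = 1/(2·N·t_d), where N is the (odd) number of stages and t_d is the average stage delay. The chapter does not state which expression was used in the SPICE flow, so this relation, the function name and the example delay are assumptions.

```python
def ring_oscillator_frequency_hz(stage_delay_s, n_stages=15):
    """Oscillation frequency of an N-stage inverter ring: f = 1 / (2 * N * t_d).

    A full period needs the signal to traverse the ring twice
    (one rising and one falling transition per stage).
    """
    if n_stages < 3 or n_stages % 2 == 0:
        raise ValueError("a ring oscillator needs an odd number of stages >= 3")
    if stage_delay_s <= 0:
        raise ValueError("stage delay must be positive")
    return 1.0 / (2.0 * n_stages * stage_delay_s)


# Hypothetical 1 ps average stage delay for the 15-stage ring.
f = ring_oscillator_frequency_hz(1e-12)
print(f"f = {f / 1e9:.2f} GHz")  # f = 33.33 GHz
```

Any extra load capacitance per stage lengthens t_d and therefore lowers f, which is how the parasitics discussed in this chapter enter the speed benchmark.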

Figure 12(b) shows two-stacked D7 and D5 NW devices. Moving from D7 to D5 NW provides a 26% reduction in W_eff as well as a 2 nm reduction in fin height (at an equal vertical pitch of 14 nm). Reducing the wire diameter from D7 to D5 improves SS and DIBL by only 2 mV/decade and 4 mV/V, respectively [10]. However, the reduction in W_eff causes a significant reduction in overall drive current (~35%) [10]. This loss of drive current might be recovered by reducing the NW's gate length from 14 to 10 nm and by stacking more NWs per fin at a reduced vertical pitch. On the other hand, stacking multiple NWs increases the parasitic capacitances along with the fin height. This section discusses the benefit of scaling for power and speed performance for both the 7 nm node (D7/D5 @N7) and the 5 nm node (D5@N5) GAA-NW.

Power-delay optimization at 7 nm node (N7) dimensions
For all the 7 nm node (N7) specifications, a 42 nm CGP and a 14 nm gate length were considered [10]. The dynamic power consumption of the oscillator circuit was calculated using Eq. (3), where C_eff is the effective load capacitance, f is the operating frequency and V_dd is the drive voltage. The RC delay of both the D7 and D5 devices was calculated based on the values obtained in [10].
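The dynamic power calculation can be illustrated with the standard switching-power expression P = C_eff · V_dd² · f, which uses the same three quantities that the text names for Eq. (3). Whether Eq. (3) includes an additional activity factor is not recoverable from the text, so the form below is an assumption, as are the function names and the example load capacitance and frequency; 0.65 V is the drive voltage quoted earlier in the chapter.

```python
def dynamic_power_w(c_eff_f, v_dd_v, freq_hz):
    """Switching power in watts, assuming P = C_eff * V_dd^2 * f."""
    return c_eff_f * v_dd_v ** 2 * freq_hz


def energy_per_cycle_j(c_eff_f, v_dd_v):
    """Energy drawn from the supply per switching cycle: C_eff * V_dd^2."""
    return c_eff_f * v_dd_v ** 2


# Hypothetical 1 fF effective load switched at 40 GHz with V_dd = 0.65 V.
p = dynamic_power_w(1e-15, 0.65, 40e9)
e = energy_per_cycle_j(1e-15, 0.65)
print(f"P = {p * 1e6:.2f} uW, E = {e * 1e15:.4f} fJ per cycle")
```

The quadratic dependence on V_dd is the reason a device that needs a higher drive voltage to reach the same frequency pays a disproportionate energy penalty.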
The change in ring oscillator performance (RC delay and power loss) as V_dd varies from 0.4 to 0.9 V is displayed in Figure 13. Figure 13(a) presents the optimum delay versus drive voltage, and Figure 13(b) shows the active power loss for both the D7 and D5 devices. A rise in delay of nearly 20% was observed for the D5 device compared to the D7 device, largely due to the reduced drive current. Because of this reduced drive current (active area), the D5 device consumes less active power than the D7 device at the same applied voltage. However, the actual benefit becomes visible when the power-delay product (energy) and the leakage power loss of both devices are plotted at a targeted frequency. Figure 14(a) shows the change in energy consumption with frequency for both devices. At the same operating frequency, a significant rise in energy consumption was observed for the D5 device in comparison to the D7 device. Achieving an equal drive current in the D5 device always requires a higher drive voltage than in the D7 device, which consumes more energy and degrades overall performance. In addition, at the same operating frequency, the D5 device consumes more leakage power than the D7 device (Figure 14(b)).

Power-delay optimization at 5 nm node (N5) dimensions
Although the electrostatics improved marginally when the nanowire's diameter was reduced, the overall performance degraded at the same gate length. Therefore, to understand the overall benefit of scaling the wire diameter, we scaled the gate length from 14 to 10 nm. At the 10 nm gate length, the D5 device showed a significant SS improvement over the D7 device (~10 mV/decade) [10]. A 10 nm gate length and a 32 nm CGP were defined for the D5 NW at the N5 specifications. Furthermore, the vertical pitch between two D5 NWs was reduced to 12 nm. To equalize the drive current, a four-stacked D5 NW at N5 specifications was compared with the two-stacked D7 NW at N7 specifications with an increased BEOL load. In this case, the D5 device shows improved delay performance compared to the D7 device (Figure 15(a)). It should be noted that the D5 device consumes more active power than the D7 device, as shown in Figure 15(b). These minimal improvements in delay and power were further normalized by plotting energy versus speed, as shown in Figure 16(a). Figure 16(b) shows the comparable leakage power characteristics of both devices. With the taller fin in the D5 case, the power-delay product shows a comparable trend at low-frequency operation, but the device starts consuming more energy toward high-frequency operation. With these NW configurations, the fin of the four-stacked D5 device is 14 nm taller than that of the two-stacked D7 device, which adds parasitic capacitance, as expected.
This flexibility shows the various aspects of scaling the D5 NW at a reduced wire pitch together with stacking a greater number of NWs (NNW). The overall performance benefits of a D5 device at N5 specifications and a D7 device at N7 specifications were benchmarked against variations in fin height. The next section discusses energy versus frequency for different NW arrangements.

Benchmarking D5 NW at N5 and D7 NW at N7
A three-stacked D5 device delivers a drive current nearly equivalent to that of a two-stacked D7 device (only 6% loss) [10]. With D5 NWs, the drive current can therefore be increased by stacking a greater number of wires per fin, and the D5 may provide an area benefit with a 32 nm CGP. On the other hand, stacking more D5 NWs increases the parasitic capacitance (mainly Cp2) in comparison to the rise in gate capacitance (FEOL). The fin height variation with different NW stacking is shown in Figure 17. Moving from a two-stacked to a five-stacked D5 NW device raises the drive current by 91% (from 18.03 to 34.44 μA) but increases the total fin height by 90% (from 29 to 55 nm). In contrast, moving from a two-stacked to a three-stacked D7 NW device (14 nm wire pitch) improves the drive current by 30.5% (from 25.95 to 33.87 μA) while increasing the fin height by only 45% (from 31 to 45 nm). The three-stacked D7 NW thus delivers almost the same drive current as the five-stacked D5 NW (33.87 and 34.44 μA) at a much lower fin height. A fair comparison that accounts for all the parasitic impacts is therefore needed to assess the actual benefit of scaling the wire diameter (D5).
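The percentages quoted in this comparison follow directly from the stated drive currents and fin heights, as the short check below shows. Only the helper name is an addition; all numbers are those given in the text.

```python
def percent_increase(old, new):
    """Relative increase from old to new, in percent."""
    return 100.0 * (new - old) / old


# D5: two-stacked -> five-stacked
print(f"D5 drive current: {percent_increase(18.03, 34.44):.1f}%")  # 91.0%
print(f"D5 fin height:    {percent_increase(29, 55):.1f}%")        # 89.7%

# D7: two-stacked -> three-stacked
print(f"D7 drive current: {percent_increase(25.95, 33.87):.1f}%")  # 30.5%
print(f"D7 fin height:    {percent_increase(31, 45):.1f}%")        # 45.2%
```

Read this way, the D5 stack gains about one percent of drive current per percent of added fin height, whereas the D7 stack reaches a similar absolute current with a much shorter fin.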
Optimizing the fin height in a NW transistor remains very sensitive to the overall device design. With this in mind, the total energy consumption per device with increasing fin height is plotted in Figure 18 for both the D5 and D7 devices with multiple stacking. For both the D7 and D5 devices, single-stacked to five-stacked NWs were considered. Along with the multiple stacking, three vertical pitches (16, 14 and 12 nm) were used for the D7 device and four (16, 14, 12 and 10 nm) for the D5 device. A continuous rise in energy consumption was observed when going from a single NW to multiple NWs in both the D7 and D5 devices. Reducing the NW vertical pitch reduces the fin height and thus the energy loss. The D7 NW was observed to always consume more energy than the D5 NW at a given fin height. However, with a lower fin height, the D5 NW delivers a lower drive current, which further reduces the overall device speed. The variation of speed (frequency) with increasing fin height for both the D7 and D5 devices is plotted in Figure 19. At an identical fin height, the D7 device always delivers a higher speed than the D5 device. For both the D7 and D5 NW transistors, the fin height was increased by stacking from a single wire up to five wires. Up to a three-stacked design, a linear rise in frequency with fin height was observed for both the D7 and D5 NWs. Beyond the three-stacked design, however, the frequency saturates with increasing fin height, and a further rise in fin height starts to degrade the frequency performance. This is mainly caused by the rise in parasitic capacitances. In general, the capacitance loading a transistor is the sum of the intrinsic gate capacitance and the FEOL parasitic capacitances. Increasing the fin height increases the total area of the gate metal stack positioned next to the source and drain contact lines. This effect remains hidden in both the D7 and D5 NWs up to a certain fin height.
At the same fin height, the D7 device has a slightly higher speed than the D5 device because it is less affected by parasitics. Furthermore, increasing the fin height starts to degrade the speed performance of the D5 device quite early. Hence, multiple stacking is also a major limiter of higher speed.
Along with the speed, the total energy consumption at different operating frequencies is plotted in Figure 20. The energy consumption is largely affected by the escalation of parasitic capacitances, so the devices consume more energy as the operating frequency rises. Moving from a single NW to a five-stacked NW produces an exponential rise in energy loss in both devices. Although the single-stacked D5 NW consumes 27% less energy than the single-stacked D7 NW, it is also 20% slower. This gap, however, quickly closes at higher operating frequencies. At around 45-50 GHz, the D5 device requires four- to five-stacked NWs and starts consuming more energy than the two- to three-stacked D7 NW. We observed that the D5 NW at N5 specifications consumes less energy at low-frequency operation but more energy than the D7 device at high-frequency operation. Multiple stacking increases both the on-currents and the parasitic losses. Thus, a NW with a shorter fin height should be considered to obtain the benefit of scaling at low-frequency operation.

Conclusion
The GAA lateral nanowire is a promising candidate for scaling beyond FinFET technology. It provides superior electrostatics, such as better SS and DIBL, at the shortest gate lengths. However, it involves a very complex process and suffers from large parasitic effects. The study presented here focused mainly on the effects of the parasitic capacitances on overall device performance when scaling beyond the 7 nm node dimensions toward the 5 nm node technology. The TCAD study evaluated the electrostatic performance, and compact modeling was used to simulate basic inverter operation in a ring oscillator. A capacitance model covering the major parasitic components and their impact on advanced dimensional scaling was discussed. The compact modeling showed that reducing the wire diameter improves the electrostatics marginally but degrades the overall performance. Therefore, in-depth analytical and experimental studies are required to establish the overall scaling benefits of the GAA-NW transistor.