MAS: Maximum Energy-Aware Sense Amplifier Link for Asynchronous Network on Chip

A real-time multiprocessor chip model is also called a Network-on-Chip (NoC), and deals a promising architecture for future systems-on-chips. Even though a lot of Double Tail Sense Amplifiers are used in architectural approach, the existing DTSA with transceiver exhibits a difficulty of consuming more energy than its gouged design during various traffic condition. Novel Low Power pulse Triggered Flip Flop with DTSA is designed in this research to eliminate the difficulty. The Traffic Aware Sense amplifier MAS consists of Sense amplifiers (SA’s), Traffic Generator, and Estimator. Among various SA’S suitable (DTSA and NLPTF -DTSA) SA are selected and information transferred to the receiver. The performance of both DTSA with Transceiver and NLPTF-DTSA with transceiver compared under various traffic conditions. The proposed design (NLPTF-DTSA) is observed on TSMC 90 nm technology, showing 5.92 Gb/s data rate and 0.51 W total link power. The conservative method has achieved latency of 300/1500 ps under various stage operation (one and Five). The latency result of the MAS work is slightly increased to 440/2200 ps. Though the latency results are high, still it is encouraging because we added traffic generator and Traffic Estimator.


Introduction
NoC may be a flourishing area for designing current application like Image processing, Signal Processing multimedia, Medical applications telecommunication, and real-time task [1]. Conservative investigation mainly focuses on low power, ultra-speed, and scalability in NoC [2]. Algorithmic [3] and architectural models [4] are made and instigated into the NoC to provide additional performance improvement than current NoC design. Existing NoC designer's shows much progress on this architectural level model by introducing outside or inside sense amplifier (SA) in on-chip communication [5]. In addition to the transmitter section (TXS) with pre-emphasis capacitance (PEC) for high speed and energy reduction in on-chip communication, it requires DC bias circuits at the receiver section (RXS). To overcome this issue, voltage sense amplifier is presented and tested in 90 nm Complementary metal-oxide-semiconductor (CMOS) cross coupled module [6]. In small circuit application user can't identify the worth of voltage SA so it is refined into Double Tail Sense Amplifiers (DTSA). This DTSA with transceiver consists of PEC at the transmitter and DTSA at RXS [7]. A low power consumption model is developed and implemented in many real-time applications. Clock Gating (CG) low power design approach at RTL TSMC 45 nm CMOS application is tested in Network-on-Chip 2 [8]. CMOS Very-large-scale integration (VLSI) design has taken us to real working chips that rely on controlled charge recovery to operate at suggestively lower power dissipation levels than their existing counterparts The Novel Low Power pulse Triggered Flip Flop with DTSA (NLPTF) is designed by using two N-type metaloxide-semiconductor (NMOS) transistor with an inverted clock signal as an input [9]. The output is taken from the transistor and given to the P-type metal-oxidesemiconductor (PMOS) transistor. The gate output of the PMOS transistor is given to the DTSA as an input and observes the output changes and power usage [10]. In [11] the performance improvement achieved in networks with respect to Network traffic modeling based on synthetic traffic The real-time traffic data are generated and estimated in on-chip communication, according to that TE new approach introduced for Quality of service (QoS) in [12]. This proposed design we followed above Traffic generator and TE in Traffic model and NLPTF-DTSA. The reconfigurable topology is applied in on-chip networks for performance improvement in [13],. To achieve performance improvement than [7], Maximum Energy-Aware sense amplifier (MAS) circuitry is introduced which consists of Traffic Generator (TG), Traffic Estimator (TE), capturing energy recovery [14] and NLPTF-DTSA. Clock gating (CG) Concept discussed in Sakthivel et al. [16].
The rest of this Chapter is ordered as follows. Subdivision 2 addresses the NoC system model. Proposed work and its module details are discussed in subdivision 3. The proposed results of various architectures are presented in subdivision 4. Finally, the conclusion is presented in subdivision 5.  Figure 1. The use of capacitance in TXS is to shrink in power dissipation. In NoC Circuitry communication disturbance occurs because of noise and crosstalk [15]. The transceiver with DIT (differential interconnect twist) affords a high-performance perfection. Early-stage, bidirectional interconnects are used. The EM field solver is used to investigate interconnects. The CMOS with 1.2 V, 6 m technology is used for interconnects as in [7]. Table 1 shows Conventional Strategies.  Figure 2 shows a basic diagram of proposed design, which consist of following modules organized such as PEC with TXS, TE, Controller, NLPTF-DTSA, and RXS. The proposed work consists of four stages namely selection, analysis, and design and performance comparison. In the first stage of our work is suitable SA'(DTSA and NLPTF-DTSA) is selected among various sense amplifier circuitry [18] and second stage selected SA's with the transceiver (DTSA and NLPTF-DTSA) high traffic (HT) and low traffic (LT) conditions examined. In the third stage, we compared NLPTF-DTSA for complete transceiver with [7].

NLPTF-DTSA circuit
The NLPTF-DTSA [18] is as shown in Figure 3. NLPTF -DTSA is used to solve the troubles associated with conventional Pulsed Flip Flop (P-FF) designs. The basic procedure of NLPTF -DTSA is plummeting the number of NMOS transistors in the discharging path. The next step NLPTF -DTSA is supporting a system to enhance the strength pull down by allocation value in to "1." The new transistor stacking circuitry is opposed to transistor S2 which is distant from the discharging path. Transistor S2, in conjunction through an extra transistor N3, forms a pass transistor logic (PTL) size of two AND gates of transistor S1. Since the inputs of the two AND logic gates are matching, the output node is reserved at zero time. When input signal1 and input signal 2 are equal to "0", temporary floating at the node is basically risk-free. By the rising edge limits of the clock unit, both transistors S1 and S2 be turned resting on. This design is subsequently turned on transistor S3 by an instance width. The switching power is less at each node due to weakening voltage swing [10]. The functional diagram of NLPTF-DTSA and simulation results are shown in Figure 3.

TXS with PEC
The technical concepts of TXS with PEC are similar to that of Schinkel et al.

Low swing transmitter
The series capacitance in transmitter is used to drive the bus and reduces the swing factor. The technical parameters of the full swing (FS), multi VDD mode (MVM) capacitive low swing transmitter (CLS) are tabulated in the Table 2.

Optimal swing receiver
In a transceiver circuit, SA is the best data receiver when compared to the conventional comparator [7]. To avoid transistor stack, the SA circuit is split into two tails and fed with separate supply voltage.
To gain maximum power reduction in NoC architecture, NLPTF [10] technique is implemented in DTSA module.

TG and TE
The Statistical Traffic model [12] is implemented in this approach. By which various traffic condition (image, Data) applied into SA's with Transceiver.    Network-on-Chip 6

Complete transceiver
The complete transceiver circuit is made of transmitter connected to the receiver via DIT [7]. The complete transceiver architecture is shown in the Figure 4. And the experimental results of complete transceiver in Figure 5.

Results and discussions
The performance parameters of the DTSA and NLPTF-DTSA with transceiver are examined using 90 nm technologies. Synopsys Design Compiler is used for Gate level net list creation. Synopsys™ Prime Power is used for Power analysis [17]. The switching factors are reported by the proposed work and examined in Intel® 3.1 GHz LGA 1155 Core i3-2100 Processor, and a system with Window Xp. The technical level similar to [7] carried out various modes such as FS, MVM, and CLS.
The NoC model synthesized code is made to evaluate 90 nm TSMC CMOS technology under the operating frequency of 500 MHZ, 1.2 V supply voltages and 0.5 switching factor. The Sleep mode and Active mode power consumption are tested.
* With CG and without CG The results are presented in Figures 6 and 7. It is inferred that the proposed NLPTF-DTSA gives a greater result in terms of power as related with DTSA modules such as single-ended conditional capturing energy recovery (SCCER) [10], DCCER [10], static differential energy recovery (SDER) [10], pulsed flip flop (PFF) [18], NLPTF-DTSA. A mathematical expression for technical evaluation is similar to [19].
The energy consumption, delay, data Rate and static power consumption results are presented in Figures 8-11. The DTSA, NLPTF-DTSA circuitry results are estimated under HT and LT. The overall comparison of various parameters (Energy consumption, delay, data Rate and static power consumption) with current work is shown in Table 3. The overall results of proposed design give greater results than conservative design.      The conservative method has achieved latency of 300/1500 ps under various stage operation (one and Five). The latency result of the MAS work is slightly increased to 440/2200 ps.
Though the latency results are high, still it is encouraging because we added traffic generator and Traffic Estimator.

Conclusion
The proposed work is summarized into Three stages namely selection, analysis, design and performance comparison. In the first stage, among various SA's suitable SA'S (DTSA and NLPTF-DTSA) is selected for MAS process. In the second stage, Traffic action takes place according to both DTSA with Transceiver and NLPTF-DTSA Transceiver.
On the Final stage, we compared both above circuitry and concluded under various traffic conditions NLPTF-DTSA is suitable. The result of the complete transceiver circuit (NLPTF-DTSA) under average traffic mode is attained as 5.92 Gb/s data rate, 0.62 W link power and latency of 497 ps/2485 ps for single/five stage operation. When compared with conservative methods, results in MAS design show performance enhancement of 94.72% in data rate and 0.51 W reductions in link power. Though latency of MAS design is high, it is acceptable because of the addition of TG and TE, which is not present in conventional NoC architecture. In future we will improve the performance of NoC Architecture with respect to latency.  © 2021 The Author(s). Licensee IntechOpen. This chapter is distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/ by/3.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.