Open access peer-reviewed chapter

MAS: Maximum Energy-Aware Sense Amplifier Link for Asynchronous Network on Chip

Written By

Erulappan Sakthivel and Rengaraj Madavan

Submitted: July 29th, 2020 Reviewed: November 19th, 2020 Published: March 23rd, 2021

DOI: 10.5772/intechopen.95075

From the Edited Volume

Network-on-Chip

Edited by Isiaka A. Alimi, Oluyomi Aboderin, Nelson J. Muga and António L. Teixeira

Abstract

A Network-on-Chip (NoC) is a real-time multiprocessor chip model and offers a promising architecture for future systems-on-chip. Although Double Tail Sense Amplifiers (DTSAs) are widely used at the architectural level, the existing DTSA with transceiver consumes more energy than its design estimate under varying traffic conditions. A Novel Low Power pulse Triggered Flip Flop with DTSA (NLPTF-DTSA) is designed in this work to address this problem. The traffic-aware MAS link consists of sense amplifiers (SAs), a Traffic Generator, and a Traffic Estimator. Among various SAs, two suitable candidates (DTSA and NLPTF-DTSA) are selected and used to transfer information to the receiver. The performance of the DTSA with transceiver and the NLPTF-DTSA with transceiver is compared under various traffic conditions. The proposed design (NLPTF-DTSA), evaluated in TSMC 90 nm technology, achieves a 5.92 Gb/s data rate at 0.51 W total link power.

Keywords

  • network-on-chip (NoC)
  • double tail sense amplifier (DTSA)
  • low power pulse triggered flip flop (LPTF)

1. Introduction

NoC is a flourishing area for designing current applications such as image processing, signal processing, multimedia, medical applications, telecommunication, and real-time tasks [1]. Conventional investigation mainly focuses on low power, ultra-high speed, and scalability in NoC [2]. Algorithmic [3] and architectural models [4] have been developed and integrated into the NoC to improve performance over current NoC designs. Existing NoC designers have made much progress at the architectural level by introducing an external or internal sense amplifier (SA) in on-chip communication [5]. A transmitter section (TXS) with pre-emphasis capacitance (PEC) provides high speed and energy reduction in on-chip communication, but it requires DC bias circuits at the receiver section (RXS). To overcome this issue, a voltage sense amplifier was presented and tested in a 90 nm complementary metal-oxide-semiconductor (CMOS) cross-coupled module [6]. In small-circuit applications the benefit of the voltage SA is not evident, so it was refined into the Double Tail Sense Amplifier (DTSA). The DTSA with transceiver consists of a PEC at the TXS and a DTSA at the RXS [7]. Low-power design models have been developed and implemented in many real-time applications. A Clock Gating (CG) low-power design approach at the register-transfer level (RTL) in TSMC 45 nm CMOS is tested in [8]. CMOS very-large-scale integration (VLSI) design has produced real working chips that rely on controlled charge recovery to operate at significantly lower power dissipation levels than their conventional counterparts. The Novel Low Power pulse Triggered Flip Flop (NLPTF) with DTSA is designed using two N-type metal-oxide-semiconductor (NMOS) transistors with an inverted clock signal as an input [9]. The output is taken from these transistors and given to a P-type metal-oxide-semiconductor (PMOS) transistor. The output of the PMOS transistor is given to the DTSA as an input, and the resulting output changes and power usage are observed [10].

In [11], performance improvement in networks is achieved through network traffic modeling based on synthetic traffic. Real-time traffic data are generated and estimated in on-chip communication, and on that basis a new traffic estimation (TE) approach for quality of service (QoS) is introduced in [12]. The proposed design follows the traffic generator and TE of that traffic model, combined with the NLPTF-DTSA. A reconfigurable topology is applied in on-chip networks for performance improvement in [13]. To improve performance over [7], the Maximum Energy-Aware Sense amplifier (MAS) circuitry is introduced, which consists of a Traffic Generator (TG), a Traffic Estimator (TE), capturing energy recovery [14], and the NLPTF-DTSA. The clock gating (CG) concept is discussed in Sakthivel et al. [15].

The rest of this chapter is organized as follows. Section 2 addresses the NoC system model. The proposed work and its modules are discussed in Section 3. The results for the various architectures are presented in Section 4. Finally, Section 5 concludes the chapter.

2. System model

For improved data communication in NoC, the conventional transceiver consists of a PEC in the TXS and a DTSA circuit in the RXS. The transceiver of Schinkel et al. [7] for NoCs and the proposed design are shown in Figure 1. The capacitance in the TXS is used to reduce power dissipation. In NoC circuitry, communication disturbance occurs because of noise and crosstalk [16]. A transceiver with differential interconnect twist (DIT) provides a substantial performance improvement. In early-stage designs, bidirectional interconnects were used. An electromagnetic (EM) field solver is used to analyze the interconnects. A 1.2 V, 6-metal CMOS technology is used for the interconnects, as in [7]. Table 1 shows the conventional strategies.

Figure 1.

Conventional and proposed Transceiver configuration.

| Circuitry | Existing | TAS design |
|---|---|---|
| [7] | DTSA | TG [12], TE [12], DTSA |
| [18] | NLPTF | NLPTF-DTSA |

Table 1.

Conventional Strategies.

3. Proposed work

Figure 2 shows a block diagram of the proposed design, which consists of the following modules: PEC with TXS, TE, controller, NLPTF-DTSA, and RXS. The proposed work consists of three stages: selection, analysis, and design and performance comparison. In the first stage, suitable SAs (DTSA and NLPTF-DTSA) are selected from among various sense amplifier circuits [18]. In the second stage, the selected SAs with the transceiver are examined under high traffic (HT) and low traffic (LT) conditions. In the third stage, the NLPTF-DTSA with the complete transceiver is compared with [7].

Figure 2.

Proposed Transceiver Block diagram.

3.1 NLPTF-DTSA circuit

The NLPTF-DTSA [18] is used to solve the problems associated with conventional pulsed flip-flop (P-FF) designs. Its first measure is to reduce the number of NMOS transistors in the discharging path. Its second measure is a mechanism that strengthens the pull-down when the input value is “1.” In the new transistor stacking arrangement, transistor S2 is removed from the discharging path. Transistor S2, together with an additional transistor N3, forms a two-input pass transistor logic (PTL) AND gate for transistor S1. Since the two inputs of the AND gate are complementary for most of the clock cycle, the output node is held at zero most of the time. When input signal 1 and input signal 2 are both “0”, temporary floating at the node is essentially harmless. At the rising edge of the clock, both transistors S1 and S2 are turned on. This in turn switches on transistor S3 for a short time span, which sets the pulse width. The switching power at each node is lower because of the reduced voltage swing [10]. The functional diagram of the NLPTF-DTSA and its simulation results are shown in Figure 3.

Figure 3.

(a) The functional diagram of the NLPTF-DTSA module and (b) simulation result.
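
The pulse-generation behaviour described in this subsection can be illustrated at the logic level. The following is a minimal sketch, assuming the two AND-gate inputs are the clock and a delayed, inverted copy of the clock; the names `clk`, `delay`, and `pulse_train` are hypothetical and do not come from the chapter, and no attempt is made to map the signals onto transistors S1 to S3 or to model analog effects such as the weak logic high or the reduced voltage swing.

```python
def pulse_train(clk, delay):
    """Return z[t] = clk[t] AND NOT clk[t - delay] for a sampled clock.

    Assumption: samples before t = 0 are 0 (clock idle low), so the
    delayed-and-inverted input starts at 1.
    """
    z = []
    for t, level in enumerate(clk):
        delayed = clk[t - delay] if t >= delay else 0
        # The two inputs are complementary except for `delay` samples
        # after each clock edge, so z is 0 most of the time.
        z.append(level & (1 - delayed))
    return z


clk = [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1]
print(pulse_train(clk, delay=2))
# prints [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 1, 1, 0, 0]
```

In this idealized model each rising clock edge produces a pulse whose width equals the assumed delay, while falling edges leave both inputs at 0 for the same duration and produce no pulse.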

3.2 TXS with PEC

The technical concepts of the TXS with PEC are similar to those of Schinkel et al. [7].

3.3 Low swing transmitter

The series capacitance in the transmitter is used to drive the bus and reduces the voltage swing. The technical parameters of the full swing (FS), multi-VDD mode (MVM), and capacitive low swing (CLS) transmitters are tabulated in Table 2.

| Mode | Interconnect | Technology | Supply voltage | Voltage swing | Driver size |
|---|---|---|---|---|---|
| FS | Shielded single-ended | 1.2 V, 6 metal, 90 nm CMOS; 2 mm, Rwire = 400 Ω | 1.2 V | 1.2 V | Wn = 8 μm, Wp = 20 μm |
| MVM | DIT | | VDDH = 1.2 V, VDDL = 1.08 V | 120 mV | Wn = 8 μm, Wp = 20 μm |
| CLS | DIT | | 1.2 V | 120 mV | Wn = 1.6 μm, Wp = 4 μm |

Table 2.

Different modes comparison.
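
To make the effect of the reduced swing in Table 2 concrete, the sketch below compares the three modes with a first-order estimate in which the energy drawn from the supply per transition scales with supply voltage times voltage swing. This scaling is a common textbook approximation and an assumption here, not a formula from the chapter; it ignores the series capacitor's effect on the effective load, the different driver sizes, and any charge returned to the lower rail in MVM. The function name is hypothetical.

```python
V_FULL = 1.2  # full-swing supply voltage from Table 2, in volts

# Supply voltage and voltage swing per mode, taken from Table 2.
MODES = {
    "FS": {"v_supply": 1.2, "v_swing": 1.2},
    "MVM": {"v_supply": 1.2, "v_swing": 1.2 - 1.08},  # VDDH - VDDL = 120 mV
    "CLS": {"v_supply": 1.2, "v_swing": 0.12},
}


def relative_transition_energy(v_supply, v_swing, v_full=V_FULL):
    """Energy per transition relative to full swing.

    Assumption: E is proportional to C * v_supply * v_swing with the same
    wire capacitance C in every mode, so C cancels in the ratio.
    """
    return (v_supply * v_swing) / (v_full * v_full)


for mode, p in MODES.items():
    rel = relative_transition_energy(p["v_supply"], p["v_swing"])
    print(f"{mode}: swing = {p['v_swing'] * 1000:.0f} mV, relative energy = {rel:.2f}")
```

Under these assumptions a 120 mV swing on a 1.2 V supply draws roughly one tenth of the full-swing wire energy, which is the motivation for the low-swing modes.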

3.4 Optimal swing receiver

In a transceiver circuit, an SA is a better data receiver than a conventional comparator [7]. To avoid transistor stacking, the SA circuit is split into two tails, each fed with a separate supply voltage.

To maximize power reduction in the NoC architecture, the NLPTF [10] technique is implemented in the DTSA module.

3.5 TG and TE

The statistical traffic model of [12] is implemented in this approach. With it, various traffic conditions (image, data) are applied to the SAs with transceiver.
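
The roles of the TG and TE can be illustrated with a simplified stand-in. The sketch below does not reproduce the statistical model of [12]: it assumes a Bernoulli traffic source, a sliding-window load estimate, and a hypothetical threshold of 0.5 that separates HT from LT. All function names and parameter values are illustrative assumptions.

```python
import random


def generate_traffic(n_cycles, injection_rate, seed=0):
    """Traffic generator (assumed Bernoulli source): 1 = flit sent in that cycle."""
    rng = random.Random(seed)
    return [1 if rng.random() < injection_rate else 0 for _ in range(n_cycles)]


def estimate_load(flits, window):
    """Traffic estimator: fraction of busy cycles in the most recent window."""
    recent = flits[-window:]
    if not recent:
        return 0.0
    return sum(recent) / len(recent)


def classify(load, threshold=0.5):
    """Label the link as high traffic (HT) or low traffic (LT).

    The threshold is a hypothetical value, not taken from the chapter.
    """
    return "HT" if load >= threshold else "LT"


for rate in (0.2, 0.8):
    flits = generate_traffic(n_cycles=1000, injection_rate=rate)
    load = estimate_load(flits, window=200)
    print(f"injection rate {rate}: estimated load {load:.2f} -> {classify(load)}")
```

A controller could use such a label to decide under which traffic condition a given SA is evaluated; how the chapter's controller actually makes that decision is not specified here.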

3.6 Complete transceiver

The complete transceiver circuit consists of the transmitter connected to the receiver via DIT [7]. The complete transceiver architecture is shown in Figure 4, and its simulation results are shown in Figure 5.

Figure 4.

The functional diagram of the Complete Transceiver Circuit.

Figure 5.

Complete Transceiver simulation result.

4. Results and discussions

The performance parameters of the DTSA and NLPTF-DTSA with transceiver are examined using 90 nm technology. Synopsys Design Compiler is used for gate-level netlist creation, and Synopsys PrimePower is used for power analysis [17]. The switching factors are reported by the proposed work and examined on an Intel® Core i3-2100 processor (3.1 GHz, LGA 1155) running Windows XP. As in [7], the evaluation covers the FS, MVM, and CLS modes.

The synthesized NoC model is evaluated in 90 nm TSMC CMOS technology at an operating frequency of 500 MHz, a supply voltage of 1.2 V, and a switching factor of 0.5. Power consumption is tested in both sleep mode and active mode.
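
For orientation, the operating point above can be plugged into the standard first-order CMOS switching power relation, P = α·C·V²·f. This relation is a textbook formula and is not quoted from the chapter; the load capacitance of 1 pF and the idealized clock-gating model (switching power scales with the fraction of cycles in which the clock is enabled) are hypothetical assumptions chosen only for illustration.

```python
def dynamic_power(c_load_farads, v_dd=1.2, freq_hz=500e6, activity=0.5):
    """First-order switching power P = activity * C * Vdd^2 * f, in watts.

    Defaults follow the stated operating point: 1.2 V, 500 MHz, switching factor 0.5.
    """
    return activity * c_load_farads * v_dd ** 2 * freq_hz


def gated_power(p_dynamic, enabled_fraction):
    """Idealized clock gating: only the enabled fraction of cycles switches."""
    return p_dynamic * enabled_fraction


C_LOAD = 1e-12  # hypothetical 1 pF switched capacitance, not from the chapter

p_active = dynamic_power(C_LOAD)
print(f"ungated: {p_active * 1e3:.3f} mW")
print(f"clock enabled 25% of the time: {gated_power(p_active, 0.25) * 1e3:.3f} mW")
```

The sketch covers switching power only; leakage, which dominates in sleep mode, is outside this model.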

The results with and without CG are presented in Figures 6 and 7. They indicate that the proposed NLPTF-DTSA gives better power results than the other DTSA modules considered: single-ended conditional capturing energy recovery (SCCER) [10], DCCER [10], static differential energy recovery (SDER) [10], and pulsed flip-flop (PFF) [18]. The mathematical expressions used for the technical evaluation are similar to those in [19].

Figure 6.

Power comparison in Sleep mode.

Figure 7.

Power comparison in Active mode.

The energy consumption, delay, data rate, and static power consumption results are presented in Figures 8–11. The DTSA and NLPTF-DTSA circuits are evaluated under HT and LT. The overall comparison with existing work is shown in Table 3. Overall, the proposed design gives better results than the conventional design.

Figure 8.

Energy comparison of DTSA modules.

Figure 9.

Delay comparison of DTSA modules.

Figure 10.

Data Rate comparison of DTSA modules.

Figure 11.

Static Power comparison of DTSA modules.

| Work | Module name | Traffic mode | Data rate, Gb/s (data rate improvement, %) | Link power (W) | Latency for 10 mm of interconnect, single/five-stage operation (ps) |
|---|---|---|---|---|---|
| [7] | DTSA | | 5.0 (80%) | 0.8 | 300/1500 |
| Proposed work | DTSA | LT | 4.9 (78.4%) | 0.98 | 345/1725 |
| Proposed work | DTSA | HT | 4.0 (64%) | 1.32 | 487/2435 |
| Proposed work | NLPTF-DTSA | Average | 5.92 (94.72%) | 0.51 | 497/2485 |

Table 3.

The overall transceiver performance comparison.
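
The entries of Table 3 can be combined into a few derived figures of merit. The sketch below is a reader's aid rather than part of the chapter's evaluation: it computes energy per bit as link power divided by data rate, and the relative change of each proposed configuration against [7]. The metric choices and function names are assumptions; the input numbers are copied from Table 3.

```python
# Rows copied from Table 3: data rate in Gb/s, link power in W,
# single-stage latency in ps.
TABLE_3 = {
    "[7] DTSA": {"rate_gbps": 5.0, "power_w": 0.8, "latency_ps": 300},
    "DTSA, LT": {"rate_gbps": 4.9, "power_w": 0.98, "latency_ps": 345},
    "DTSA, HT": {"rate_gbps": 4.0, "power_w": 1.32, "latency_ps": 487},
    "NLPTF-DTSA, average": {"rate_gbps": 5.92, "power_w": 0.51, "latency_ps": 497},
}


def energy_per_bit_pj(power_w, rate_gbps):
    """Link power divided by data rate: W / (Gb/s) = nJ/bit, scaled to pJ/bit."""
    return power_w / rate_gbps * 1000.0


def relative_change_pct(value, reference):
    """Signed change of `value` relative to `reference`, in percent."""
    return (value - reference) / reference * 100.0


ref = TABLE_3["[7] DTSA"]
for name, row in TABLE_3.items():
    epb = energy_per_bit_pj(row["power_w"], row["rate_gbps"])
    d_power = relative_change_pct(row["power_w"], ref["power_w"])
    d_latency = relative_change_pct(row["latency_ps"], ref["latency_ps"])
    print(f"{name:20s} {epb:7.1f} pJ/bit  power {d_power:+6.1f}%  latency {d_latency:+6.1f}%")
```

Read this way, the NLPTF-DTSA row trades a longer single-stage latency for lower link power and a higher data rate than [7].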

The conventional method [7] achieves a latency of 300/1500 ps for single/five-stage operation. The latency of the MAS work is higher, at 497/2485 ps (Table 3).

Although the latency is higher, the result is still encouraging because the design adds a traffic generator and a traffic estimator.

5. Conclusion

The proposed work is summarized in three stages: selection, analysis, and design and performance comparison. In the first stage, suitable SAs (DTSA and NLPTF-DTSA) are selected from among various SAs for the MAS process. In the second stage, traffic is applied to both the DTSA with transceiver and the NLPTF-DTSA with transceiver.

In the final stage, we compared the two circuits and concluded that the NLPTF-DTSA is suitable under various traffic conditions. The complete transceiver circuit (NLPTF-DTSA) under average traffic attains a 5.92 Gb/s data rate, 0.51 W link power, and a latency of 497 ps/2485 ps for single/five-stage operation. Compared with conventional methods, the MAS design shows a data rate improvement of 94.72% and a link power reduced to 0.51 W. Although the latency of the MAS design is high, it is acceptable because of the addition of the TG and TE, which are not present in conventional NoC architecture. In future work we will improve the performance of the NoC architecture with respect to latency.

References

  1. Marculescu R, Bogdan P, “The chip is the network: toward a science of Network-on-Chip design,” ELECTRON DES, vol. 2, pp. 371-461, 2007.
  2. Moraes F, Calazans N, Mello A, Muller L, Ost L, “Hermes: an infrastructure for low area overhead packet switching networks on chip,” INTEGRATION, vol. 38, pp. 69-93, Oct. 2004.
  3. McKeown N, “The iSLIP scheduling algorithm for input queued switches,” IEEE ACM T NETWORK, vol. 7, pp. 118-201, Apr. 1999.
  4. Fang FW, Wong MDF, Chang YW, “Flip chip routing with unified area io pad assignments for package-board co design,” in Conf. IEEE DAC, 2009, pp. 336-339.
  5. Liu YI, Liu G, Yang Y, Li Z, “A novel low swing transceiver for interconnection between NoC routers,” in Conf. IEEE DCMT, 2011, pp. 39-44.
  6. Larsson P, “Resonance and damping in CMOS circuits with on chip decoupling capacitance,” IEEE T CIRCUITS I, vol. 45, pp. 849-858, Jul. 1998.
  7. Schinkel D, Mensink E, Klumperink EAM, Tuijl EV, Nauta B, “Low power high speed transceivers for network-on-chip communication,” IEEE T VLSI SYST, vol. 17, pp. 12-21, Jan. 2009.
  8. Zhao P, McNeely J, Kaung, Wang N, Wang Z, “Design of sequential elements of the low power clocking system,” IEEE T VLSI SYST, vol. 19, pp. 914-918, May 2011.
  9. Yu C-C, “Design of low power double edge triggered flip-flop circuit,” in Proc. 2nd IEEE Conf. Industrial Electronics Applications, 2007, pp. 2054-2057.
  10. Stojanovic V, Oklobdzija VG, “Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems,” IEEE Journal of Solid-State Circuits, vol. 34, no. 4, pp. 536-548, 1999.
  11. Lu Z, Jantsch A, “Traffic configuration for evaluating networks on chips,” in Conf. IEEE SoC for Real Time Applications Proceedings, 2005, pp. 535-540.
  12. Xingwei W, Dingde J, Zhengzheng X, henhua C, “An accurate method to estimate traffic matrices from link loads for QoS provision,” Journal of Communications and Networks, vol. 12, pp. 624-631, Dec. 2010.
  13. Kun Wang, Xi an, Changshan Wang, Huaxi Gu, “Quality of service routing algorithm in the torus based network on chip,” in Conf. ASICON '09. IEEE, 2009, pp. 952-954.
  14. Junsheng Lv, Beijing, Hainan Liu, Ye M, Yumei Zhou, “An energy recovery D flip flop for low power semi custom ASIC design,” in Conf. Micro electronics and Electronics, 2010, pp. 33-36.
  15. Erulappan Sakthivel, Veluchamy Malathi, Muruganantham Arunraja, “VELAN: Variable Energy Aware Sense Amplifier Link for Asynchronous Network on Chip,” Circuits and Systems, vol. 7, pp. 128-144, 2016.
  16. Schinkel D, Mensink E, Klumperink EAM, Tuijl EV, Nauta B, “A 3-Gb/s/ch transceiver for 10 mm uninterrupted RC limited global on chip interconnects,” IEEE J SOLID-ST CIRC, vol. 41, pp. 297-306, Jan. 2006.
  17. Synopsys, Inc., Mountain View, CA [Online]. Available: http://www.synopsys.com
  18. Hwang YT, Lin, Fa J, Sheu MH, “Low power pulse triggered flip flop design with conditional pulse enhancement scheme,” IEEE T VLSI SYST, vol. 20, pp. 361-366, Feb. 2012.
  19. Qiaoyan Yu, Paul Ampad, “A flexible and parallel simulator for networks on chip with error control,” IEEE T COMPUT AID D, vol. 29, pp. 103-116, Jan. 2010.
