
"Field - Programmable Gate Array", book edited by George Dekoulis, ISBN 978-953-51-3208-0, Print ISBN 978-953-51-3207-3, Published: May 31, 2017 under CC BY 3.0 license. © The Author(s).

Chapter 10

Power Efficient Data-Aware SRAM Cell for SRAM-Based FPGA Architecture

By Ajay Kumar Singh
DOI: 10.5772/67257



Figure 1. Logic blocks in dual‐Vdd and Vdd‐programmable FPGAs [59].
Figure 2. Proposed new programmable low‐power FPGA routing switch [65].
Figure 3. Leakage power breakdown in Xilinx Spartan [79].
Figure 4. Dynamic power breakdown in Xilinx Virtex‐4 [80].
Figure 5. Architecture of the conventional 6T SRAM cell [88].
Figure 6. Architecture of 10‐T subthreshold bitcell [90].
Figure 7. (a) Architecture of DDWAD SRM cell. (b) Circuit to generate appropriate WS signal depending on write operation [107].
Figure 8. Total power consumption in data aware cell [107].
Figure 9. Read power consumption [107].
Figure 10. Hold leakage power at various power supplies [107].
Figure 11. Proposed decoder [108].
Figure 12. Write power consumption in 32 kb SRAM array.
Figure 13. Read power consumption in 32 kb array.



The design of low-power SRAM cells has become a necessity in today's FPGAs, because SRAM is a critical component in FPGA design and consumes a large fraction of the total power. This chapter provides an overview of the factors responsible for power consumption in FPGAs and discusses design techniques for low-power SRAM-based FPGAs at the system, device, and architecture levels. Finally, the chapter proposes a data-aware dynamic SRAM cell to control the power consumption in the cell. The stack effect has been adopted in the design to reduce the leakage current. Peripheral circuits such as the address decoder, the write/read enable circuits, and the sense amplifier have been modified to implement a power-efficient SRAM-based FPGA.

Keywords: FPGA, ASIC, static power, dynamic power, leakage current, SRAM cell, subthreshold cell, data-aware SRAM cell

1. Introduction

A field programmable gate array (FPGA) is a prefabricated integrated circuit (IC) that contains a programmable gate matrix to implement logic functions, together with interconnect resources to connect the logic functions and I/O blocks. These interconnect resources can be electrically programmed by the user to implement any digital circuit or system. Because of faster time to market, lower cost, and flexibility, FPGAs are often preferred over ASIC (application‐specific IC) designs, although they have disadvantages such as larger size, slower speed, and higher power consumption. Owing to this flexibility, any portion of an FPGA can be partially reprogrammed as required, even while the rest of the FPGA is still running. Computer‐aided design (CAD) tools and architecture are the two important technologies that differentiate FPGAs. The first memory‐based programmable FPGAs were introduced in 1986 by Xilinx Inc., San Jose, CA [1].

The term "programmable" in FPGA reflects only that a new function can be implemented on the chip even after its fabrication. The programmability/reconfigurability of an FPGA is based on an underlying programming technology, which can change the behavior of a prefabricated chip. The main programming technologies used in FPGAs are static random access memory (SRAM), flash memory, and antifuse [2–5].

SRAM‐based FPGAs provide an ideal prototyping medium and are widely used to integrate FPGAs in embedded systems [6–8] because of their use of standard CMOS technologies, higher performance, and reprogrammability. However, the larger static power consumption of the SRAM cell limits the use of SRAM‐based FPGAs in portable embedded systems compared to flash‐based FPGAs [9, 10]. The other concern related to SRAM‐based FPGAs is their volatile nature. Although dynamic power management and duty‐cycling techniques [11, 12] have been used to save static power during the idle mode of the FPGA, these techniques are not very effective because of the energy consumption associated with the resulting reconfiguration process. Because of their large load capacitance and high access rate, SRAM cells are responsible for a significant portion of the total power of the design. Thus, SRAM power consumption is an important consideration for designers seeking a balance between performance and overall power consumption. The speed of the SRAM cell in an FPGA is not a critical factor because it does not affect the operating speed of the circuit implemented in the FPGA, as mentioned in ref. [13].

In this chapter, we investigate the various factors responsible for power consumption in SRAM‐based FPGAs and review the different techniques proposed in the literature to save the power. We will also consider the static and dynamic power in the conventional 6T SRAM cell and its architecture. Various design techniques, presented in the literature, to reduce power consumption in SRAM cell will be reviewed in detail with their merits and demerits. A data‐aware power‐efficient SRAM cell will be discussed to save power and to optimize the stability.

2. SRAM‐based FPGAs

SRAM cells are the basic cells used in SRAM‐based FPGAs. These cells are scattered throughout the design in the form of an array and are mainly used to program: (1) the routing interconnects of FPGAs and (2) the configurable logic blocks (CLBs) that are used to implement logic functions. SRAM‐based programming technology has become the dominant approach for FPGAs because of its reprogrammability and its use of standard CMOS process technology, which results in higher packing density and higher speed. Because of the volatile nature of SRAM technology, SRAM‐based FPGAs lose their configuration data whenever the power supply is switched off and must be reprogrammed every time the power supply is turned on. Hence, almost every system using SRAM‐based FPGAs contains an additional nonvolatile memory, such as flash programmable read only memory (PROM) or EEPROM, to store the configuration data and load it into the SRAM‐based FPGA at power‐on. In many applications, a complex programmable logic device (CPLD) is used in addition to the external configuration memory to perform the vital functions of the system necessary at power‐up. The first static memory‐based FPGA (commonly called an SRAM‐based FPGA) was proposed by Wahlstrom in 1967 [14]. This architecture allowed both logic and interconnection configuration using a stream of configuration bits. From a practical standpoint, an SRAM cell can be programmed an indefinite number of times. Dedicated circuitry on the FPGA initializes all the SRAM bits on power‐up and configures the bits with a user‐supplied configuration. Unlike other programming technologies, SRAM cells need no special processing steps. Although static memory offers the most flexible approach for device programmability, it imposes a significant area penalty per programmable switch compared to ROM implementations.

3. Power consumption in SRAM‐based FPGAs

In recent years, FPGA research has shifted from speed and area overhead issues to the design of power‐efficient FPGAs because of the increased use of FPGAs in portable and nonportable devices. In portable devices, power saving is required to extend battery life, whereas in nonmobile devices power saving determines the cost, performance, and reliability of the device. The main sources of power consumption in an FPGA are static and dynamic power [10, 12, 15, 16].

Static power is consumed when the device/system is idle and leakage current flows in the system. The various leakage currents in an OFF transistor are subthreshold leakage current, gate‐induced drain leakage, junction leakage current, and direct tunneling current [17–19].

Dynamic power consumption is due to the switching activity of the transistors in normal operational mode. The dynamic power consumption depends on the parasitic capacitance, power supply, switching activity, and frequency of operation, and is mathematically expressed as [20]:

$$P_{\text{dynamic}} = \eta \, C_L \, V_{dd}^{2} \, f \tag{1}$$

where CL is the load capacitance, Vdd is the power supply, f is the frequency of operation, and η is the switching activity.
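The dependence on load capacitance, supply voltage, frequency, and switching activity can be illustrated with a minimal Python sketch. It assumes the common form P = η·CL·Vdd²·f; the function name `dynamic_power` and all numeric values are hypothetical and chosen only for illustration.

```python
def dynamic_power(c_load, vdd, freq, activity):
    """Dynamic power in watts, assuming P = activity * C_L * Vdd^2 * f."""
    return activity * c_load * vdd ** 2 * freq


# Hypothetical operating point: 10 fF load, 100 MHz clock, 20% switching activity.
p_nominal = dynamic_power(c_load=10e-15, vdd=1.2, freq=100e6, activity=0.2)

# Same node with the supply halved; every other parameter is unchanged.
p_scaled = dynamic_power(c_load=10e-15, vdd=0.6, freq=100e6, activity=0.2)

# Because of the quadratic dependence on Vdd, halving Vdd quarters the power.
ratio = p_scaled / p_nominal
print(f"nominal: {p_nominal:.3e} W, scaled: {p_scaled:.3e} W, ratio: {ratio:.2f}")
```

Under this model, supply voltage is the most effective single parameter to reduce, which is why several of the techniques reviewed in this chapter target Vdd.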

An FPGA design consumes more static power than an ASIC design because of excessive leakage currents [21–23], which result from the larger number of transistors per logic function. Other contributors to the larger power consumption are the circuits used to provide flexibility to the FPGA, the number of configuration bits, the lookup tables (LUTs), and the large number of programmable switches.

4. Techniques adopted to reduce power consumption in SRAM‐based FPGA

4.1. Leakage power reduction

An important method to control the leakage current in the system is to switch off the transistors that are not in use at a given time. This can be achieved by using the dual threshold voltage transistor FPGA routing design [24–26]. In this technique, a high threshold voltage is assigned to one subset of the multiplexer transistors and a low threshold voltage to the rest. The high threshold voltage controls the leakage current effectively at the cost of performance degradation. This technique increases the complexity at the router level. By exploiting the body‐bias effect, the threshold voltage of a multiplexer transistor that is not part of the selected path can be raised [27]. This method increases the fabrication complexity and cost. The leakage current can also be controlled by applying a negative bias voltage to the gate of the OFF multiplexer transistor, which results in a drastic drop in subthreshold current at the cost of additional hardware [28].

The stack effect is another effective method to reduce the leakage current in any circuit [29–31]. The stack effect arises when two series‐connected OFF transistors lie in the same path. These two OFF transistors offer a high‐resistance path to the current flow. To use this concept in FPGA design, researchers [32, 33] have introduced extra configuration SRAM cells (redundant cells) to allow multiple OFF transistors on an unselected path. With the redundant‐cell approach, the unselected path contains two OFF transistors, which limits the subthreshold current along that path.

Calhoun et al. [34] have proposed the creation of fine‐grained "sleep regions" to control the leakage current in the system. With this technique, unused LUTs and flip‐flops can be put into sleep mode independently. Gayasen et al. [35] have proposed a coarse‐grained sleep strategy. In this technique, the entire FPGA is partitioned into regions of logic blocks so that each region can be put into sleep mode independently whenever it is not in use.

Several methods have been proposed by researchers to reduce the leakage/static power consumption in FPGA design at the architectural level [36–39]. Tran et al. [40] have proposed a low‐power FPGA architecture based on a fine‐grained Vdd control scheme called micro‐Vdd‐hopping. They grouped four CLBs into one block that shares the Vdd. In the micro‐Vdd‐hopping scheme, the Vdd of each block is switched between a high and a low Vdd to save power without sacrificing performance. In their design, they introduced a level shifter and incorporated a zigzag power‐gating scheme to control the sneak leakage path problem. They observed experimentally that the dynamic power can be reduced by 86% when the required speed is half of the highest speed. They simulated their proposed design in 90 nm technology and observed a 95% static power saving at the cost of 2% area overhead. In the zigzag power‐gating scheme, the wake‐up time is shorter than in other gating techniques because the INVs and 2‐NAND gates always remain between Vdd and Vss during standby mode. Because they have an off‐off stacking structure, the leakage current is suppressed by an order of magnitude even if the overdrive voltage is zero.

Srinivasan et al. [41] have proposed a technique to reduce the leakage current of interconnect fabric. They have put every multiplexer in its least‐leakage state by setting its undriven inputs to desired values with a circuit‐level modification in the routing multiplexer. The main advantage of this technique is that it has negligible impact on the performance of the design and has small area penalty.

In their research paper, Hasan et al. [42] have reduced the leakage current in the multiplexer‐based interconnect matrix by controlling the inputs of unused FPGA routing multiplexers. The simulation results on different sizes and topologies of routing multiplexers show that the minimum leakage vector varies significantly at 22 nm compared to the 65 nm nodes because of higher gate leakage current and output stage loading effects. Their proposed technique reduces the static power significantly without imposing any area overhead because most of the routing multiplexers are unused in an FPGA.

A directional coarse‐grained power‐gated FPGA switch box and a power‐gating (PG) aware routing algorithm were proposed by Hoo et al. [43] to address the leakage current concern in FPGAs. After considering the trade‐offs among different PG designs, the authors proposed: (1) a novel directional coarse‐grained power‐gated FPGA switch box and (2) a power‐aware routing algorithm that leverages the new PG architecture. In their proposed architecture, multiple buffers in each direction of the switch box are power gated independently of the buffers in the other directions. Because of the homogeneous structure of the switch box, proper sizing of the sleep transistors is not an issue. To maximize the leakage reduction of the coarse‐grained PG architecture, they also adapted the routing algorithm by proposing a new cost function for the VPR routing algorithm to support the new routing architecture.

4.2. Dynamic power reduction

Dynamic power is consumed during normal operation when switches toggle. It depends on the frequency of operation, the load capacitance, and the square of the power supply, as is clear from Eq. (1). The total dynamic power consumed by a device is the sum of the dynamic power of each resource. Because of the programmability of the FPGA, the dynamic power is design dependent. The important contributors to dynamic power are the effective parasitic capacitance of the resources, resource utilization, and the switching activity of the resources [44]. The effective capacitance of the resources comes from the parasitic capacitance of the interconnect wires and transistors. The dynamic power of the device can be reduced by addressing each of the parameters in Eq. (1) effectively. Various methods have been proposed by researchers to handle dynamic power consumption [37, 45–47]. The generally adopted methods are using clocking schemes, reducing the toggling activity of the logic, and reducing RAM and I/O power.

Since faster switching logic consumes more dynamic power than slower switching logic, the clock should be partitioned so that a fast clock is assigned only to those portions of the logic that require it and a slow clock is assigned to those that can run at a lower speed. In this way, the switching activity of the various logic portions can be controlled to reduce the overall dynamic power [9, 10, 15].
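The effect of clock partitioning on the summed per‐resource dynamic power can be sketched as follows. This is a minimal illustration assuming each resource follows P = η·C·Vdd²·f; the `Resource` class, the resource names, and all values are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    c_eff: float     # effective parasitic capacitance in farads
    activity: float  # switching activity, between 0 and 1
    freq: float      # clock frequency driving this resource, in hertz


def total_dynamic_power(resources, vdd):
    """Sum the per-resource dynamic power, assuming P = activity * C * Vdd^2 * f."""
    return sum(r.activity * r.c_eff * vdd ** 2 * r.freq for r in resources)


# Hypothetical design in which every resource runs on one fast 200 MHz clock.
single_clock = [
    Resource("datapath", 2e-12, 0.3, 200e6),
    Resource("control", 1e-12, 0.1, 200e6),
]

# Same design with the control logic moved to a slower 50 MHz clock domain.
partitioned = [
    Resource("datapath", 2e-12, 0.3, 200e6),
    Resource("control", 1e-12, 0.1, 50e6),
]

p_single = total_dynamic_power(single_clock, vdd=1.0)
p_partitioned = total_dynamic_power(partitioned, vdd=1.0)
print(f"single clock: {p_single:.3e} W, partitioned: {p_partitioned:.3e} W")
```

Only the resource moved to the slower domain contributes less power, so the saving depends on how much of the design can tolerate the slow clock.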

Dynamic voltage scaling is another power‐saving design technique because the supply voltage significantly impacts power efficiency. The power supply scaling technique can be used in the design of power‐efficient FPGAs by considering devices such as tunnel‐FETs and FinFETs [48–51], because these devices can operate at ultra‐low voltage.

The dual or multi‐Vdd techniques [52–54] are other important methods to reduce dynamic power. In the dual‐Vdd scheme, the noncritical delay circuit is connected to a low power supply, whereas the delay‐critical circuit is powered by a high voltage. This concept is also applied in FPGA design [55–57]. In a heterogeneous architecture, some logic blocks are fixed to operate at the high power supply and some logic blocks (not limited by speed) are fixed to operate at the low voltage. This heterogeneous scheme yields only a small power saving because of the rigidity of the fixed fabric and the loss associated with the mandatory use of low Vdd in certain cases. The dual‐Vdd technique cannot be applied to the interconnect wires, which are the main source of power consumption. To overcome this problem, Li et al. [58] have proposed a Vdd programmability technique to reduce the power consumption of the interconnect wires. They selectively applied low Vdd to interconnect circuits such as routing and connection switches. The Vdd selection for different applications is obtained by applying the programmable dual‐Vdd technique to both logic blocks and interconnect. On average, they observed a total power reduction of 50–55%.

Although voltage scaling is the best way to reduce the power consumption in an FPGA array, it comes at the expense of circuit performance. To improve the power efficiency of FPGAs without sacrificing performance, Li et al. [59] have explored the option of different supply voltage (Vdd) levels. According to the authors, a predefined dual‐Vdd FPGA fabric, in general, cannot achieve a better power‐performance trade‐off than Vdd scaling because the predefined dual‐Vdd fabric is not flexible enough for a variety of applications. To address this issue, they introduced field programmability of the Vdd level by proposing three types of logic blocks: H‐block, L‐block, and P‐block, as shown in Figure 1. The H‐block and L‐block are connected to the supply voltages VDDH and VDDL, respectively. The H‐block provides higher speed because of its high supply voltage, whereas the L‐block has reduced power consumption at the cost of increased delay. They implemented the P‐block by inserting PMOS transistors (called power switches) between the power supply rails and the logic block. Configuration bits control the switching behavior of these switches so that an appropriate supply voltage can be chosen for the P‐block. To avoid short‐circuit current, they introduced a level converter between VDDH and VDDL.
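The idea of choosing a supply level per programmable block can be sketched in Python. This is a deliberately simplified per‐block check, not the assignment algorithm of Li et al.: the function `assign_vdd`, the `delay_penalty` factor, and the block data are hypothetical, and a real flow would rely on path‐based timing analysis.

```python
def assign_vdd(blocks, delay_penalty=1.5):
    """Choose a supply level for each programmable block.

    blocks maps a block name to (delay_at_vddh, time_budget), both in ns.
    delay_penalty is the assumed slowdown factor when running at VDDL.
    A block is assigned VDDL only if it still meets its budget when slowed down.
    """
    assignment = {}
    for name, (delay_h, budget) in blocks.items():
        meets_budget_at_low = delay_h * delay_penalty <= budget
        assignment[name] = "VDDL" if meets_budget_at_low else "VDDH"
    return assignment


# Hypothetical blocks: the ALU is delay critical, the decoder has timing slack.
blocks = {"alu": (4.0, 5.0), "decoder": (2.0, 5.0)}
assignment = assign_vdd(blocks)
print(assignment)
```

In this toy example the ALU stays on the high supply because slowing it down would violate its budget, while the decoder can move to the low supply.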


Figure 1.

Logic blocks in dual‐Vdd and Vdd‐programmable FPGAs [59].

Selective power‐down is another method to save power in FPGAs. This technique (known as power gating) shuts down the power supply of those portions of a chip that have not performed any task for a long time, which reduces the static power considerably. This can be achieved by implementing a multisupply strategy in which the power grid of some blocks is decoupled from that of others to allow selective shutdown. Sleep modes within the FPGA architecture can also be deployed to selectively reduce the power supply of those blocks that are not in use [60, 61].

Power consumption in the interconnect dominates dynamic power in FPGAs [62–64] because of the interconnect structure, which consists of prefabricated wire segments. Each segment is attached to used and unused switches. Wire lengths in FPGAs are generally longer than in ASICs because of the larger area consumed by SRAM cells and circuitry. The large power consumption of the interconnect makes it a high‐priority target for power optimization. Anderson et al. [65] have presented a novel FPGA routing switch design to reduce leakage and dynamic power consumption. The switch can be programmed to operate in any one of three modes: high speed, low power, or sleep. In high‐speed mode, the power and performance characteristics are similar to those of current FPGA routing switches. Low‐power mode offers reduced leakage and dynamic power at the cost of degraded performance. Sleep mode, which is suitable for unused switches, reduces the static power drastically. The authors proposed the switch design shown in Figure 2 on the basis of three key observations, which are specific to FPGA interconnect and hold for the majority of Xilinx Spartan‐3 commercial FPGAs: (1) routing switch inputs are tolerant to "weak‐1" signals, (2) there exists sufficient timing slack in typical FPGA designs to allow a considerable fraction of routing switches to be slowed down without impacting the overall design performance, and (3) most routing switches simply feed other routing switches. The designed switch includes a parallel combination of NMOS and PMOS sleep transistors, whose states determine the operating mode. In high‐speed mode, the PMOS is turned ON, which results in a full rail‐to‐rail swing of the output. The gate terminal of the NMOS is left at Vdd in high‐speed mode. During a 0‐to‐1 logic transition, the virtual Vdd may temporarily drop below Vdd - VTH, causing the NMOS to leave cut‐off and assist with charging the switch's output load.
In low‐power mode, the PMOS is turned OFF and the NMOS is turned ON. The buffer is powered by the reduced voltage, VVD ≈ Vdd - VTH.
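The relation between the programmed mode and the buffer's virtual supply can be sketched as below. Only the two modes whose supply level is stated in the text are modeled; the `Mode` enum, the function name, and the voltage values are hypothetical, and the sketch ignores transient effects such as the temporary droop described above.

```python
from enum import Enum


class Mode(Enum):
    HIGH_SPEED = "high_speed"  # PMOS sleep transistor ON
    LOW_POWER = "low_power"    # PMOS OFF, NMOS ON


def virtual_vdd(mode, vdd, vth):
    """Approximate steady-state virtual supply seen by the switch buffer."""
    if mode is Mode.HIGH_SPEED:
        # The PMOS passes the full rail, so the output swings rail to rail.
        return vdd
    if mode is Mode.LOW_POWER:
        # The NMOS drops roughly one threshold voltage: VVD ~ Vdd - VTH.
        return vdd - vth
    raise ValueError(f"unsupported mode: {mode}")


# Hypothetical supply and threshold voltages, for illustration only.
v_fast = virtual_vdd(Mode.HIGH_SPEED, vdd=1.2, vth=0.3)
v_slow = virtual_vdd(Mode.LOW_POWER, vdd=1.2, vth=0.3)
print(f"high speed: {v_fast:.2f} V, low power: {v_slow:.2f} V")
```

The reduced swing in low‐power mode is what lowers both dynamic and leakage power, at the price of a slower buffer.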


Figure 2.

Proposed new programmable low‐power FPGA routing switch [65].

Clock gating is an effective and widely used method to reduce dynamic power. This technique is based on the principle that only the active portions of the system should be connected to the clock tree, while the others should not be served by it. A logic circuit must be included in the design to select which portions are clocked and which are blocked. This reduces switching activity, which results in dynamic power saving. Clock gating can be applied at the chip level as well as at the design level. The technique has been used successfully in ASICs, but it is not very effective in SRAM‐based FPGAs because a large component of the power consumption in an FPGA is due to the switching activity of the clock signals along the routing switches. For this reason, researchers have investigated modifying the way a circuit is mapped onto the FPGA array by acting on the synthesis, technology mapping, or placement and routing algorithms [66, 67]. Since the clock is distributed across the chip through the global FPGA routing network, the placement of clock loads has a considerable impact on clock wire usage. Clock loads should be placed so as to minimize clock capacitance, which results in lower dynamic power consumption.

Placement and routing (P&R) on the chip also affects the dynamic power consumption because it determines the total parasitic capacitance in the design. To minimize the parasitic capacitance, it is essential to optimize the P&R strategy. It is always advisable to place two connected functional instances close together because this reduces the interconnect wire length, which in turn reduces the capacitive loading of the net and leads to dynamic power reduction. Modern FPGA development software typically supports power‐driven layout to accomplish this task automatically. Power‐driven layout tools examine the connections between functional instances for optimization [68–70]. Power‐analysis tools are used to further optimize the power saving. They examine each subcomponent in a design hierarchy to highlight its power consumption. Careful examination of this information and subsequent manipulation of the design can result in significant power savings.

Reducing the power supply of the I/O can save up to 80% of the dynamic power. The switching activity of the I/O can be controlled by using techniques such as time multiplexing, design partitioning for minimum I/O count [71–73], and reducing I/O drive strength/slew rates. A considerable amount of dynamic power can be saved by adopting differential I/O standards and resistively terminated I/O standards for the highest toggling frequencies and single‐ended I/O standards for low toggling frequencies.

Tsang et al. [74] have studied the effectiveness of employing precomputation in reducing dynamic power consumption in commercial off‐the‐shelf (COTS) FPGAs. Precomputation is a high‐level logic optimization technique that lowers the power consumption of a design by disabling part of the circuit based on a few relatively simple precomputation conditions. With careful design considerations and increased logic utilization, the power consumption can be reduced by disabling a much larger part of the design with a negligible increase in resource overhead.
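A textbook illustration of precomputation, not taken from Tsang et al., is an unsigned comparator that inspects only the most significant bits first. When the two MSBs differ, the result is already known and the remaining comparison logic could be disabled. The function name and the 8‐bit width in the sketch below are assumptions made for illustration.

```python
def compare_with_precomputation(a, b, width=8):
    """Compare two unsigned integers of the given bit width.

    Returns (a > b, full_logic_enabled). full_logic_enabled is False when
    the precomputation condition (differing MSBs) already decides the result.
    """
    msb_a = (a >> (width - 1)) & 1
    msb_b = (b >> (width - 1)) & 1
    if msb_a != msb_b:
        # Precomputation hit: the operand with MSB = 1 is the larger one.
        return msb_a == 1, False
    # MSBs are equal, so the lower bits must be compared by the full logic.
    mask = (1 << (width - 1)) - 1
    return (a & mask) > (b & mask), True


# Count how often the full comparator could stay disabled for 8-bit inputs.
pairs = [(a, b) for a in range(256) for b in range(256)]
disabled = sum(1 for a, b in pairs if not compare_with_precomputation(a, b)[1])
disabled_fraction = disabled / len(pairs)
print(f"full logic disabled for {disabled_fraction:.0%} of input pairs")
```

For uniformly distributed inputs the precomputation condition holds for half of all input pairs, so the larger comparison logic would switch only half of the time in this example.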

Several other techniques that address dynamic power consumption in FPGAs are presented in detail in the literature [10, 75–77].

5. SRAM power reduction

The design of low‐power, high‐performance SRAM cells has become a necessity in today's FPGAs because SRAM is a critical component in FPGA design. Although an SRAM‐based FPGA occupies a larger area on the chip, one of the most useful SRAM‐based structures remains the lookup table (LUT).

SRAM‐based FPGAs, such as those manufactured by Xilinx and Altera, make up the largest fraction of the overall market. These FPGAs use SRAM for routing and programmability, typically through LUTs and multiplexers. Because of the large number of cells within the SRAM FPGA interconnect, a considerable leakage current (of the order of milliamps) flows at standby [78]. Moreover, leakage current increases as process geometry shrinks, which further exacerbates the power problem. The dynamic power consumption in the cell is also a serious concern because the large parasitic capacitance of the long metal bitline results in heavy charging/discharging activity at the bitline. Studies of the leakage current and dynamic power in the Xilinx Spartan‐3 FPGA [79] (Figure 3) and the Xilinx Virtex‐4 [80] (Figure 4) show that the major contributor to power consumption in an FPGA is the configuration SRAM; hence, new design techniques are essential to increase battery lifetime. Several techniques have been proposed in the literature [81–85] to address the power consumption problem in the SRAM cell. It is worthwhile to disable SRAM devices that are temporarily unused, which avoids power consumption by unused components. A system controller can deactivate a device when it is not required in the current operation, or put the device into its sleep mode when it will not be accessed for an extended period of time. Implementing such a system controller in the FPGA reduces the overall switching activity of the system. As discussed by Tuan et al. [86], the data of the configuration SRAM cell change only when the FPGA is configured, and the FPGA is configured only when the power supply is turned on. Therefore, it is necessary to control the leakage current in the cell during the idle phase to reduce the overall power.
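The behavior of such a system controller can be sketched as a small idle‐timer state machine. The class name `SramPowerController`, the `idle_threshold` parameter, and the two‐state model are hypothetical simplifications and are not taken from the cited works.

```python
class SramPowerController:
    """Toy controller that puts an SRAM device to sleep after a run of idle cycles."""

    def __init__(self, idle_threshold):
        self.idle_threshold = idle_threshold  # idle cycles tolerated before sleeping
        self.idle_cycles = 0
        self.state = "active"

    def tick(self, accessed):
        """Advance one cycle; accessed is True if the device was used this cycle."""
        if accessed:
            # Any access wakes the device and restarts the idle count.
            self.idle_cycles = 0
            self.state = "active"
        else:
            self.idle_cycles += 1
            if self.idle_cycles >= self.idle_threshold:
                self.state = "sleep"
        return self.state


# Hypothetical access trace: one access, three idle cycles, then another access.
controller = SramPowerController(idle_threshold=3)
states = [controller.tick(a) for a in [True, False, False, False, True]]
print(states)
```

The threshold trades leakage saving against wake‐up overhead: a small value sleeps aggressively, while a large value avoids frequent wake‐ups.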


Figure 3.

Leakage power breakdown in Xilinx Spartan [79].


Figure 4.

Dynamic power breakdown in Xilinx Virtex‐4 [80].

Wang et al. [87] have proposed the design of an ultra‐low voltage 9T SRAM cell. Their designed cell consists of a 6T SRAM part (for write operation) and a dedicated read port. The read port comprises three NMOS transistors for realizing equalized bitline leakage and improving bitline sensing margin in a single‐ended read bitline (RBL). The write access paths and the data storage latch are implemented with HVT devices for leakage reduction while the read port employs LVT devices for better performance. Their test chip shows an improvement of 40% in energy efficiency with the minimum energy per operation of 2.07 pJ at 0.4 V. This design increases the fabrication complexity due to the use of LVT and HVT transistors.

Although much research has been done on the design of power‐efficient SRAM circuits, interest in power‐efficient cell design at the architecture level continues to increase because configuration SRAM cells and circuitry occupy a considerable fraction of the total chip area in an FPGA design. Ye et al. [13] have observed that more than 40% of the total logic block area of an FPGA is occupied by SRAM cells. Such a large area overhead results in longer wires, which leads to larger parasitic capacitance at the load. This increased capacitance increases the dynamic power consumption. The most widely used and well‐accepted SRAM cell is the 6T cell [88] (shown in Figure 5) because of its symmetric structure and larger data storage capacity. The cell has two cross‐coupled inverters that form a latch to keep the programmed data intact. Two pass transistors are used to transfer the data from the bitline to the cell node (write operation) or from the cell node to the bitline (read operation). The actual control of the FPGA is handled by the Q and Qbar outputs. The main drawbacks of the conventional 6T cell are poor stability, large power consumption, and degraded performance.


Figure 5.

Architecture of the conventional 6T SRAM cell [88].

5.1. Subthreshold SRAM cell

Subthreshold operation is achieved when the device is allowed to operate at a power supply (Vdd) lower than its threshold voltage. Using this concept, researchers [89–94] have proposed subthreshold SRAM cells to reduce the overall power consumption in the cell. Teman et al. [95] have designed a robust, low‐voltage SRAM bit cell that uses only five transistors, compared with six in the standard 6T circuit. Their cell can operate at voltages as low as 400 mV in a commercial 40 nm CMOS process. At this supply voltage, the proposed bit cell provides 6σ stability and an average static power reduction of 21× compared to the 6T cell. The main drawback of the circuit is its extra processing complexity due to the use of HVT and SVT transistors.

Calhoun et al. [90] have proposed a 10T subthreshold bit cell (Figure 6). Transistors M1 through M6 form a conventional 6T cell, except that the sources of M3 and M6 are tied to a virtual supply voltage rail (VVDD). The proposed cell has distinct read and write ports to improve the stability of the cell. Eliminating the read SNM problem allows this bitcell to operate at half of the Vdd of a 6T cell while retaining the same 6σ stability. Transistors M7–M10 remove the read SNM problem by buffering the stored data during the read operation. M10 is included in the cell mainly to control the leakage current. Their experimental results show that the proposed cell saves 2.5× and 3.8× leakage power at Vdd = 0.6 V and Vdd = 0.4 V, respectively, at room temperature. This saving is more aggressive (60×) when the power supply is scaled down to 0.3 V.


Figure 6.

Architecture of 10‐T subthreshold bitcell [90].

A 10T SRAM design was proposed by Jiangzheng et al. [96], employing voltage‐lowering techniques to effectively control the leakage current in the cell while allowing the cell to operate in the subthreshold region. The proposed circuit generates a subthreshold read pulse for transferring the data out of the SRAM. The floating write bitlines minimize write bitline leakage at the cost of degraded stability. Short read bitlines improve read speed and suppress read power at the cost of area overhead.

Kushwah et al. [97] have proposed a single‐ended dynamic feedback control 8T static RAM (SRAM) cell to enhance the static noise margin (SNM) for ultralow power supply. It achieves write SNM of 1.4× and 1.28× as that of isoarea 6T and read‐decoupled 8T (RD‐8T), respectively at 300 mV. The standard deviation of write SNM for 8T cell is reduced to 0.4× and 0.56× as that for 6T and RD‐8T, respectively. The proposed 8T consumes about 0.6× less write power and 0.48× less read power than 6T cell.

5.2. Data‐aware power‐efficient SRAM cell

The main drawbacks of subthreshold cells are poor stability and degraded performance. Besides cell leakage, bitline leakage is another dominant contributor to power consumption, and the overall bitline power consumption is data dependent. Many data‐aware cells have been reported in the literature to control the bitline power consumption [98–102]. Chiu et al. [103] proposed an 8T single‐ended subthreshold SRAM with cross‐point data‐aware write operation. In this circuit, the write operation is performed by a traditional write circuit as in the 6T cell, whereas a 2T stacked read buffer is used for the read operation. The stacked read circuit controls the leakage current and improves stability, and the data‐aware cross‐point write operation improves writeability. The main drawback of the circuit is the large voltage swing on the bitline during the write operation.

Chang et al. [104] proposed a 130 mV SRAM with expanded write and read margins for subthreshold applications, which reduces the voltage swing on the respective bitlines during the write operation. The write operation uses two separate signals, SCR and SCL. Properly chosen values of these two signals lower the write power consumption by reducing the discharging activity on the bitline. The isolated read circuit improves the stability of the cell at the cost of large parasitic capacitance and the resource burden of the two extra signals.

Singh et al. [105] designed a data‐aware dynamic 9T SRAM cell to reduce the bitline power consumption. The dynamic nature of the cell flips the data faster at the bitline, so the average discharging activity is reduced. The cell contains nine transistors with isolated read and write circuits. The write operation is performed using a write signal WS, whose value is chosen according to the data being written. Simulation results predicted 47% lower write power consumption compared to the 6T cell. They also observed that, in hold mode with no peripheral devices included in the array, the power saving varies from 42.45 to 61.3% because of lower leakage current from the write bitlines and lower discharging activity at RBL. The cell imposes a hardware and wiring burden due to the extra signal.

Wen et al. [106] proposed a bit‐interleaving‐enabled 8T SRAM architecture. The cell features a shared data‐aware write structure and eliminates the half‐select disturbance. In their design, shared write and separated read behaviors are implemented by activating horizontal cells and vertical bitlines instead of enabling blocks. They also proposed a reference‐based sense amplifier (SA) that coordinates with the column‐selection array to further improve area efficiency. The SRAM operates at a frequency of 125 kHz and consumes a total power of 5.1 μW.

5.3. Data‐dependent‐write‐assist dynamic (DDWAD) SRAM cell

Recently, we designed a power‐efficient SRAM cell [107] that uses a dynamic data‐aware concept for the write operation and the stack effect to control the read leakage current. The architecture of the cell is shown in Figure 7(a). The cell has distinct read and write ports with a single bitline to improve its overall stability. To flip the data at the storage node quickly, without waiting for the bitline BL to charge or discharge completely, we introduced a write signal WS and break the latch of the cell during the write (since WL = high). To control the leakage current in the read circuit during write operation and hold mode, a stack technique (three series‐connected OFF transistors in the read path) is used at the cost of increased delay. The write signal WS is generated according to the data to be stored at Q and Qbar by the circuit shown in Figure 7(b) [107]. During read and hold modes, WS maintains its previous value and the latch nature of the cell is restored to keep the stored data intact. The proposed cell and the other cells were simulated at layout level using Cadence 6.1 CMOS design rules for 65 nm technology. The large write power saving (Figure 8) arises because there is no discharging activity at the bitline BL: in a write 1 operation WS = 0, so NM1 turns OFF and presents a highly resistive path. Similarly, for WS = high, the OFF transistor PM1 does not allow any current to flow between Vdd and ground, which produces a low voltage at the storage node Q. In both write operations, only a small voltage drop occurs at BL, which results in considerable dynamic power saving. Because transistors NM4 and NM6 in the read path are OFF (RWL = 0 during the write operation), the leakage current through RBL is restricted.
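The data dependence of WS described above can be illustrated with a small behavioral model. This is only a sketch under stated assumptions, not the circuit of Figure 7(b): the class name WriteSignalLatch, the step method, and the power‐up value of WS are hypothetical, and the model captures only the two facts given in the text (WS = 0 for a write 1, WS = high for a write 0, and WS holds its previous value in read and hold modes).

```python
from dataclasses import dataclass


@dataclass
class WriteSignalLatch:
    """Hypothetical behavioral stand-in for the WS generator of Figure 7(b)."""

    ws: int = 0  # assumed power-up value; the source does not specify one

    def step(self, write_enable: bool, data: int) -> int:
        """Return WS after one cycle for the given write-enable and data bit."""
        if data not in (0, 1):
            raise ValueError("data must be 0 or 1")
        if write_enable:
            # Per the text: WS = 0 for a write-1, WS = high for a write-0.
            self.ws = 1 - data
        # In read and hold modes WS simply keeps its previous value.
        return self.ws


latch = WriteSignalLatch()
cycles = [(True, 1), (False, 0), (True, 0), (False, 1)]
trace = [latch.step(we, d) for we, d in cycles]
print(trace)  # [0, 0, 1, 1]
```

The second and fourth cycles show the hold behavior: the data input changes, but WS does not toggle because no write is requested.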


Figure 7.

(a) Architecture of DDWAD SRAM cell. (b) Circuit to generate appropriate WS signal depending on write operation [107].


Figure 8.

Total power consumption in data aware cell [107].

Because the precharged RBL is prevented from discharging during a read 1 operation, and because of the stack effect in the read path, a considerable power saving is achieved compared to the conventional 6T cell (Figure 9). In hold mode, WS maintains its value owing to the internal latch. The static power consumption of the proposed cell is lower than that of the 6T cell and of other cells proposed in the literature, irrespective of the power supply (Figure 10). The lower static power in the proposed cell [107] is due to the lower leakage current through the write bitline BL and the stack effect in the read circuit. During simulation, we observed that the static power consumption of the proposed cell varies only slightly with temperature, which reflects the robustness of the cell against temperature. The data at the storage nodes are held firmly at their respective values for a minimum supply voltage in the range 300 mV ≤ Vddmin ≤ 400 mV. The proposed cell shows greater immunity to statistical variation owing to the signal WS, as discussed in detail in our published paper [107].


Figure 9.

Read power consumption [107].


Figure 10.

Hold leakage power at various power supplies [107].

Although the proposed cell imposes an area overhead compared to the conventional 6T cell, this is not a serious threat in FPGA implementation: because the leakage current through the bitline is lower, a larger number of cells can be connected to a single bitline in the array.
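The link between bitline leakage and the number of cells per bitline can be made concrete with a common first‐order criterion: the read current of the accessed cell should exceed the summed leakage of the unaccessed cells on the same bitline by some margin. The sketch below assumes that criterion; the function name, the margin factor, and all current values are illustrative assumptions and are not taken from the chapter or from [107].

```python
def max_cells_per_bitline(i_read_pa: int, i_leak_pa: int, margin: int) -> int:
    """Largest N such that i_read >= margin * (N - 1) * i_leak.

    Currents are integers in picoamperes so that the arithmetic is exact.
    """
    if i_read_pa <= 0 or i_leak_pa <= 0 or margin <= 0:
        raise ValueError("all arguments must be positive")
    return 1 + i_read_pa // (margin * i_leak_pa)


# Illustrative numbers only: 10 uA read current, 10x margin.
baseline = max_cells_per_bitline(10_000_000, 10_000, 10)  # 10 nA leakage per cell
halved = max_cells_per_bitline(10_000_000, 5_000, 10)     # 5 nA leakage per cell
print(baseline, halved)  # 101 201
```

Under this simple model, halving the per‐cell bitline leakage roughly doubles the number of cells that can share a bitline, which is the sense in which lower leakage offsets the area overhead of a larger cell.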

In an SRAM‐based FPGA, memory accesses are performed with a designed clock and a series of interface circuits such as row/column decoders, write/read enable circuits, etc. These peripheral circuits consume considerable power in the chip. To implement an array using the proposed cell, we adopted a hierarchical design approach in which, instead of routing individual signals (WS, WL, and RWL) to each cell, global signal circuits are used [108]. The main advantage of the hierarchical design is the use of shorter wires within local blocks, which reduces parasitic capacitances. In this approach, only one block address can be activated at a time, which saves considerable power. Each global signal is connected to the corresponding local signal through an NMOS pass transistor to save area. A column‐based approach is adopted in which the signal WS is routed parallel to the write bitline BL. To avoid column half‐select disturbance in the array due to toggling of the signal WS during the write operation, we proposed the circuit shown in Figure 7(b) [107].
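The idea of activating a single block at a time can be sketched in a few lines. The organization below (8 blocks of 64 rows by 64 columns, 32,768 cells in total) and all function names are hypothetical choices made for illustration; the actual partitioning used in [108] is not given in this chapter. The NMOS pass transistor is modelled as an ideal gate, which ignores the fact that a real disabled pass transistor leaves the local node floating rather than driving it low.

```python
BLOCKS, ROWS, COLS = 8, 64, 64  # hypothetical organization: 8 * 64 * 64 = 32768 cells


def decode(addr: int) -> tuple[int, int, int]:
    """Split a flat cell address into (block, row, column)."""
    if not 0 <= addr < BLOCKS * ROWS * COLS:
        raise ValueError("address out of range")
    block, rest = divmod(addr, ROWS * COLS)
    row, col = divmod(rest, COLS)
    return block, row, col


def block_enables(addr: int) -> list[int]:
    """One-hot block select: only the addressed block is activated."""
    block, _, _ = decode(addr)
    return [int(b == block) for b in range(BLOCKS)]


def local_signal(global_signal: int, enable: int) -> int:
    """Idealized pass-transistor gating of a global signal (WS, WL or RWL)."""
    return global_signal & enable


addr = 5000
print(decode(addr))         # (1, 14, 8)
print(block_enables(addr))  # [0, 1, 0, 0, 0, 0, 0, 0]
# A global WL pulse reaches only the local wordline of the enabled block.
print([local_signal(1, en) for en in block_enables(addr)])
```

Only the enabled block sees the global signal toggle, so the switched wire capacitance per access is limited to one block's local wiring plus the global line.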

5.4. Proposed decoder circuits and sense amplifier

The signals and circuits that most affect power dissipation in an SRAM memory are the address lines, the read and write enable circuits, the block select, and the sense amplifier. To address these concerns, we designed new architectures for these circuits to reduce their power consumption. Details of these circuits are available in our published work [108, 109].

The proposed column decoder circuit is shown in Figure 11 [108], where CLj represents the address of the column to be selected (j is an integer). The architecture of the other decoder circuits is explained in Ref. [108]. Since the proposed decoder is implemented without the NAND gates used in the conventional decoder, the number of transistors is reduced to 546 compared to 1939 in the conventional decoder [108]. The reduced number of transistors results in lower parasitic capacitance, which leads to approximately 76% power saving [108]. The proposed WL driver consumes less power than other designs owing to the compactness of the circuit.
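For reference, the transistor counts quoted above translate into the following relative reduction. The snippet is plain arithmetic on the two numbers from the text; the variable names are arbitrary. Note that the 76% power saving is a separate simulation result reported in [108] and is not derived from the transistor count alone.

```python
conventional, proposed = 1939, 546  # transistor counts quoted from [108]

reduction = (conventional - proposed) / conventional
ratio = conventional / proposed

print(f"transistor reduction: {reduction:.1%}")  # 71.8%
print(f"conventional/proposed: {ratio:.2f}x")    # 3.55x
```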


Figure 11.

Proposed decoder [108].

It is well known that the sense amplifier accounts for most of the current dissipated in the SRAM. To address this issue, we also designed a single‐ended sense amplifier (SA) [109]. The proposed SA reduces the power consumption by controlling the leakage current during evaluation/precharge mode. The circuit can be used even at higher temperatures with minimal power consumption. The working of the circuit is explained in detail in Ref. [109].

Table 1 compares the read power consumption of various sense amplifiers. The lower power consumption of the proposed circuit is mainly due to its lower average current during evaluation mode, the small voltage drop on RBL, and its lower leakage current compared to the other circuits [110, 111]. During hold mode, the power consumption of the proposed circuit is lower than that of the other circuits [110, 111] owing to the gating effect.

Type of circuit | Read 0 power (µW) | Read 1 power (µW)
Ref. [110] | 26.674 | 59.856
Ref. [111] | 77.840 | 18.795

Table 1.

Read power consumption in various SA [109].

We implemented a 32 kb SRAM array using the proposed cell, decoder circuits, and sense amplifier. The simulation results were compared with the array of Ref. [112]. The results are encouraging in terms of power consumption, as seen in Figures 12 and 13. The lower hold power obtained in the implemented array is due to the write signal WS and the stack effect in the read path.


Figure 12.

Write power consumption in 32 kb SRAM array.


Figure 13.

Read power consumption in 32 kb array.

The overall reduction in dynamic and static power in the proposed cell, decoder, and sense amplifier makes them a strong choice for implementing power‐efficient and reliable SRAM‐based FPGAs.

6. Conclusion

Various issues related to power consumption in FPGAs have been discussed in detail, together with the solutions and techniques presented in the literature. Power gating/clock gating, dual‐threshold/multithreshold voltage, programmable Vdd, etc. are important and well‐accepted methods to control static and dynamic power consumption in SRAM‐based FPGAs. SRAM is the basic component of an SRAM‐based FPGA; it occupies a large area of the chip and consumes a considerable amount of static and dynamic power. The power consumption in the cell can be reduced by shortening the bitline, designing compact peripheral circuits, or improving the cell at the architecture level. Researchers have proposed subthreshold SRAM cells to reduce the power consumption, but subthreshold operation degrades the reliability of the cell. To address both dynamic and static power consumption in the cell, a data‐aware cell with isolated write and read ports has been proposed, in which each operation is performed on a single bitline. Power‐efficient peripheral circuits such as the write/read decoder, address decoder circuit, and sense amplifier were also presented in the chapter to realize the SRAM array. The proposed cell and the implemented array consume lower overall power owing to lower discharging activity at BL and leakage current control through the stack effect. The area overhead of the proposed cell is not a serious threat in the implementation of the array because, with lower bitline leakage, a larger number of cells can be connected to the same bitline.


1 - W.S. Carter, K. Duong, R.H. Freeman, H.C. Hsieh, J.Y. Ja, J.E. Mahoney, L.T. Ngo, and S.L. Sze, “A user programmable reconfigurable logic array,” in IEEE 1986 Custom Integrated Circuits Conferences, pp. 233–235, 1986.
2 - A. Gupta, V. Aggarwal, R. Patel, P. Chalasani, D. Chu, P. Seeni, P. Liu, J. Wu, and G. Kaat, “A user configurable gate array using CMOSEPROM technology,” in Proceedings of Custom Integrated Circuits Conferences, pp. 31.7.1–31.7.4, 1990.
3 - J. Birkner et al., “A very‐high‐speed field‐programmable gate array using metal‐to‐metal antifuse programmable elements,” Microelectronics Journal, vol. 23, pp. 561–568, 1992.
4 - D. Tavana, W. Yee, S. Young, and B. Fawcett, “Logic block and routing considerations for a new SRAM‐based FPGA architecture,” in Proceedings of Custom Integrated Circuits Conferences, pp. 511–514, 1995.
5 - R. Patel et al., “A 90.7 MHz‐2.5 million transistors CMOS PLD with JTAG boundary scan and in‐system programmability,” in Proceedings of Custom Integrated Circuits Conferences, pp. 507–510, 1995.
6 - P. Chow, S.O. Seo, J. Rose, K. Chung, G. Páez‐Monzón, and I. Rahardja, “The design of an SRAM‐based field‐programmable gate array - part I: Architecture,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 7, no. 2, pp. 191–197, 1999.
7 - P. Graham, M. Caffrey, J. Zimmerman, D.E. Johnson, P. Sundararajan, and C. Patterson, “Consequences and categories of SRAM FPGA configuration SEUs,” Proceedings of the Military and Aerospace Applications of Programmable Logic Devices (MAPLD), Washington DC, September 2003.
8 - C. Bolchini, A. Miele, and C. Sandionigi, “A novel design methodology for implementing reliability ‐ aware system on SRAM based FPGAs”, IEEE Transactions on Computers, vol. 60, no. 12, pp. 1744–1758, 2011.
9 - J. Lamoureux and W. Luk, “An overview of low‐power techniques for field‐programmable gate arrays”, Proceedings of IEEE NASA/ESA Conference on Adaptive Hardware and Systems, pp. 338–345, 2008.
10 - P. Singh and S.K. Vishvakarma, “Device/circuit/architectural techniques for ultra‐low power FPGA design,” Microelectronics and Solid‐State Electronics, vol. 2, no. 2A, pp. 1–15, 2013.
11 - I. Brynjolfson and Z. Zilic, “Dynamic clock management for low‐power applications in FPGAs”, Proceedings of IEEE Custom Integrated Circuits Conference, pp. 139–142, 2000.
12 - K. Shahzad and B. Oelmann, “Investigation of energy consumption of an SRAM‐based FPGA for duty‐cycle applications”, in ParaFPGA2013, Parallel Computing with FPGAs, Munich, Germany, 10–13 September 2013.
13 - A. Ye, J. Rose, and D. Lewis, “Using multi‐bit logic blocks and automated packing to improve field‐programmable data path circuits”, in IEEE International Conference on Field‐Programmable Technology, pp. 129–136, Brisbane, Australia, 2004.
14 - S.E. Wahlstrom, “Programmable Logic arrays — cheaper by the millions,” Electronics, vol. 40, pp. 90–95, 1967.
15 - N. Grover and M.K. Soni, “Reduction of power consumption in FPGAs – an overview,” Information Engineering and Electronic Business, vol. 5, pp. 50–69, 2012.
16 - J. Tarrillo and F.L. Kastensmidt, “Estimating power consumption of multiple modular redundant designs in SRAM‐based FPGAs for high dependable applications,” in 24th International Workshop on Power and Timing Modeling, Optimization and Simulation (PATMOS), 2014.
17 - K. Roy, S. Mukhopadhyay, and H. Mahmoodi‐Meimand, “Leakage current mechanisms and leakage reduction techniques in deep‐submicrometer CMOS circuits,” IEEE Proceeding, vol. 91, no. 2, pp. 305–327, 2003.
18 - P.F. Butzen and R.P. Ribas, “Leakage current in sub‐micrometer CMOS gates,” University of Federal do Rio Grande do Sul, 2005.
19 - Y.‐B. Kim, “Challenges for nanoscale MOSFETs and emerging nanoelectronics,” Transactions on Electrical and Electronic Materials, vol. 11, no. 3, pp. 93–105, 2010.
20 - N.H.E. Weste, D. Harris, and A. Banerjee, “CMOS VLSI Design a Circuits and Systems Perspective,” 4th Edition, Addison‐Wesley, ISBN 10: 0‐321‐54774‐8, ISBN 13: 978‐0‐321‐54774‐3, UK, 2005.
21 - D. Rittman, “Structured ASIC design: A new design paradigm beyond ASIC, FPGA and SoC”, 2004.
22 - S.M.H. Ho, “Structured ASIC: Methodology and comparison,” Proceedings of 2010 International Conference Field‐Programmable Technology (FPT), pp. 377–380, 2010.
23 - Y. Cai, K. Mai, and O. Mutlu, “Comparative evaluation of FPGA and ASIC implementations of buffer less and buffered routing algorithms for on‐chip networks,” in Proceedings of the International Symposium on Quality Electronic Design (ISQED), pp. 475–484, 2015.
24 - F. Li, Y. Lin, L. He, and J. Cong, “Low‐power FPGA using pre‐defined dual‐Vdd/dual‐Vt fabrics”, in Proceedings of ACM/SIGDA International Symposium on Field Programmable Gate Arrays, pp. 42–50, 2004.
25 - A. Kumar and M. Anis, “Dual‐threshold CAD framework for subthreshold leakage power aware FPGAs,” in IEEE Transactions of Computer‐Aided Design of Integrated Circuits and Systems, vol. 26, no. 1, pp. 53–66, 2007.
26 - R. Jaramillo‐Ramirez and M. Anis, “A dual‐threshold FPGA routing design for subthreshold leakage reduction,” in 2007 IEEE International Symposium on Circuits and Systems, New Orleans USA, pp. 3724–3727, 27–30 May 2007.
27 - S. Bae, R. Krishnan, and N. Vijaykrishnan, “A novel low area overhead body bias FPGA architecture for low power applications,” IEEE Computer Society Annual Symposium on VLSI, pp. 193–198, 2009.
28 - J.H. Anderson, and F.N. Najm, “Active leakage power optimization for FPGAs,” IEEE Transactions on Computer‐Aided Design of Integrated Circuits and Systems, vol. 25, no. 3, pp. 423–437, 2006.
29 - S. Narendra, S. Borkar, V. De, D. Antoniadis, and A. Chandrakasan, “Scaling of stack effect and its application for leakage reduction”, in Proceedings of International Symposium on Low Power Electronic Design (ISLPED), pp. 195–200, 2001.
30 - F. Fallah and M. Pedram, “Standby and active leakage current control and minimization in CMOS VLSI circuits”, IEICE Transactions on Electronics (Special Section on Low‐Power LSI and Low‐Power IP), vol. E88‐C, no. 4, pp. 509–519, 2005.
31 - P.E. Gaillardon, E. Beigne, S. Lesecq, and G. De Micheli, “A survey on low‐power techniques with emerging technologies: From devices to systems”, ACM Journal on Emerging Technologies in Computing Systems, vol. 12, no. 2, pp. 12.1–12.26, 2015.
32 - A.A.M. Bsoul and S.J.E. Wilton, “An FPGA architecture supporting dynamically controlled power gating,” in International Conference on Field‐Programmable Technology, ser. FPT'10, pp. 1–8, 2010.
33 - M.K.J. Hussein and M. Hart, “Lowering power at 28 nm with Xilinx 7 series devices,” Xilinx, White Paper, WP389 (v1.2), 2013.
34 - B. Calhoun, F. Honore, and A. Chandrakasan, “Design methodology for fine‐grained leakage control in MTCMOS,” in Proceedings of IEEE International Symposium on Low Power Electronics and Design (ISLPED), 2003.
35 - A. Gayasen, Y. Tsai, N. Vijaykrishnan, M. Kandemir, M. Irwin, and T. Tuan, “Reducing leakage energy in FPGAs using region‐constrained placement”, Proc. ACM/SIGDA Int. Symp. Field Programmable Gate Arrays, pp. 51–58, 2004.
36 - V. George and J. Rabaey, “Low‐Energy FPGAs: Architecture and Design,” Kluwer Publication, New York, 2001.
37 - K. Poon, A. Yan, and S.J.E. Wilton, “A flexible power model for FPGAs”, in Proceedings of Int. Conf. Field Programmable Logic and Applications, pp. 312–321, 2002.
38 - J. Lach, J. Brandon, and K. Skadron. “A general post‐processing approach to leakage current reduction in SRAM‐based FPGAs.” In International Conference on Computer Design, 2004.
39 - R. Ahmed, “Towards High‐Level Leakage Power Reduction Techniques for FPGAs,” PhD Thesis, College of Graduate Studies (Electrical Engineering), University of British Columbia (Okanagan), 2015.
40 - C.Q. Tran, H. Kawaguchi, and T. Sakurai, “The 95% leakage reduced FPGA using zigzag power‐gating, Dual‐VTH/VDD and Micro VDD hopping,” in 2005 Asian Solid‐State Circuits Conference, Hsinchu, pp. 149–152, 2005.
41 - S. Srinivasan, A. Gayasen, and T. Tuan, “Leakage control in FPGA routing fabric”, in Proceedings of Asia South Pacific Design Automation Conference, pp. 661–664, 2005.
42 - M. Hasan, A.K. Kureshi, and T. Arslan. “Leakage reduction in FPGA routing multiplexers,” in 2009 IEEE International Symposium on Circuits and Systems, Taipei, pp. 129–1132, 24–27 May 2009.
43 - C.H. Hoo, Y. Ha, and A. Kumar, “A directional coarse‐grained power gated FPGA switch box and power gating aware routing algorithm”, in Proceedings of 23rd International Conference on Field Programmable Logic and Applications (FPL), pp. 1–4, 2013.
44 - J. Anderson and F. Najm, “Power estimation techniques for fpgas,” IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 12, no. 10, 2004.
45 - F. Li, Y. Lin, L. He, D. Chen, and J. Cong, “Power Modeling and Characteristics of Field Programmable Gate Arrays,” IEEE Transactions on Computer‐Aided Design of Integrated Circuits and Systems, vol. 24, no. 11, pp. 1712–1724, 2005.
46 - J.H. Anderson, “Power optimization and prediction techniques for FPGAs”, Department of Electrical and Computer Engineering, University of Toronto, 2005.
47 - J.R. Templin and J.R. Hamle, “A new power‐aware FPGA design metrics,” Journal of Cryptographic Engineering, vol. 5, no. 1, pp. 1–11, 2015.
48 - R. Mukundrajan, “Tunnel FET based field programmable gate arrays”, PhD Thesis, The Graduate School, College of Engineering, The Pennsylvania State University, USA, 2011.
49 - M. Abusltan and S.P. Khatri, “A comparison of FinFET based FPGA LUT design,” in Proceeding GLSVLSI'14, 24th Edition of Great Lakes Symposium on VLSI, pp. 353–358, 2014.
50 - M.M. El‐Din, H. Mostafa, H.A.H. Fahmy, Y. Ismail, and H. Abdelhamid, “Performance evaluation of FinFET‐based FPGA cluster under threshold voltage variation,” in 13th International Conference on New Circuits and Systems Conference (NEWCAS), Grenoble, pp. 1–4, 7–10 June 2015.
51 - A. Davidson, “A new FPGA architecture and leading‐edge FinFET process technology promise to meet next generation system requirements,” High‐End FPGA Products, San Jose, CA, June 2015.
52 - W. Hung, Y. Xie, N. Vijaykrishnan, M. Kandemir, M.J. Irwin, and Y. Tsai, “Total power optimization through simultaneously multiple‐VDD multiple‐VTH assignment and device sizing with stack forcing,” ISLPED'04, Newport Beach, California, USA, August 9–11, 2004.
53 - H.S. Deogun, R. Senger, D. Sylvester, R. Brown, and K. Nowka, “A dual‐VDD boosted pulsed bus technique for low power and low leakage operation,” in ISLPED'06 Proceeding of the 2006 International Symposium on Low Power Electronics and Design, pp. 73–78, 2006.
54 - K. Kim and V.D. Agrawal, “Ultra low energy CMOS logic using below‐threshold dual‐voltage supply,” Journal of Low Power Electronics, vol. 7, pp. 1–11, 2011.
55 - A. Gayasen, K. Lee, N. Vijaykrishnan, M. Kandemir, M. Irwin, and T. Tuan, “A dual‐Vdd low power FPGA architecture”, in Proceedings of International Conference on Field Programmable Logic and Applications, pp. 145–157, 2004.
56 - F. Li, Y. Lin, H. Lei, and J. Cong, “Low‐power FPGA using pre‐defined dual‐Vdd/Dual‐Vt FABRICS”, in FPGA'04, Monterey, California, USA, 22–24 February 2004.
57 - R. Mukherjee and S. Ogrenci, “Mimic evaluation of dual VDD fabrics for low power FPGAs,” in Proceedings of Asia and South Pacific Design Automation Conference, pp. 1240–1243, 2005.
58 - F. Li, Y. Lin, and L. He, “Vdd programmability to reduce FPGA interconnect power”, in Proceedings of International Conference on Computer‐Aided Design, pp. 760–765, 2004.
59 - F. Li, Y. Lin, and L. He, “Field programmability of supply voltages for FPGA power reduction”, IEEE Transactions on Computer‐Aided Design of Integrated Circuits and Systems, vol. 26, no. 4, pp. 752–764, 2007.
60 - Y. Meng, T. Sherwood, and R. Kastner, “Leakage power reduction of embedded memories on FPGAs through location assignment”, in DAC 2006, July 24–28, San Francisco, California, USA, 2006.
61 - I. Ashraf, F. Boccardi, and L. Ho, “Alcatel‐Lucent, SLEEP mode techniques for small cell deployments”, in IEEE Communications Magazine, pp. 72–79, August 2011.
62 - M. Lin and A. El Gamal, “A low‐power field‐programmable gate array routing fabric”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 17, no. 10, pp. 1481–1494, 2009.
63 - S.D. Pable and M. Hasan, “Performance analysis of FPGA interconnect fabric for ultra‐low power applications”, in ICCCS'11, Rourkela, Odisha, India, 12–14 February 2011.
64 - K. Siozios, F. Pavlidis, and D. Soudris, “A novel framework for exploring 3‐D FPGAs with heterogeneous interconnect fabric”, ACM Transactions on Reconfigurable Technology and Systems, vol. 5, no. 1, article 4, pp. 4:1–4:23, 2012.
65 - J. Anderson and F. Najm, “A novel low‐power FPGA routing switch”, in Proceedings of IEEE Custom Integrated Circuits Conference, pp. 719–722, 2004.
66 - T. Gao, K.C. Chen, J. Cong, Y. Ding, and C.L. Liu, “Placement and placement driven technology mapping for FPGA synthesis”, in Proceedings of IEEE International ASIC Conference, pp. 87–91, 1993.
67 - E. Bozorgzadeh, S.O. Memik, X. Yang, and M. Sarrafzadeh, “Routability‐driven packing: Metrics and algorithms for cluster‐based FPGAs,” Journal of Circuits, Systems and Computers, vol. 13, no. 1, pp. 77–100, 2004.
68 - M. Xu and F.J. Kurdahi, “Layout‐driven high level synthesis for FPGA based architectures”, in Proceeding of the Conference on Design, Automation and Test in Europe, DATE'98, pp. 446–450, 1998.
69 - D.P. Singh and S.D. Brown, “Incremental placement for layout‐driven optimizations on FPGAs”, Proc. Int. Conf. Comput.‐Aided Des., pp. 752–759, 2002.
70 - V. Betz and J. Rose, “Circuit design, transistor sizing and wire layout of FPGA interconnect”, IEEE 1999 Custom Integrated Circuits Conference, pp. 171–174, 1999.
71 - N. Kapre, N. Mehta, M. deLorimier, and R. Rubin, “Packet switched vs. time multiplexed FPGA overlay networks”, in IEEE Symposium on Field‐Programmable Custom Computing Machines (FCCM 2006), 24–26 April 2006.
72 - I. Kuon, R. Tessier, and J. Rose, “FPGA architecture: Survey and challenges”, Foundations and Trends in Electronic Design Automation, vol. 2, no. 2, pp. 135–253, 2007.
73 - R. Seelam, “I/O design flexibility with the FPGA mezzanine card (FMC)”, White Paper, WP315 (v1.0), pp. 1–7, 19 August 2009.
74 - C.C. Tsang and H.K.‐H. So, “Reducing dynamic power consumption in FPGAs using precomputation”, Proceedings of International Conference on Field Programmable Technology (FPT 2009), December 2009.
75 - J. Lamoureux, G. Lemieux, and S. Wilton, “GlitchLess: Dynamic power minimization in FPGAs through edge alignment and glitch filtering”, IEEE Transactions on Very Large Scale Integrated Systems, vol. 16, no. 11, pp. 1521–1534, 2008.
76 - C. Ravishankar, J.H. Anderson, and A. Kennings, “FPGA power reduction by guarded evaluation considering logic architecture”, IEEE Transactions on Computer‐Aided Design of Integrated Circuits and Systems, vol. 31, no. 9, pp. 1305–1318, 2012.
77 - K. Subraniyam, “Proven power reduction with Xilinx ultrascale FPGAs”, White Paper, WP466, vol. 1.1, pp. 1–13, 15 October 2015.
78 - C.Q. Tran, H. Kawaguchi, and T. Sakurai, “More than two orders of magnitude leakage current reduction in look‐up table for FPGA's”, IEEE International Symposium on Circuits and Systems, vol. 5, pp. 4701–4704, 23–26 May 2005.
79 - T. Tuan and B. Lai, “Leakage power analysis of a 90 nm FPGA”, in IEEE Custom Integrated Circuits Conference, pp. 57–60, San Jose, CA, 2003.
80 - D. Curd, “Power consumption in 65nm FPGAs”, White Paper: Virtex‐5 FPGAs, WP 246, vol. 1.2, pp. 1–12, February 2007.
81 - V. Rozic, W. Dehaene, and I. Verbaushede, “Design solutions for securing SRAM cell against power analysis”, in Symposium on Hardware‐Oriented Security and Trust (HOST), pp. 122–127, 3–4 June 2012.
82 - J. Lach, J. Brandon, and K. Skadron, “A general post‐processing approach to leakage current reduction in SRAM‐based FPGAs”, in Proceedings of the IEEE International Conference on Computer Design (ICCD'04), pp. 144–150, 11–13 October 2004.
83 - M. Qazi, M.E. Sinangil, and A.P. Chandrakasan, “Challenges and directions for low‐voltage SRAM”, in IEEE Design & Test of Computers, pp. 32–43, January/February 2011.
84 - Z. Liu and V. Kursun, “Characterization of a novel nine‐transistor SRAM cell”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 16, no. 4, pp. 488–492, 2008.
85 - K. Blomster and J.G. Delgado‐Frias, “Reducing power and delay in memory cells using virtual source transistors,” in 48th IEEE International Midwest Symposium on Circuits and Systems, pp. 299–302, 2005.
86 - T. Tuan, S. Kao, A. Rahman, S. Das, and S. Trimberger, “A 90 nm low‐power FPGA for battery‐powered applications”, 14th International Conference on Field Programmable Gate Arrays, FPGA'06 Proceedings, pp. 3–11, 2006.
87 - B. Wang, T.Q. Nguyen, A.T. Do, J. Zhou, M. Je, and T. Tae‐Hyoung Kim, “Design of an Ultra‐low Voltage 9T SRAM With Equalized Bitline Leakage and CAM‐Assisted Energy Efficiency Improvement”, IEEE Transactions on Circuits and Systems—I: Regular Papers, vol. 62, no. 2, pp. 441–448, 2015.
88 - L. Zhang, C.‐H. Chang, Z.H. Kong, and C.Q. Liu, “Statistical analysis and design of 6T SRAM cell for physical unclonable function with dual application modes”, in IEEE International Symposium on Circuits and Systems (ISCAS), Lisbon, pp. 1410–1413, 24–27 May 2015.
89 - B.H. Calhoun and A.P. Chandrakasan, “Static noise margin variation for sub‐threshold SRAM in 65‐nm CMOS”, IEEE Journal of Solid‐State Circuits, vol. 41, no. 7, pp. 1673–1679, 2006.
90 - B.H. Calhoun and A.P. Chandrakasan, “A 256‐kb 65‐nm sub‐threshold SRAM DESIGN for ultra‐low‐voltage operation”, IEEE Journal of Solid‐State Circuits, vol. 42, no. 3, pp. 680–688, 2007.
91 - M.‐T. Chang and W. Hwang, “A fully‐differential subthreshold SRAM cell with auto‐compensation,” in IEEE Asia Pacific Conference on Circuits and Systems, APCCAS, Macao, pp. 1771–1774, 30 Nov–3 Dec. 2008.
92 - M.‐H. Tu, J.‐Y. Lin, M.‐C. Tsai, S.‐J. Jou, and C.‐T. Chuang, “Single‐ended subthreshold SRAM with asymmetrical write/read‐assist”, IEEE Transactions on Circuits and Systems‐I: Regular Papers, vol. 57, no. 12, pp. 3039–3047, 2010.
93 - L. Ming, C. Hong, L. Changmeng, and W. Zhihua, “An ultra‐low‐power 1 kb sub‐threshold SRAM in the 180 nm CMOS process”, Journal of Semiconductors, vol. 31, no. 6, pp. 065013‐1–065013‐4, 2010.
94 - B. Amelifard, F. Fallah, and M. Pedram, “Reducing the sub‐threshold and gate‐tunneling leakage of SRAM cells using dual‐Vt and dual‐Tox assignment”, in Proceedings of DATE, pp. 1–6, 2006.
95 - A. Teman, A. Mordakhay, J. Mezhibovsky, and A. Fish, “A 40 nm sub‐threshold 5T SRAM bit cell with improved read and write stability”, IEEE Transactions on Circuits and Systems II: Express Briefs, vol. 59, no. 12, pp. 873–877, 2012.
96 - C. Jiangzheng, Z. Sumin, Y. Jia, S. Xinchao, C. Liming, and H. Yong, “A 320 mV, 6 kb subthreshold 10T SRAM employing voltage lowering techniques”, Journal of Semiconductors, vol. 36, no. 6, pp. 065007‐1–065007‐6, 2015.
97 - C.B. Kushwah and S.K. Vishvakarma, “A single‐ended with dynamic feedback control 8T subthreshold SRAM cell”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 24, no. 1, pp. 373–377, 2016.
98 - M.‐F. Chang, J.‐J. Wu, K.‐T. Chen, and H. Yamauchi, “A differential data aware power‐supplied (D2AP) 8T SRAM cell with expanded write/read stabilities for lower VDDmin applications”, Symposium on VLSI Circuits, Kyoto, Japan, pp.156–157, 16–18 June 2009.
99 - N. Gong, S. Jiang, A. Challapalli, S. Fernandes, and R. Sridhar, “Ultra‐low voltage split‐data‐aware embedded SRAM for mobile video applications”, IEEE Transactions on Circuits and Systems‐II: Express Briefs, vol. 59, no. 12, pp. 883–887, 2012.
100 - C.M.R. Prabhu and A.K. Singh, “Novel eight transistor SRAM cell for write power consumption,” IEICE Electronics Express (ELEX), vol. 7, no. 16, pp. 1175–1181, 2010.
101 - C.M.R. Prabhu and A.K. Singh, “Low‐power fast (LPF) SRAM cell for write/read operation,” IEICE Electronics Express, vol. 6, no. 18, pp. 1473–1478, 2011.
102 - Y.‐W. Lin, H.‐I. Yang, M.‐C. Hsia, Y.‐W. Lin, C.‐H. Chen, C.‐T. Chuang, W. Hwang, N.‐C. Lien, K.‐D. Lee, W.‐C. Shih, Y.‐P. Wu, W.‐T. Lee, and C.‐C. Hsu, “A 55nm 0.5V 128Kb cross‐point 8T SRAM with data‐aware dynamic supply write‐assist”, in Proceedings of IEEE International SoC Conference (SOCC), pp. 218–223, 12–14 September 2012.
103 - Y.‐W. Chiu, J.‐Y. Lin, M.‐H. Tu, S.‐J. Jou, and C.‐T. Chuang, “8T single‐ended sub‐threshold SRAM with cross‐point data‐aware write operation,” in Proceedings of IEEE ISLPED, August 2011.
104 - M.‐F. Chang, S.‐W. Chang, P.‐W. Chou, and W.‐C. Wu, “A 130 mV SRAM with expanded write and read margins for subthreshold applications”, IEEE Journal of Solid‐State Circuits, vol. 46, no. 2, pp. 520–529, 2011.
105 - A.K. Singh, M.M. Seong, and C.M.R. Prabhu, “A data aware (DA) 9T SRAM cell for low power consumption and improved stability”, International Journal of Circuit Theory and Applications, vol. 42, no. 9, pp. 956–966, September 2014.
106 - L. Wen, X. Cheng, K. Zhou, S. Tian, and X. Zeng, “Bit‐interleaving‐enabled 8T SRAM with shared data‐aware write and reference‐based sense amplifier”, IEEE Transactions on Circuits and Systems—II: Express Briefs, vol. 63, no. 7, pp. 643–647, 2016.
107 - A.K. Singh, M.‐S. Saadatzi, and C. Venkataseshaiah, “Design of a single‐ended energy efficient data‐dependent‐write‐assist dynamic (DDWAD) SRAM cell for improved stability and reliability”, accepted for publication in Analog Integrated Circuits and Signal Processing, 2016.
108 - A.K. Singh, M.‐S. Saadatzi, and C. Venkataseshaiah, “Design of peripheral circuits for the implementation of memory array using data‐aware (DA) SRAM cell in 65 nm CMOS technology for low power consumption”, Journal of Low Power Electronics, vol. 12, pp. 1–12, 2016.
109 - A.K. Singh, M.M. Seong, and C.M.R. Prabhu, “Low power and high performance single‐ended sense amplifier”, Journal of Circuits, Systems, and Computers (Published by World Scientific), vol. 22, no. 7, pp. 1350062‐1–1350062‐12, 2013.
110 - L. Jiang, W. Xueqiang, W. Qin, W. Dong, Z. Zhigang, P. Liyang, and L. Ming, “A low voltage sense amplifier for high‐performance embedded flash memory”, Journal of Semiconductors, vol. 31, pp. 1–5, 2010.
111 - H.‐I. Yang, M.‐H. Chang, S.‐Y. Lai, H.S.‐F. Wang, and W. Hwang, “A low‐power low swing single‐ended multi‐port SRAM”, in International Symposium on VLSI Design, Automation and Test 2007, VLSI‐DAT 2007, Hsinchu, pp. 1–4, May 2007.
112 - A. Teman, L. Pergament, O. Cohen, and A. Fish, “A 250 mV 8 kb 40 nm ultra‐low power 9T supply feedback SRAM (SF‐SRAM)”, IEEE Journal of Solid‐State Circuits, vol. 46, no. 11, pp. 2713–2726, October 2011.