Synthesis and VHDL Implementation of Fuzzy Logic Controller for Dynamic Voltage and Frequency Scaling (DVFS) Goals in Digital Processors

Introduction
Power consumption is becoming the primary concern in modern high-performance processors, digital circuits and systems on chip (SoCs). As CMOS technology has scaled towards smaller feature sizes, the performance of digital systems has increased exponentially along with the clock frequency. In addition, the computational workload, and hence the activity, of a digital circuit may change substantially over time, which creates many opportunities for adaptive low-power methodologies. Dynamic voltage and frequency scaling (DVFS) is a popular system-level power management technique that dynamically scales the supply voltage and clock frequency of a device (Rabaey, 2010). A DVFS system can be considered a closed-loop control system: the supply voltage and operating speed are adjusted according to the observed workload. Because the supply voltage cannot change instantaneously, owing to the delay introduced by the large capacitance on the supply rails, the main challenge in designing such a system is to measure and predict the processor workload accurately enough to change the supply voltage correctly. The efficiency of DVFS strongly depends on the accuracy of the workload estimation, and misestimation can substantially reduce the effectiveness of such closed-loop systems.
Many existing workload estimators are based on conventional PID controllers and their variants, e.g. PI or I controllers. These configurations need offline profiling to tune the controller coefficients so that the controller can track or predict the workload variations. Whenever the shape of the workload variations changes, the offline tuning has to be repeated and the coefficients determined again, so these controllers cannot be considered general solutions for adjusting voltage and frequency. Other estimation methods, e.g. adaptive filters, have been proposed for predicting the workload and controlling the power and energy savings.
Unfortunately, most of them need offline profiling and/or are limited to specific periodic workload variations.
In this chapter, we discuss an on-line adaptive fuzzy logic controller for DVFS that is able to predict and track the workload variations accurately and robustly, even when those variations are highly nonstationary or soft. Furthermore, we describe in detail how the controller can be built in VHDL and used as the power management controller unit. We propose a new method for the defuzzification part of the fuzzy controller that makes the circuit faster. The fuzzy controller is applicable to different kinds of workload variations with regard to real-time constraints, and it can adaptively change the supply voltage and frequency of a processor. The proposed controller can easily be upgraded by adding new rules or new features to improve performance. The practical limitations and real-time constraints of designing the fuzzy logic controller as a DVFS method are discussed as part of the design procedure.

Related work on power management techniques
A lot of research has been done so far to explore different approaches to DVFS. Many of the previous works can be categorized as task-level algorithms that use offline profiling to obtain the average-case execution time (ACET) or worst-case execution time (WCET) as a model for the workload of a given application. For example, one of the earliest works was presented in (Yao et al., 1995), where the arrival times, the deadlines of the workloads and the task execution times in CPU cycles are assumed to be given to the designer as constants. Two more examples are (Im et al., 2006), which proposes a technique that uses buffers to reduce the energy consumption based on a WCET workload model, and (Jejurikar & Gupta, 2006), which proposes a dynamic voltage scaling (DVS) method in the presence of task synchronization, based on a WCET workload model in a multiprocessor environment. Approaches of this kind cannot deal with time-varying workloads, especially when the workload shows large, nonstationary variations. Another category of DVFS research comprises techniques that require either application or compiler support (Azevedo et al., 2002; Yang et al., 2001; Chung et al., 2002). The lack of generality and the need for offline profiling for different workload variations remain major drawbacks of these classes of work.
Adaptive approaches to DVFS save more power and energy than conventional techniques. Closed-loop system architectures were first proposed with the self-timed adaptive supply-voltage scaling for asynchronous circuits in (Nielsen et al., 1994), where first-in first-out (FIFO) buffers are used at both the inputs and the outputs of the processor. The FIFO buffers average the computational workload so that the supply voltage and frequency can be adjusted. The feedback is based on the actual path delays of the circuit, and the feedback signal controls the DC-DC converter based on the information derived from the FIFOs. Subsequently, other researchers used a similar configuration to adapt the supply voltage and lower the power consumption of digital signal processors (DSPs) (Gutnik & Chandrakasan, 1997). The system architecture proposed in (Gutnik & Chandrakasan, 1997) has the same structure as that of (Nielsen et al., 1994), but it targets synchronous designs and also takes variations in the computational workload into account. As in the self-timed variable-voltage system of (Nielsen et al., 1994), input data is buffered in a FIFO to enable averaging of the workload. The control loop then controls the processing rate to avoid overflow and underflow of the FIFOs. The controller in this methodology consists of a voltage regulator, a ring oscillator, a rate-compare block and a programmable look-up table (LUT). The controller block decides whether to change the voltage and frequency based on the processing rate and the existing LUT. The disadvantage of this configuration is that it comes with extra latency, as buffer utilization is only one measure of workload.
As closed-loop control configurations evolved, workload estimation and adaptive control became the main challenges in designing efficient DVFS. These techniques aim at estimating the time-varying workload using adaptive filters, most of which are based on a conventional proportional, integral and derivative (PID) controller. One well-known PID-based approach is presented in (Wei & Horowitz, 2003), where voltage samples are used to control a VCO whose frequency serves as the feedback signal for the buck converter. The reference signal and the feedback signal enter the controller as variable-frequency clocks; both feed into counters, and the number of transitions is counted over a fixed period of time. Based on the error between its inputs, a PID controller decides whether to change the voltage and frequency of the circuit. Examples of using a PID controller to estimate workload variations are proposed in (Hughes & Adve, 2003; Gu & Chakraborty, 2008; Wu et al., 2005; Lu et al., 2002, 2003). In (Hughes & Adve, 2003), the PID controller is used to estimate the frame decoding time in multimedia applications, and in (Gu & Chakraborty, 2008) it is used for 3-D games. In (Wu et al., 2005) and (Lu et al., 2003), a PI controller, which is a variant of the PID method, is applied to estimate the buffer occupancy for DVS in data-buffered systems. In (Lu et al., 2002), an integral controller, which is also a variant of the PID method, is designed and used to estimate the workload for DVS. Although the PID controller is an adaptive filter, it may overshoot or undershoot depending on the selected coefficients. Moreover, a PID estimator is useful only when the designer selects the coefficients for specific workload variations; if the shape of the workload changes, the coefficients have to be determined again for the new workload variations.
Hence, the tuning of the coefficients critically determines the prediction accuracy.
Besides PID-based controllers, other estimators and adaptive filters have been proposed to forecast workload variations in order to adjust the voltage and/or frequency. In (Sinha & Chandrakasan, 2001), an adaptive approach to dynamic voltage scheduling on processors is presented, based on workload prediction by filtering a trace history. That work examines several conventional filters and evaluates their accuracy in terms of the power saved. It concludes that adaptive LMS filtering is the most powerful of them and can be used to predict workload variations. In (Bang et al., 2009), a Kalman-filter-based on-line estimator is proposed to predict and track the workload variation; it is applicable to periodic applications with soft real-time constraints.
Compared to previous approaches, the fuzzy logic controller is similar to adaptive filters in that it estimates the workload variations. However, the fuzzy logic controller can work as an on-line methodology without updating any parameters during run-time adaptation and without any other information about the nature of the workload variation. The controller can estimate and track any kind of workload variation accurately, and it does not require any coefficient tuning through offline profiling.

Principles of Dynamic Voltage and Frequency Scaling (DVFS)
The most important key to saving power and energy in a digital circuit or processor is to reduce the supply voltage and clock frequency according to the performance requirements. The power consumption of a clocked digital CMOS circuit is given by the well-known formula:

$$P = \alpha C \, V_{DD}^{2} \, f + V_{DD} \, I_{off} \tag{1}$$

where $\alpha C$ is the total switched capacitance, $V_{DD}$ is the supply voltage, $f$ is the clock frequency and $I_{off}$ is the off-state current of the circuit. By reducing the supply voltage and clock frequency, considerable power can be saved, while $\alpha C$ is generally fixed for a specific application. Over the years, researchers have proposed different adaptive hardware power management infrastructures for building low-power system-on-chip (SoC) integrated circuits. Among all of these methods, DVFS is the most effective at reducing the power consumption of processors. Conceptually, the online DVFS problem for a digital CMOS circuit, e.g. a processor, is to scale the voltage and frequency according to the varying performance demands. The general block diagram of a dynamic supply voltage and frequency scaled system is shown in Fig. 1 (Nielsen et al., 1994; Gutnik & Chandrakasan, 1997).
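To make the scaling behaviour of the power formula concrete, the following Python sketch evaluates it at two operating points, assuming the usual form of dynamic power (switched capacitance times supply voltage squared times frequency) plus static power (supply voltage times off-state current). All numeric values are hypothetical and chosen only for illustration; none of them come from the chapter.

```python
def cmos_power(alpha_c, vdd, freq, i_off):
    """Total power in watts: dynamic switching term plus static leakage term."""
    dynamic = alpha_c * vdd ** 2 * freq
    static = vdd * i_off
    return dynamic + static


# Hypothetical values, for illustration only.
ALPHA_C = 1e-9   # total switched capacitance in farads
I_OFF = 1e-3     # off-state current in amperes

p_nominal = cmos_power(ALPHA_C, vdd=1.2, freq=200e6, i_off=I_OFF)
p_scaled = cmos_power(ALPHA_C, vdd=0.9, freq=100e6, i_off=I_OFF)
saving = 1.0 - p_scaled / p_nominal

print(f"nominal: {p_nominal * 1e3:.1f} mW")
print(f"scaled:  {p_scaled * 1e3:.1f} mW")
print(f"saving:  {saving:.0%}")
```

Because the supply voltage enters the dynamic term quadratically, lowering the voltage together with the frequency saves considerably more than lowering the frequency alone.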

Fig. 1. General block diagram of a DVFS system
This block diagram has three main components. The first component is a performance sensor that monitors the main operating parameters of the processor, e.g. the average supply current, the temperature and the supply voltage variations. The second component is the controller. It compares the data received from the sensors with the reference performance received from the power management unit or software, and decides whether the supply voltage has to change. The third component comprises the actuators: a supply voltage actuator, which can be on-chip or off-chip (e.g. a DC-DC converter), and a clock frequency actuator, which can be a PLL. Since reducing the supply voltage increases the delay of the circuits, controlling the voltage and frequency of a processor depends strongly on the accuracy of the controller.

Since there is a strong correlation between the supply current and the workload of a processor (Benini et al., 1999), the controller is designed to observe and track the variations of the average supply current. The goal is to predict and track the supply current of the processor and to drive the processor at the lowest possible voltage, and the corresponding minimum frequency, at which a specific application can still meet all of its deadlines under its timing constraints. If the supply current is tracked properly, the supply voltage and clock frequency of the processor can be adjusted according to the predicted current signal: the supply voltage follows the variations of the predicted supply current. The clock frequency at each control step can be determined from a look-up table that corresponds to the delay-voltage model. The delay of a CMOS gate can be modeled as

$$t_d = k \, \frac{V_{DD}}{(V_{DD} - V_{th})^{\beta}} \tag{2}$$

where $k$ and $\beta$ are technological parameters and $V_{th}$ is the device threshold voltage. The cycle time of a design is modeled as a function of the critical path delay, given as

$$T_{cycle} = L_d \, t_d \tag{3}$$

where $L_d$ is the logic depth, in number of (equivalent) gates, of the critical path. Therefore, the clock frequency that satisfies all timing deadlines of the circuit can be determined as

$$f = \frac{1}{L_d \, t_d} = \frac{(V_{DD} - V_{th})^{\beta}}{L_d \, k \, V_{DD}} \tag{4}$$

The relation between the normalized operating frequency and the normalized supply voltage of a sample CMOS digital circuit is shown in Fig. 2. As mentioned before, the processor clock frequency can be changed by the PLL available in the circuit. A PLL can only provide a limited set of clock frequencies; suppose, for instance, that a sample PLL provides six different clock frequencies, as shown in Fig. 2. Imagine that a specific application is running at a constant frequency at the nominal supply voltage, without any voltage scaling, and that the supply current indicates an opportunity to save power by reducing the supply voltage.
When the supply voltage is reduced to a point between the voltages that support two adjacent PLL frequencies (see Fig. 2), the operating frequency has to drop to the lower of the two. If the supply voltage is lowered further, into the next interval, the frequency switches to the next lower value. In this way, adjusting the supply voltage to the lowest allowable value together with frequency scaling ensures that the application executes properly and that the maximum possible power is saved. Switching the supply voltage between the possible values requires voltage actuators such as on-chip or off-chip DC-DC converters. In most DC-DC converters used as voltage regulators, switching between output voltage levels takes a few tens of microseconds. For safe voltage and frequency switching, the voltage and the clock frequency must not be changed in parallel. When the supply current is going to decrease, the frequency should be decreased first and the voltage lowered to the appropriate value afterwards. Conversely, when the supply current is going to increase, the voltage has to be increased first, followed by the frequency update. This ensures that the supply voltage of the processor is never lower than the minimum required for the current operating frequency, and it avoids data corruption due to circuit failure.
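The frequency selection and the safe switching order described above can be sketched as follows. This is a minimal Python model under stated assumptions: an alpha-power-law delay model in which the gate delay equals `k * vdd / (vdd - vth) ** beta`, and hypothetical values for `k`, `beta`, `vth`, the logic depth and the six PLL frequencies. None of these numbers come from the chapter.

```python
def max_frequency(vdd, vth, k, beta, logic_depth):
    """Highest clock frequency whose period still covers the critical path."""
    gate_delay = k * vdd / (vdd - vth) ** beta
    return 1.0 / (logic_depth * gate_delay)


def select_pll_frequency(vdd, pll_freqs, **model):
    """Fastest available PLL frequency that the given voltage can sustain."""
    f_max = max_frequency(vdd, **model)
    feasible = [f for f in pll_freqs if f <= f_max]
    return max(feasible) if feasible else None


def transition_order(v_now, v_next):
    """Order of the two updates so the voltage always supports the clock."""
    if v_next < v_now:
        return ("frequency", "voltage")  # slow down first, then lower the voltage
    return ("voltage", "frequency")      # raise the voltage first, then speed up


# Hypothetical technology and PLL parameters, for illustration only.
MODEL = dict(vth=0.3, k=20e-12, beta=1.5, logic_depth=50)
PLL_FREQS = [200e6, 300e6, 400e6, 500e6, 600e6, 700e6]

for vdd in (1.2, 0.9, 0.7):
    f = select_pll_frequency(vdd, PLL_FREQS, **MODEL)
    print(f"VDD = {vdd:.1f} V -> {f / 1e6:.0f} MHz")
```

In hardware, `select_pll_frequency` would typically be precomputed into the look-up table mentioned in the text, so that no arithmetic is needed at run time.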

DVFS based on fuzzy logic controller
The block diagram of the proposed dynamic voltage and frequency scaling configuration is shown in Fig. 3 (Pourshaghaghi & Pineda de Gyvez, 2009). In this block diagram, the supply current and the derivative of the supply current are observed as the two inputs of the fuzzy logic block. The derivative of the supply current is used because it helps to predict the variations of the workload. If the variations of the supply current can be predicted, the actuators can act sooner. Consequently, the amount of saved power can increase significantly while the executing task still finishes on time. For a given value of the supply current, a positive derivative implies that the supply current is increasing; otherwise, the supply current is decreasing. The fuzzy if-then rules should be defined to follow this concept. For more precision in predicting the supply current variations, it is also possible to compute the second derivative of the current. Thus, the fuzzy logic block receives two inputs: the supply current and its derivative. Based on these two inputs, the fuzzy logic block, acting as an expert system, decides on the voltage and frequency of the processor. In effect, the fuzzy logic controller tracks the supply current to decide on the new voltage of the digital circuit. The supply voltage actuator can be an on-chip or off-chip DC-DC converter. The same procedure can be followed to determine the frequency of the processor. When deciding on the final frequency value, however, it should be taken into account that the frequency obtained from the fuzzy logic controller has to be greater than the frequency obtained from the worst-case execution time. The frequency can also be defined by a suitable predefined look-up table.
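As a small illustration of the two controller inputs, the derivative of a sampled supply current can be approximated by a backward difference. The sketch below is a Python model under that assumption; the function names and the sample trace are hypothetical.

```python
def current_derivative(samples, dt):
    """Backward difference of consecutive supply-current samples."""
    return [(curr - prev) / dt for prev, curr in zip(samples, samples[1:])]


def controller_inputs(samples, dt):
    """Pair each sample (from the second one on) with its derivative estimate."""
    return list(zip(samples[1:], current_derivative(samples, dt)))


# Hypothetical supply-current trace in mA, sampled once per control period.
trace_ma = [40.0, 44.0, 44.0, 38.0]
inputs = controller_inputs(trace_ma, dt=1.0)
for current, derivative in inputs:
    trend = "rising" if derivative > 0 else "falling" if derivative < 0 else "steady"
    print(f"I = {current:.1f} mA, dI/dt = {derivative:+.1f} mA/period ({trend})")
```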
Based on a number of experiments, the proposed internal structure of the fuzzy controller has membership functions and fuzzy rules like the ones shown in Fig. 4. In this structure, if n membership functions are defined for the supply current and 3 membership functions are defined for its derivative, then n × 3 rules define the fuzzy logic rule base. The rules should be defined so that the supply voltage tracks the variations of the supply current. The proposed controller therefore first predicts the supply current variations and then decides how to change the voltage and frequency pair. Using fuzzy sets, the fuzzy inference system (FIS) formulates the process of obtaining the output from the defined input membership functions and the fuzzy if-then rules. The Mamdani FIS is the most commonly used methodology for applying fuzzy logic controllers to practical systems, and we recommend using it for DVFS (Lee, 1999).
Several experiments have been conducted to evaluate different aspects of the controller. In the first simulation, we designed a controller and applied it to a realistic sampled supply current of a processor. This supply current is shown in Fig. (5.a) and its derivative is shown in Fig. (5.b). Based on the internal fuzzy system structure described in Fig. 4, we considered nine triangular membership functions for the supply current. Without loss of generality, these functions are defined over a fixed range with symmetrical shapes and equal widths, and each supply current membership function overlaps with its neighbouring membership function(s). We considered three triangular membership functions for the derivative of the supply current. Consequently, we defined 27 if-then rules based on the rules shown in Fig. 4. Nine symmetrical triangular membership functions were considered for the supply voltage as well.
These membership functions also overlap with each other and have equal widths. We used the centre of area as the defuzzification method. The result of this simulation is shown in Fig. (5.c). As the supply voltage values show, the fuzzy logic controller tracks the variation of the supply current very well. The output surface of the fuzzy logic controller is shown in Fig. (5.d). This figure displays the entire span of the supply voltage over the entire span of the supply current and its derivative, and it shows the pseudo-continuity of the output voltage with the variations of the workload.
We also simulated a PID controller on another supply current signal and compared the results with those of the fuzzy controller. Suppose that the supply current signal is like the one shown in Fig. (6.a). We tuned the PID controller through simulation to find the best coefficients for its proportional, integral and derivative parts. By trial and error, we found a set of coefficients with which the controller tracks the supply current very well. The tracking result is shown in Fig. (6.b). But when the shape of the supply current changed to something like the supply current shown in Fig. (6.c), the PID controller could not track the variations with the same coefficients, and the coefficients had to be changed again. The output of the PID block in this second experiment is shown in Fig. (6.d). It is also important to mention that the fuzzy logic controller works well regardless of the system's inputs, whereas the PID controller requires a mathematical formulation of the system to adapt its coefficients before it can work properly. One of the main advantages of the fuzzy logic controller is that its hardware implementation is easy, because everything is digital. Another advantage is that the controller can work on-line to track all workload circumstances at high speed and with less error than traditional control methods.

VHDL implementation of the fuzzy logic controller
The general architecture of the fuzzy logic controller that tracks the supply current variations of a processor is shown in Fig. 7, where the information flows from left to right. The fuzzy logic controller is designed based on the Mamdani fuzzy inference system (FIS). The first step in implementing the controller as a digital circuit is to convert the analogue input values, the supply current and its derivative, to digital ones. For this purpose, an analogue-to-digital converter (A/D) is needed to digitize the crisp input values. The resolution of the selected A/D depends on the desired accuracy of the supply current, the derivative of the supply current and the supply voltage data. For example, suppose that the supply current of a processor varies between 0 mA and 100 mA and that an 8-bit A/D has been selected. In this case, the resolution of the supply current samples is as follows:

$$\Delta I = \frac{100\ \text{mA}}{2^{8}} \approx 0.39\ \text{mA}$$

Similarly, if a voltage actuator, e.g. a DC-DC converter, has been selected to regulate the processor's voltage between 0.7 V and 1.2 V, the output supply voltage produced by the fuzzy logic controller has steps of 1.95 mV. In this section, we design the controller in VHDL based on an 8-bit resolution for the digital values; without loss of generality, the design can be extended to other resolutions.
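The resolution figures above can be checked with a few lines of Python. This is a minimal sketch that assumes the full-scale range is divided into 2^8 equal steps; the helper names are hypothetical.

```python
def adc_step(lo, hi, bits):
    """Size of one quantization step when [lo, hi] is split into 2**bits steps."""
    return (hi - lo) / (1 << bits)


def quantize(value, lo, hi, bits):
    """Map an analogue value to an unsigned code, saturating at both ends."""
    code = int((value - lo) / adc_step(lo, hi, bits))
    return max(0, min((1 << bits) - 1, code))


def code_to_voltage(code, v_lo=0.7, v_hi=1.2, bits=8):
    """Supply voltage that corresponds to an 8-bit controller output code."""
    return v_lo + code * adc_step(v_lo, v_hi, bits)


print(f"current step: {adc_step(0.0, 100.0, 8):.3f} mA")
print(f"voltage step: {adc_step(0.7, 1.2, 8) * 1e3:.2f} mV")
print(f"50 mA -> code {quantize(50.0, 0.0, 100.0, 8)}")
```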

Implementation of the fuzzification stage
After digitizing the crisp input values, the first step is to define the membership functions for the current, the derivative of the current and the supply voltage. We consider nine membership functions for the supply current variations, three membership functions for its derivative and nine membership functions for the output supply voltage. These numbers were obtained by running different experiments and evaluating the accuracy of the controller with different supply current signatures. The functions are triangular, like the ones presented in Fig. 4, and are combined with the corresponding table of fuzzy if-then rules. We start by designing the membership functions (MFs) of the supply current. Since we use an 8-bit resolution for the A/D, the input range of the current maps to values between 0 ($00) and 255 ($FF). Consider the MFs of the current defined as shown in Fig. 8. In this figure, the Y axis shows the degree of membership as a value between 0 and 1, and the X axis shows the universe of discourse of the supply current. All of these parameters need to be mapped to values between 0 and 255. Each MF in Fig. 8 is represented by four parameters: point1 (P1), the positive slope value, point2 (P2) and the negative slope value.

For each input current value, the degree of membership function (dmf) depends on the location of the current value relative to these four parameters. The pseudo code that calculates the degree of membership function for a specific input current value is presented in Algorithm 1. This pseudo code assumes that the slopes of all triangular membership functions have the value 8. With this assumption, no multipliers are needed in the circuit to calculate the degrees of membership, which increases the speed of the circuit.
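The calculation described above can be modelled in a few lines of Python. This is a behavioural sketch, not the chapter's Algorithm 1 or its VHDL. It assumes that P1 and P2 are the left and right feet of the triangle on the 0 to 255 axis and that the degree saturates at 255. Because the slope is fixed at 8, each multiplication reduces to a left shift by three bits. The function name and the example parameters are hypothetical.

```python
SLOPE_SHIFT = 3   # a slope of 8 is a left shift by 3 bits, so no multiplier is needed
DMF_MAX = 255     # full membership on the 8-bit scale


def degree_of_membership(x, p1, p2):
    """8-bit degree of membership of x in a triangle with feet p1 and p2."""
    if x < p1 or x > p2:
        return 0
    rising = (x - p1) << SLOPE_SHIFT    # left flank: (x - P1) * 8
    falling = (p2 - x) << SLOPE_SHIFT   # right flank: (P2 - x) * 8
    return min(rising, falling, DMF_MAX)


# Hypothetical membership function with feet at 32 and 96 (peak at 64).
for x in (20, 48, 64, 90, 96):
    print(x, degree_of_membership(x, 32, 96))
```

With a slope of 8 the degree rises from 0 to its maximum within 32 input codes, so a function with feet 64 codes apart forms a symmetric triangle.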

Rule evaluation: Implementation of fuzzy inference system
Considering Fig. 7, and since there are 9 MFs for the current and 3 MFs for its derivative, 27 fuzzy if-then rules are defined to calculate the corresponding fuzzy supply voltage output values. The Mamdani FIS is used to evaluate the fuzzy if-then rules. To design the Mamdani FIS in VHDL, consider the first fuzzy if-then rule defined in Fig. 7: IF the supply current belongs to I(1) AND the derivative of the current belongs to P (Positive) THEN the voltage is VDD(2).
For this rule, the AND operator is applied to obtain a single value from the two degrees of membership (of the current and of its derivative); this value is the result of the antecedent part of rule 1 and acts as a weighting factor for this specific rule. In the Mamdani FIS, the AND operator is a fuzzy operator that finds the minimum of the two degrees of membership of the current and its derivative. The minimum function in Algorithm 2 is used to implement the fuzzy AND operator in VHDL. In Fig. 7, this stage is called the product layer, which is part of the min-max Mamdani FIS. Note: in this function, a represents dmf_I (the degree of membership function of the current) and b represents dmf_di.
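As a behavioural illustration of the fuzzy AND operator (a Python sketch, not the chapter's Algorithm 2), the minimum of two 8-bit degrees can be written with a single comparison, which in hardware corresponds to a comparator and a multiplexer. The numeric degrees below are hypothetical.

```python
def fuzzy_and(a, b):
    """Mamdani AND: the smaller of two 8-bit degrees of membership."""
    return a if a < b else b


# Rule 1: IF current is I(1) AND derivative is P THEN voltage is VDD(2).
dmf_i1 = 200     # hypothetical degree of the current in I(1)
dmf_di_p = 96    # hypothetical degree of the derivative in P
rule1_strength = fuzzy_and(dmf_i1, dmf_di_p)
print("firing strength of rule 1:", rule1_strength)
```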

The output value of the minimum function determines the degree of MF for Vdd(2) under this rule. Evaluating all rules in the same way yields the degrees of MF for Vdd(1) to Vdd(9). The fuzzy maximum operator models one fuzzy set from the maximum values returned by the output fuzzy set of each rule. In VHDL, one can use the maximum function presented in Algorithm 3. This function should be called for each of dmf_V(1), …, dmf_V(9) separately.
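The complete min-max evaluation can be sketched as follows. Since the rule table of Fig. 4 is not reproduced in the text, this Python model uses a hypothetical table in which a Negative, Zero or Positive derivative maps current set I(i) to voltage set V(i-1), V(i) or V(i+1), clamped to the range 1 to 9. That table is an assumption, although it agrees with the one rule quoted above, I(1) AND P gives VDD(2).

```python
N_I = 9   # membership functions of the supply current
N_V = 9   # membership functions of the supply voltage

# Hypothetical rule table: derivative label -> shift of the voltage set index.
SHIFT = {"N": -1, "Z": 0, "P": +1}


def rule_output(i, d):
    """Index (1..9) of the voltage set fired by current set I(i) and label d."""
    return max(1, min(N_V, i + SHIFT[d]))


def evaluate_rules(dmf_i, dmf_di):
    """Min-max inference over all 27 rules.

    dmf_i: list of 9 degrees for I(1)..I(9); dmf_di: dict of degrees for N, Z, P.
    Returns the list of 9 aggregated degrees dmf_V(1)..dmf_V(9).
    """
    dmf_v = [0] * N_V
    for i in range(1, N_I + 1):
        for d, degree_d in dmf_di.items():
            strength = min(dmf_i[i - 1], degree_d)       # fuzzy AND
            v = rule_output(i, d)
            dmf_v[v - 1] = max(dmf_v[v - 1], strength)   # fuzzy OR across rules
    return dmf_v


# Hypothetical input degrees: the current lies between I(3) and I(4).
dmf_v = evaluate_rules([0, 0, 200, 55, 0, 0, 0, 0, 0], {"N": 0, "Z": 100, "P": 155})
print(dmf_v)
```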

Implementation of the defuzzification stage
The last step is the defuzzification process, which converts the obtained fuzzy set into a single number, the output supply voltage. The aggregated output fuzzy set covers a range of output voltage values and has to be defuzzified to determine a single output supply voltage value. The centroid calculation is used as the defuzzification method; it computes the centre of the area under the curve of the output fuzzy set. The min-max FIS yields nine degrees of membership, one for each voltage set (dmf_V(1), …, dmf_V(9)). For each input value of the current and its derivative, at most 3 of these dmf values are nonzero. Suppose that the aggregated output fuzzy set is like the one shown in Fig. 9.
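For reference, a direct centroid computation can be modelled as below. This Python sketch assumes a discrete centre-of-area over the 0 to 255 output axis, output sets clipped at their rule strength (min) and aggregated by max, and nine hypothetical triangular sets with peaks every 32 codes. It shows the summation, multiplication and division that a direct hardware implementation would need.

```python
def triangle(x, centre, half_width):
    """Degree (0..1) of a symmetric triangular set at position x."""
    return max(0.0, 1.0 - abs(x - centre) / half_width)


def centroid(dmf_v, centres, half_width, lo=0, hi=255):
    """Centre of area of the max-aggregated, min-clipped output sets."""
    num = 0.0
    den = 0.0
    for x in range(lo, hi + 1):
        mu = max(min(d / 255.0, triangle(x, c, half_width))
                 for d, c in zip(dmf_v, centres))
        num += mu * x   # one multiplication per sample
        den += mu
    return num / den if den > 0 else None   # one division per output value


# Hypothetical output sets: peaks every 32 codes, each 64 codes wide.
CENTRES = [32 * i for i in range(9)]
HALF_WIDTH = 32

crisp = centroid([0, 0, 0, 100, 155, 0, 0, 0, 0], CENTRES, HALF_WIDTH)
print(f"crisp output code: {crisp:.1f}")
```

The crisp output lies between the peaks of the two active sets and is pulled towards the set with the larger degree.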
As eq. (5) shows, computing the output voltage value requires summation, multiplication and division. Since a divider block results in a circuit that occupies more area, we propose to use a look-up table (LUT) stored in the memory of the processor instead. This LUT needs to be filled in by the designer. With this approach, the data stored in the memory estimates the centre of gravity of the output fuzzy set obtained from the min-max Mamdani FIS. Here, we explain the required size of the memory and how to address and access the data in the LUT. If the centroid method is applied for defuzzification, the output voltage value is as follows:

$$V_{out} = \frac{\sum_{j} \mu(v_j)\, v_j}{\sum_{j} \mu(v_j)} \tag{5}$$

where $\mu(v_j)$ is the degree of membership of the aggregated output fuzzy set at the voltage value $v_j$. To construct the LUT, we use only the 3 most significant bits (MSBs) of each voltage degree of membership. Since at most three membership functions are involved in calculating the final crisp voltage value, $2^{3 \times 3} = 512$ words of memory are needed for the desired LUT. If the whole 8 bits of each voltage degree of membership were used, the number of words in the memory would grow to $2^{3 \times 8} = 2^{24}$. For now, assume that 3 MSBs are used for each degree of MF. Depending on the number of active voltage membership functions and the corresponding degrees of membership obtained from the Mamdani FIS, one can address the corresponding word in the memory and read the output voltage value stored in it. The VHDL algorithm that accesses the proper memory address in the defuzzification part of the designed fuzzy controller is shown in Algorithm 5. Each address of the memory has to be filled with a proper value that estimates the centre of gravity. We simulated all possible situations of the aggregated fuzzy output voltage sets in MATLAB and estimated the output voltages. We then stored all of the corresponding values in the 512-byte memory.
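The addressing idea can be illustrated with a short Python model. This is a sketch under several assumptions and is not the chapter's Algorithm 5: the three active degrees are taken in a fixed order, their 3 MSBs are concatenated into a 9-bit address, and each LUT word holds an offline-computed estimate of the centre of gravity relative to the first active set. The estimate used here, a weighted average of hypothetical peak positions, is a simplification of the MATLAB procedure described above.

```python
MSB_BITS = 3
N_ACTIVE = 3
LUT_WORDS = 1 << (MSB_BITS * N_ACTIVE)   # 2**9 = 512 words


def lut_address(d0, d1, d2):
    """Concatenate the 3 MSBs of three 8-bit degrees into a 9-bit address."""
    return ((d0 >> 5) << 6) | ((d1 >> 5) << 3) | (d2 >> 5)


def build_lut(peaks=(0, 32, 64)):
    """Offline fill of the table; the division happens here, not in hardware.

    peaks: hypothetical peak positions of the three active sets,
    relative to the first active set.
    """
    lut = []
    for addr in range(LUT_WORDS):
        weights = ((addr >> 6) & 7, (addr >> 3) & 7, addr & 7)
        total = sum(weights)
        if total == 0:
            lut.append(0)
        else:
            lut.append(round(sum(w * p for w, p in zip(weights, peaks)) / total))
    return lut


LUT = build_lut()
address = lut_address(128, 128, 0)
print("address:", address, "stored estimate:", LUT[address])
```

At run time the controller only forms the address and reads one word, so the summation, multiplication and division of the centroid formula are moved to design time.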

Synthesis results
We have implemented the proposed fuzzy logic controller in a CMOS 90nm technology and synthesized it with Cadence RC compiler to measure its power consumption and area. For benchmarking purposes, the circuit was synthesized at different speeds. The synthesis specifications are listed in Table 1. The synthesis results are shown in Fig. 10. Since the fuzzy logic controller is a digital controller, its circuit neither consumes much power nor occupies much area, as shown in Fig. 10.
The main difference between the proposed VHDL implementation of the fuzzy controller and previously implemented VHDL fuzzy controllers (Vuong et al., 2006; Vasantha et al., 2005; Sakthivel et al., 2010; Daijin, 2000) is the speed of the controller. The proposed implementation strategy uses no multiplier or divider circuits, and it assumes a fixed slope value for the membership functions. For these reasons, the circuit naturally works faster. Since we use memory to store the defuzzification data, it is worth mentioning that the power consumption of the proposed circuit is probably higher than that of previously reported ones. By way of example, we tested the fuzzy logic circuit with the supply current profile of a processor executing an MPEG2-decoding application. The output of the fuzzy logic circuit implemented in VHDL is shown in Fig. 11. The output signal of the fuzzy controller accurately tracks the supply current variations. This output signal can be used to scale and adjust the supply voltage of the processor according to the current variations for dynamic voltage scaling. Fig. 11 also presents the simulation result of the fuzzy controller implemented in MATLAB.

Conclusion
In this chapter, a dynamic fuzzy logic controller based on tracking supply-current variations was proposed for dynamic voltage and frequency scaling. In the proposed method, the fuzzy logic controller decides whether to change the supply voltage of the circuit under control by observing and predicting the supply-current variations. The simulation results showed the effectiveness of the proposed configuration in comparison to a PID controller. Furthermore, we described how to implement the proposed controller in VHDL, and a new method for implementing the defuzzification stage in VHDL was proposed. The fuzzy controller was synthesized in a CMOS 90nm technology using Cadence RC compiler, and the synthesis results were evaluated in terms of power consumption and area.

Acknowledgment
This work was supported by the Dutch Technical Science Foundation (STW), under the agreement 363120-427.