DESIGN OF STANDARD CELL ASICs USING SELF GATED RESONANT CLOCKED FLIP FLOP

Shilpa K.S1, Ajith Ravindran2 and Saranya P.M3
Department of Electronics and Communication Engineering, Saintgits College of Engineering, India

Abstract
Efforts to reduce the power consumption of digital CMOS circuits have been in progress for nearly three decades. As a result, a number of well understood and proven techniques for reducing dynamic and leakage power have been developed. These methods have been explored thoroughly at the circuit level. One example of a higher-level circuit is a standard cell Application Specific Integrated Circuit (ASIC). Reducing the power and delay of a standard cell ASIC improves the performance of systems designed with it. A major contributor to the total power in modern microprocessors is the clock distribution network, which can dissipate as much as 70% of the total power in high performance applications. The self-gated resonant (SGR) clocked flip-flop, optimized for power efficiency and signal integrity, achieves reduced dynamic power dissipation and exhibits a negative setup time, which makes the design more tolerant of clock skew. The negative setup time also reduces the D-Q delay, thus improving the timing performance of the flip-flop. In this work, the advantages of the SGR clocked flip-flop are applied to standard cell ASICs. Cadence EDA tools and 180nm process technology files have been used to substantiate the merits of the proposed design.

Keywords: Application Specific Integrated Circuit (ASIC); SGR Clocked Flip-Flop

1. INTRODUCTION

A number of well-established methods have been used to minimize static as well as dynamic power dissipation. Static power dissipation can be reduced by using stacked transistors, dual supply voltages, dual threshold voltages, and adaptive body biasing. Dynamic power can be reduced by lowering the supply voltage and the load capacitance. These methods have been explored thoroughly, since most of them are implemented at the logic/circuit level, so attention has to shift towards higher-level circuits. One example of a higher-level circuit is a standard cell Application Specific Integrated Circuit (ASIC). Reducing the power and delay of a standard cell ASIC improves the performance of systems designed with it. A major contributor to the total power in modern microprocessors is the clock distribution network, which can dissipate as much as 70% of the total power in high performance applications. The self-gated resonant (SGR) clocked flip-flop, optimized for power efficiency and signal integrity, achieves reduced dynamic power dissipation and exhibits a negative setup time, which makes the design more tolerant of clock skew. The negative setup time also reduces the D-Q delay, thus improving the timing performance of the flip-flop. In this work, the advantages of the SGR clocked flip-flop are applied to standard cell ASICs. Cadence EDA tools and 180nm process technology files have been used to substantiate the merits of the proposed design.

Standard cell design is a digital design methodology that makes it possible to focus on the high-level aspects of a digital design. A standard cell is usually designed once and inserted into a large library in which some logic functions have several implementations. The timing performance, power and area of each of these implementations are different. Standard cells allow designs to scale from simple single-function ICs to complex multi-million gate Systems on Chip. A technology library is a complete group of standard cell descriptions, and large systems are implemented using the standard cells in the library. By reducing the power consumption of a standard cell, the total power of the system can be reduced. In this paper, the design of a standard cell ASIC using the SGR flip-flop is described.

The paper covers the circuit architecture of the standard cell ASIC (Section 2), the design of the Self Gated Resonant clocked flip-flop (Section 3), its operation and the standard cell ASIC using SGR (Section 4), and the conclusion (Section 5).

2. STANDARD CELL ASIC

This section describes the architecture of a standard cell ASIC [1]. Because the standard cell design contains two cross-coupled NAND gates, it is also called a Pseudo NAND (PNAND) cell.

![Fig.1. Architecture of Standard Cell ASIC](image)

2.1 CELL OPERATION

The schematic diagram of a standard cell ASIC is shown in Fig.1. The cell consists of (A) two groups of parallel PFETs, usually denoted the Left Input Network (LIN) and the Right Input Network (RIN), and (B) a Sense Amplifier (SA) composed of two cross-coupled NAND gates, and a Set-Reset (SR) latch. The cell has two phases of operation: the reset phase (clk = 0) and the evaluation phase (clk 0 → 1) [1].

2.1.1 Reset Phase:

The nodes N5 and N6 are pulled low by the pull-down transistors M18 and M19 when clk = 0. This turns OFF transistors M5 and M6, which breaks all connections from N1 and N2 to ground. In addition, M7 and M8 are ON and pull nodes N1 and N2 high. Transistors M3 and M4 are also ON, since N1 and N2 are high. Nodes N1 and N2 are the inputs of the SR latch, whose state does not change while N1 and N2 are both high.

2.1.2 Evaluation Phase:

The cell enters the evaluation phase when clk makes a 0 → 1 transition. The inputs are applied such that l devices are active in the LIN and r devices are active in the RIN, with l ≠ r. Assume l > r, so that the conductance of the LIN is greater than that of the RIN. Transistors M18 and M19 are turned OFF, and nodes N5 and N6 are pulled towards high. Node N5 rises first because of the higher conductance of the LIN, which turns ON transistor M5. Since M3 is ON, node N1 starts discharging through M3 and M5. Because of the lower conductance of the RIN, node N6 charges after a small delay, which allows N1 to turn ON M2 and turn OFF M4. N2 is pulled back to 1 once M2 is turned ON, so the outputs are N1 = 0 and N2 = 1. As the circuit and its operation are symmetric, if l < r the evaluation results in N1 = 1 and N2 = 0.

The SR latch stores the value of the signals at nodes N1 and N2. During the reset phase (N1, N2) = (1, 1), so the output of the SR latch does not change. After the evaluation phase, if (N1, N2) = (0, 1) the output becomes 1, and if (N1, N2) = (1, 0) the output becomes 0. The difference in conductivity between the LIN and RIN is sensed and amplified by the sense amplifier. The cell operates more efficiently and reliably as the difference between the conductivity of the LIN and RIN increases [4]. Shorting the sources of transistors M16 and M17 ensures identical VD, VG and VS for each transistor in the LIN and RIN. Transistors M9 and M10 in the LIN and RIN provide internal feedback. This feedback, which involves nodes N1 and N2, improves the robustness of the cell.
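
The two-phase behaviour described above can be summarised in a small behavioural model. The following Python sketch is only a logic-level abstraction, not the transistor-level circuit: the class name `PnandCell`, its methods, and the use of integer counts `l` and `r` for the active devices in the LIN and RIN are hypothetical names introduced for illustration, and the latch polarity (output 1 for (N1, N2) = (0, 1), output 0 for (1, 0)) is an assumption.

```python
class PnandCell:
    """Logic-level abstraction of the PNAND cell (hypothetical model)."""

    def __init__(self):
        self.q = 0  # value held by the SR latch

    def reset(self):
        # clk = 0: N1 and N2 are both pulled high, so the latch holds its state.
        return (1, 1)

    def evaluate(self, l, r):
        # clk 0 -> 1: the network with more active devices conducts better
        # and wins the race in the sense amplifier.
        if l == r:
            raise ValueError("inputs must be assigned so that l != r")
        n1, n2 = (0, 1) if l > r else (1, 0)
        # Assumed latch polarity: (0, 1) sets the output, (1, 0) clears it.
        self.q = 1 if (n1, n2) == (0, 1) else 0
        return (n1, n2)
```

In this sketch, `evaluate(3, 1)` yields (N1, N2) = (0, 1) and sets the stored value, while a subsequent `reset()` returns (1, 1) and leaves the stored value untouched, mirroring the hold behaviour of the latch during the reset phase.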

Transistors M9 and M10 are used to avoid the high-impedance states (HiZ0, HiZ1) of nodes N5 and N6; for this reason they are called keeper transistors. Consider a case in which there are no active devices in the RIN and some active devices in the LIN. N5 and N6 are pulled down to zero during reset, and in the evaluation phase (clk 0 → 1) N6 is left at HiZ0 while N5 is pulled high. At this instant the circuit evaluates N1 = 0 and N2 = 1, with M4 inactive and M3 active. Now consider the converse condition, in which some of the devices in the RIN become active and all the devices in the LIN are inactive. N1 remains at 0, but N5 becomes HiZ1 and keeps M4 inactive. N6 rises to 1, turning M6 ON; M4 remains inactive as long as M2 is active, so no change takes place. However, N5, which is at HiZ1, is susceptible to being discharged. If N5 discharges, N1 rises and turns ON M4, N2 discharges, and the output is complemented. Transistors M9 and M10 prevent nodes N5 and N6 from ever entering HiZ0 or HiZ1. Fig.2 shows the simulation results of the Standard Cell ASIC.

3. SELF GATED RESONANT CLOCKED FLIP-FLOP

The clock distribution network can consume as much as 70% of the total power of a modern microprocessor. Clock power dissipation increases as the complexity of VLSI systems increases [5, 6]. Energy recovery flip-flops include the Sense Amplifier Energy Recovery (SAER) flip-flop, the Static Differential Energy Recovery (SDER) flip-flop, and the Differential Ended Conditional Capturing Energy Recovery (DCCER) flip-flop [7, 8].

The SAER flip-flop operates fast, but its nodes are continuously set and reset during the precharge condition. The Static Differential Energy Recovery flip-flop was introduced to overcome this drawback. This design is based on the method of conditional capturing and evaluation. The Differential Ended Conditional Capturing Energy Recovery flip-flop [10] is another design, which removes the unnecessary internal transitions associated with conditional capturing.

This section discusses the Self Gated Resonant (SGR) clocked flip-flop [2], which offers reduced dynamic power dissipation owing to its reduced transistor count and nodal capacitance. It avoids charge sharing, charge leakage and floating nodes, and it exhibits a negative setup time. Because the clock signal accounts for a large share of the power, a relevant solution is to use a resonant clock. A power clock signal with a sinusoidal or trapezoidal shape makes it possible to recover the energy used in charging the nodes [9]. Power dissipation can thus be minimized by the adiabatic principle of charging a node capacitance.
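
As a rough illustration of the adiabatic principle, the standard first-order comparison between conventional and ramped charging of a node can be written as follows. The symbols are assumptions introduced here and are not taken from the design itself: C_L is the node capacitance, V_DD the full voltage swing, R the effective resistance of the charging path, and T the ramp time of the power clock, with T much larger than R C_L.

```latex
E_{\text{conventional}} = \tfrac{1}{2}\, C_L V_{DD}^{2},
\qquad
E_{\text{adiabatic}} \approx \frac{R C_L}{T}\, C_L V_{DD}^{2}
\quad (T \gg R C_L)
```

Under these assumptions the dissipated energy falls as the clock transition is made slower relative to R C_L, which is why a gradually varying sinusoidal or trapezoidal power clock is preferred to an abrupt square wave.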

The self-gated resonant clocked flip-flop is shown in Fig.3. It is a quasi-adiabatic structure that uses resonant clocking. The pull-up network consists of two cross-coupled PFETs, P1 and P2. The pull-down network comprises transistors N1, N2, N3 and N4. Transistor N4 is fed the resonant clock PCLK, and the gate of N3 is fed PCLKB. Floating nodes are avoided by the low-impedance transistors N5 and N6.

The charge leakage that occurs during the evaluation phase is reduced by the stacked transistors N1, N3 and N4 (N2, N3 and N4). When all three transistors in a stack turn ON simultaneously, charge sharing can occur. This is avoided by placing N4, which has a relatively large size, at the bottom of the stack. The enable signal EN is one of the inputs to the NOR gate.

Consider QB = 1, with internal nodal capacitances CA and CB at nodes A and B as shown in Fig.3. Depending on the inputs applied to transistors N1/N3/N4 (and N2/N3/N4), discharge current flows between the output node and the internal nodes. Transistor N1 carries the discharge current of the load capacitance CL at the output node, N3 has to carry current from CL and CA, and N4 carries current from CL, CA and CB. Hence the delay is reduced by progressively sizing the transistors. Stacking the gating transistors also introduces body effect in the stack, thereby reducing the leakage power dissipation.

Conventionally, a square wave is used as the clock signal and AND logic is used for clock gating [10]; with such square-shaped clocks, energy recovery is not possible. If an AND gate is placed in the path of a sinusoidal clock, it destroys the shape of the clock. The SGR therefore implements its clock gating scheme with a NOR gate whose two inputs are EN and PCLK. When the enable signal is low, PCLK is inverted to give the power clock PCLKB. When EN is high, the output of the NOR gate is forced to zero, which gates the clock, disables N3 and puts the circuit in the idle state. The pull-down path of the SGR is thus turned OFF through N3.
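
The gating scheme can be checked with a logic-level truth table. The Python sketch below treats PCLK as a binary signal, which ignores its sinusoidal shape; the function names `nor` and `gated_clock` are hypothetical and serve only to illustrate the NOR relation between EN, PCLK and PCLKB.

```python
def nor(a, b):
    """Two-input NOR on 0/1 values."""
    return int(not (a or b))


def gated_clock(en, pclk):
    """Return PCLKB, the NOR of EN and PCLK (logic-level view only)."""
    return nor(en, pclk)


# Enumerate all input combinations.
for en in (0, 1):
    for pclk in (0, 1):
        print(f"EN={en} PCLK={pclk} -> PCLKB={gated_clock(en, pclk)}")
```

With EN = 0 the output is the inverse of PCLK, and with EN = 1 the output stays at 0 regardless of PCLK, which is the idle condition described above.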

4. OPERATION OF SELF GATED RESONANT CLOCKED FLIP FLOP

During the active mode of operation, the enable signal is low and input D is applied. PCLK activates N4 and PCLKB activates N3. The applied input D is captured on a rising transition and not on a falling transition. When the data input is high (D = 1), transistors N3, N4 and N1 turn ON and the output QB goes low. With QB low, transistor P2 is ON and output Q goes high. The applied data is latched to the output on the rising edge of the clock. The sinusoidal power clock PCLK recycles energy during the recovery phase. By turning OFF transistors N1, N3 and N4 during the recovery phase, the storage nodes Q and QB are isolated from the lower half of the circuit. This design achieves better charge recovery and avoids redundant internal switching.

The setup time is the minimum time for which the data input must be held steady before it is latched by an active transition of the control input [11]; that is, the D flip-flop requires input D to be stable for at least the setup time before the clock becomes active. The SGR has a negative setup time, and hence the delay of the flip-flop is also reduced.

Energy recovery is achieved by using a time-varying clock as the power supply. When such a supply is applied across the source and drain of a MOS device, the potential difference between these two terminals is reduced, which lowers the dissipation in the channels of the pull-up and pull-down networks. Energy is conveyed to the node during the evaluation phase. As long as the amplitude of the PCLK signal is greater than the threshold voltage of the devices in the recovery path, the energy is recycled during the recovery phase and the nodal charge is recovered. Fig.4 shows the simulation results of the SGR.

![Fig.4. Simulation results of Self Gated Resonant Clocked flip flop](image)
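
The capture and gating behaviour described in this section can be sketched as a logic-level model. This is an illustrative abstraction only: it treats PCLK as a 0/1 signal, ignores the analog energy-recovery behaviour and the negative setup time, and the class name `SgrFlipFlop` and its `step` method are hypothetical names introduced here.

```python
class SgrFlipFlop:
    """Logic-level abstraction of the SGR flip-flop (hypothetical model)."""

    def __init__(self):
        self.q = 0
        self._prev_pclk = 0

    def step(self, d, pclk, en=0):
        # Data is captured only on a rising clock transition.
        rising = self._prev_pclk == 0 and pclk == 1
        self._prev_pclk = pclk
        # EN = 1 gates the clock, so the stored value is held (idle state).
        if rising and en == 0:
            self.q = d
        return self.q, 1 - self.q  # (Q, QB)
```

Calling `step(1, 0)` followed by `step(1, 1)` captures D = 1 on the rising transition and returns (Q, QB) = (1, 0); a rising transition with `en=1` leaves the stored value unchanged.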

4.1 STANDARD CELL ASIC USING SGR

The Standard Cell ASIC using the SGR is obtained by replacing the Set-Reset latch in the architecture with the SGR. The SGR has fewer transistors than the SR latch, and power consumption falls as the number of transistors decreases. Moreover, since the SGR is a D flip-flop, the problem of the forbidden state of the SR latch is eliminated: a D flip-flop simply passes the value applied at its input. Fig.5 shows the diagram of the Standard Cell ASIC using SGR.

![Fig.5. Standard Cell ASIC using SGR](image)

Fig.6 shows the simulation results of the Standard Cell ASIC using SGR.

![Fig.6. Simulation results of Standard Cell ASIC using SGR](image)

Table 1 shows the performance comparison of the standard cell ASIC using the SR latch and using the SGR.

Table 1. Comparison of Standard Cell ASICs using SR latch and SGR

<table>
<thead>
<tr>
<th>Circuit</th>
<th>Delay (in ns)</th>
<th>Power (in mW)</th>
<th>Area (No. of transistors)</th>
</tr>
</thead>
<tbody>
<tr>
<td>Standard Cell ASIC using SR Latch</td>
<td>4.179</td>
<td>2.252</td>
<td>30</td>
</tr>
<tr>
<td>Standard Cell ASIC using SGR</td>
<td>2.501</td>
<td>1.772</td>
<td>26</td>
</tr>
</tbody>
</table>
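
The relative improvement implied by Table 1 can be worked out directly from the tabulated values. The short Python sketch below does this arithmetic; the helper names `percent_reduction` and `pdp_pj` are hypothetical, and the power-delay product is a derived figure computed here for illustration rather than a result reported in the paper.

```python
def percent_reduction(baseline, improved):
    """Percentage by which `improved` is lower than `baseline`."""
    return 100.0 * (baseline - improved) / baseline


# Values taken from Table 1.
sr_latch = {"delay_ns": 4.179, "power_mw": 2.252, "transistors": 30}
sgr = {"delay_ns": 2.501, "power_mw": 1.772, "transistors": 26}

reductions = {key: percent_reduction(sr_latch[key], sgr[key]) for key in sr_latch}


def pdp_pj(cell):
    """Power-delay product; ns multiplied by mW gives pJ."""
    return cell["delay_ns"] * cell["power_mw"]


for key, value in reductions.items():
    print(f"{key}: {value:.1f}% lower with SGR")
print(f"PDP: {pdp_pj(sr_latch):.2f} pJ (SR latch) vs {pdp_pj(sgr):.2f} pJ (SGR)")
```

With the figures from Table 1 this gives reductions of roughly 40% in delay, 21% in power and 13% in transistor count.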

5. CONCLUSION

In this paper, we have proposed the design of a Standard Cell ASIC using the Self Gated Resonant clocked flip-flop. The circuit is implemented using Cadence tools in 180nm technology. The proposed design achieves lower power consumption than the existing standard cell ASIC design. The Standard Cell ASIC using SGR consumes 1.772 mW with a delay of 2.501 ns, compared with 2.252 mW and 4.179 ns for the standard cell ASIC using the SR latch (Table 1). Hence, by implementing the proposed standard cell ASIC and inserting it into the standard cell library, it is possible to design a number of energy-efficient as well as fast systems.

REFERENCES


