OWAIS AHMAD SHAH et al.: PVT IMMUNE POWER EFFICIENT DUAL EDGE TRIGGERED FLIP FLOP FOR PORTABLE APPLICATIONS UPTO GHZ FREQUENCY RANGE DOI: 10.21917/ijme.2022.0241

# PVT IMMUNE POWER EFFICIENT DUAL EDGE TRIGGERED FLIP FLOP FOR PORTABLE APPLICATIONS UPTO GHZ FREQUENCY RANGE

## Owais Ahmad Shah<sup>1</sup>, Geeta Nijhawan<sup>2</sup> and Imran Ahmed Khan<sup>3</sup>

<sup>1,2</sup>Department of Electronics and Communication Engineering, Manav Rachna International Institute of Research and Studies, India <sup>3</sup>Department of Electronics and Communication Engineering, Jamia Millia Islamia, India

#### Abstract

This paper studies the behaviour of recent dual-edge triggered flip flops (DETFFs) under extensive variations in voltage, temperature, frequency, data activity and process corners, with respect to power, area and speed of operation. Six state-of-the-art DETFFs were compared in 32 nm CMOS technology for their robustness under process, voltage and temperature (PVT) variations. Simulations were carried out in T-SPICE at nominal operating conditions of 200 MHz clock frequency, 25 °C temperature, 0.9 V supply voltage and 50% data activity. The results show that Khan's FF saves between 13.82% and 44.86% power at nominal conditions. Khan's FF is also the fastest, with a minimum advantage of 46.47% over the others. At higher temperatures (>75 °C), Lee's FF outperforms all other designs. Wang's FF is the most area-efficient design and requires between 2.03% and 24.92% less area. When the DETFFs were tested for power efficiency as a 4-bit shift register, Lee's FF dissipated the least power, followed by Khan's FF.

#### Keywords:

Flip Flop, CMOS Digital Circuit, Low Power, Dual-Edge Triggered, Shift Register

## 1. INTRODUCTION

Modern VLSI circuits must carefully consider their power consumption, particularly in low-power applications [1]. Power optimization strategies are used at various stages of the digital CMOS design process, and optimization at the logic level is among the most important of them. Latches and flip-flops are crucial logic elements for the functionality of digital systems [2]. The D flip flop (DFF) in particular is often used in test applications and memory designs. The latency from the clock edge to the DFF output, the area and the output load capacitance are some of the issues in flip flop design [3]. The overall performance of a DFF is determined by variables such as its power dissipation and clock (CLK) frequency. The dynamic power consumption of the flip flop clock tree is strongly affected by the clock frequency.
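As a reminder of why the clock frequency matters, the standard first-order expression for CMOS switching power is given below. The symbols $\alpha$ (switching activity), $C_L$ (switched capacitance), $V_{DD}$ (supply voltage) and $f_{CLK}$ (clock frequency) are generic textbook notation and are not quantities defined in this paper.

$$P_{dyn} = \alpha \, C_L \, V_{DD}^{2} \, f_{CLK}$$

With the other terms held fixed, halving $f_{CLK}$ halves the switching power of the clock network, which is the motivation for dual-edge triggering.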

Pulse triggered flip-flops and master-slave flip-flops are often used in modern microprocessors [4]. Pulse triggered flip flops (PFFs) offer greater power efficiency and speed of operation but have low race tolerance. Master-slave flip flops (MSFFs), on the other hand, have better race tolerance than other flip flops but poorer power and latency performance. The pulse-based topology may still generate complicated logic. PFFs with minimal serviceable delay are better suited for mainframe architectures with frequency ranges of a few GHz due to circuit timing [5]. Compared with single edge-triggered flip flops (SETFFs), DETFFs can halve the CLK frequency while retaining the same data throughput, even if the clock frequency is dictated by system requirements [6]. Choosing a circuit architecture with less data activity at the internal nodes and fewer clocked nodes is one of the most effective approaches to decreasing total energy dissipation. For example, it is often possible to achieve lower energy usage with static circuits than with dynamic circuits, because each clock cycle in a dynamic circuit must include both precharge and discharge operations for the dynamic nodes [7].

Without affecting system throughput, the clock frequency may be decreased by 50% by sampling the input data on both the rising edge and the falling edge of the CLK. This effectively cuts the clock network's power dissipation in half, saving a large amount of system power in the process. However, implementing the DET function requires additional registers that help to propagate, store and sample the inputs at both the rising and falling edges of the CLK. Despite being more complicated and often bigger than their SETFF counterparts, dual edge-triggered flip-flops may be engineered to be more energy-efficient [8], which results in further power savings.
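The sampling behaviour described above can be illustrated with a small behavioural model. This is only a sketch at the level of sampled 0/1 waveforms; the function name `edge_triggered_ff` and its `edges` parameter are illustrative and do not come from any of the cited designs.

```python
def edge_triggered_ff(clk, d, edges="both", q0=0):
    """Behavioural D flip-flop on sampled 0/1 waveforms.

    edges="both"   -> dual-edge triggered (captures D on every CLK transition)
    edges="rising" -> single-edge triggered (captures D on 0 -> 1 only)
    Returns the Q value after each sample.
    """
    if len(clk) != len(d):
        raise ValueError("clk and d must have the same length")
    if edges not in ("both", "rising"):
        raise ValueError("edges must be 'both' or 'rising'")
    q, prev, out = q0, clk[0], []
    for c, data in zip(clk, d):
        rising = prev == 0 and c == 1
        falling = prev == 1 and c == 0
        if rising or (edges == "both" and falling):
            q = data  # capture D at the active clock edge
        prev = c
        out.append(q)
    return out


clk = [0, 1, 1, 0, 0, 1, 1, 0]
d = [1, 1, 0, 0, 1, 1, 0, 0]
q_det = edge_triggered_ff(clk, d, edges="both")
q_set = edge_triggered_ff(clk, d, edges="rising")
```

For the same clock waveform, the dual-edge model captures four data values where the single-edge model captures two, which is why a DETFF can run at half the clock frequency for the same throughput.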

Storage elements that are activated on both rising and falling CLK edges have been studied extensively, and numerous DET designs have been proposed [4-18]. The transmission-gate latch-MUX is the most widely used of these cells because of its straightforward design, which is based on master-slave latches and an output multiplexer. The C2MOS latch-MUX [19] is a different arrangement that may be created by replacing the transmission gates with C2MOS gates. An alternative strategy, used to implement a pulse-triggered DETFF, is to produce a brief pulse, referred to as clock gating, on each clock transition, as demonstrated in [17]. The symmetric pulse generator FF and the conditional discharge FF are more sophisticated DETFFs that restrict data activity through precharge conditions and conditional pulses.

Even though these topologies have been tested in a variety of applications, only a small number of DETFFs have been studied under deep process variations, wide temperature deviations, voltage fluctuations, and corner and data activity analysis, all of which help in creating energy-efficient systems. The use of both clock phases (rising and falling edges) often produces some degree of clock overlap, which may result in race conditions and unfavourable circuit behaviour, especially in the presence of significant process variation. For instance, in the conventional DET, changes in PVT might make this overlap expand to the point where the input bit currently retained is overwritten, leading to a catastrophic logic error.

## 2. REVIEW OF STATE-OF-ART DETFFS

The positive and negative latch architecture, whose output is combined with a transmission gate (TG) based multiplexer (MUX), was proposed by Lee in [13]; it is referred to as Lee's FF and is shown in Fig.1. This DETFF works by sending the latched data at the complement node to the output node of INV3, i.e. Q, through the multiplexer. Two-phase CLK signals are necessary for operating a transmission gate based MUX and may be produced by a straightforward inverter chain. However, due to the latency disparity between the clock and its complement (CLKB) at the TG-MUX, such an architecture might cause data upset during clock overlap. Even with a tri-state inverter based MUX, the pull-down/pull-up is still weak. As a result, Lee's FF uses a fully static complementary MOS multiplexer to:

- Protect bits from being disrupted by clock overlap, and
- Maintain full swing on all nodes at all times.

By using minimization techniques for digital circuits, the multiplexing task is performed entirely with signals that exist internally. The result is a fully static complementary MOS MUX. The CMOS MUX runs without performance issues because all of the terminals in the master-slave latch are static, and the latch has full voltage swing. The CLK toggles from low to high and then from high to low. When the initial value of D1 is low and both the CLK and the main input D are also low, the upper latch holds the data already stored in the starting state. When the initial D1 is '1' and the main input D is also '1', the stacked NMOS in the clocked feedback inverter is OFF because X is '0'. However, because of the inverted output of the positive latch, all nodes are logically driven to VDD or GND, so the feedback inverter's pull-up path is of utmost importance. Meanwhile, since node Z is ANDed with CLK at the MUX, the output Q is completely controlled by node X. Since X is required to be low and CLK is high on the rising clock edge, the data previously held in the lower latch is transmitted to the output Q through the multiplexer. For the upper (positive) latch, the clocked input inverter is fully ON and the clocked feedback inverter is fully OFF, thereby updating the internal data. At the falling edge of the CLK, the output Q receives the new data through the upper latch while the lower latch reverts to transparency. While this DETFF uses the same single-phase clock technique, the conditional clock signals (nodes X/Y) allow a reduction in the number of transistors connected to CLK, which reduces the power consumption caused by clock transitions.

Another DETFF, known as Lapshev's FF, was proposed by Lapshev in [14] and is shown in Fig.2; it uses a C-element. A C-element is a three-terminal device that typically has two inputs and one output. It was first described in [20]. When all inputs are identical, the output changes state to reflect the input value; when they are not identical, the prior output value is kept. The right arrangement of signal levels on the inputs may be used to set and reset this device, which functions as a latch. Lapshev's DETFF overcomes the power issues owing to CLK changes since signal levels at node X and node Y toggle after each clock transition independent of what happens to input and output. The input stage of the FF must satisfy the following conditions in order for the C-element output to operate properly:

- During the time between clock edges, at least one of the nodes X and Y must remain at Q to prevent the output from flipping,
- Once the clock signal changes, both node X and Y must be at the input voltage level for the output Q to change to D.
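The C-element behaviour and the two conditions above can be sketched with a small behavioural model. The class name `CElement` and its interface are illustrative only; the model ignores timing, drive strength and floating nodes.

```python
class CElement:
    """Two-input Muller C-element: output follows the inputs when they
    agree and holds its previous value when they differ."""

    def __init__(self, initial=0):
        self.out = initial

    def update(self, a, b):
        if a == b:
            self.out = a  # inputs agree: output takes their value
        return self.out   # inputs differ: previous output is kept


c = CElement(initial=0)
# Between clock edges: one node (here b) stays at Q = 0, so Q cannot flip.
hold = c.update(1, 0)
# After a clock edge: both nodes X and Y are at the input level D = 1.
switch = c.update(1, 1)
# One node moves away again; the output keeps the latched value.
keep = c.update(0, 1)
```

In this sketch the two arguments play the role of nodes X and Y, and the returned value plays the role of Q.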



Fig.1. Architecture of Lee's FF [13]



Fig.2. Architecture of Lapshev's FF [14]

The signal at the other node is unimportant for proper function of the output stage provided one of the nodes X or Y is kept at Q. With C-element implementations, the node that is not kept at the output voltage can be at any permissible voltage without impacting the output. Lapshev's FF meets these specifications by using two separate latches at X and Y. These two latches are redundant in nature and depend on signals such as D and CLK. The weak inner C-elements of the implicit pulsed FF are cross-coupled to guarantee proper functioning. Cross-coupling guarantees that nodes X and Y are at different voltages and that one of the two nodes is at Q.

In the enhanced floating-node FF architecture, the node that is not at the output level between clock transitions is neither driven nor has its signal level reinforced. This behaviour is implemented by feeding node Z, which the output then follows, into the internal cross-coupled weak C-elements. If the combination of the input and the supplied CLK signal is such that the driving transistors of an input C-element keep either X or Y at the Q level, the other terminal is left floating. This does not affect the circuit's general static behaviour because, as already explained, the output cannot be affected by the signal level at the floating node provided the other terminal is at Q. Compared with other DETFF designs, this FF should ideally use less power during clock transitions without using more during input switching.

The third DETFF considered in this work is Sabu's FF, proposed in [15] and depicted in Fig.3. The overall number of transistors in this flip-flop has been significantly decreased while maintaining performance and cell area. The clocked transistors receive particular attention since they contribute more to power dissipation. Because it is a dynamic DETFF, the required clock frequency is halved, as twice the data can be supplied at a given operating frequency. As the operating frequency is reduced, the dynamic power consumption decreases substantially.



Fig.3. Architecture of Sabu's FF [15]

Three stages make up the circuit: first, a string of transistors (both NMOS and PMOS), then two TGs, and finally the inverter. Series-connected inverters are not used to create a delay in the flip-flop; instead, a strong inverter is used to prevent noise coupling. The fewest possible transistors are used in its implementation to minimize parasitics. The clock driver cannot be removed since the TGs require a two-phase clock signal. Both transistor types (NMOS and PMOS) are needed in each TG to eliminate the signal degradation caused by single-channel MOS pass transistors.

Sabu's DETFF has a stacked structure. The stacking approach used here is known as forced stacking of transistors (FST) and is used to modify the inverter, which is the third stage of the circuit. To create a stacking effect in the output section, two extra transistors with identical widths are used in place of the pull-down and pull-up transistors. In contrast to series stacking, this results in a distinct stacking effect in which two transistors are turned OFF at the same moment, creating the least amount of leakage current. To lessen the impact on propagation delay, FST is employed at the inverter stage only. Devices with high threshold voltage and forced-stacked gates do not have quicker output edge rates. Several stacking techniques are used in CMOS devices to lessen leakage current. The clocked transistor portion employs series stacking. Implementing the FST approach for the transistors close to the power rails may also reduce static power in ON mode. Compared with the available leakage reduction strategies, this FF is simpler to implement in microprocessor-based digital circuits, but at the cost of propagation delay.

A different DETFF was proposed by Huang in [16] and is illustrated in Fig.4. It is a C-element-based anti-interference low-power DETFF. The DETFF has a clock tree made up of three C-element output stages, two internal latches employing the enhanced C-element, and inverters INV1 and INV2 attached to the input clock signal. The clock signal and inverted clock signal are produced by this clock tree circuit. When the CLK signal is '1' or '0', the internal latches, one receiving the clock as CLKa and the other as CLKb, latch the D input signal. The output stage sends to the output Q the value of D that has been latched by the internal latch. When D varies between the clock edges, the output Q does not. Between the very first rising edge of the CLK and the first falling edge, the second latch maintains a high-impedance state at logic '1', and node X is also at '1'. The output Q is then high, i.e. at logic '1'. When D is reversed, Y is switched to the low logic state, X is maintained at the high logic state, and the output Q after the C-element remains '1'. The output signal Q attains the state of D and does not change otherwise.



Fig.4. Architecture of Huang's FF [16]

Node Y is high and the output Q is low just before the second falling edge of the CLK; X and D are also low at this point. D stays low even after this second falling edge. At this moment, nodes Y and X are both high, and the state is the previous low logic, hence the output Q continues to be at '0'. The output Q switches to D only when a CLK edge arrives and Q differs from D. The output node always keeps the prior state, while node X or Y always remains in the state of D. The terminal that is not in the input state D switches to D as soon as the clock edge appears. The output Q changes to D at the point when the states of nodes X and Y are both identical to D, which causes triggering on both edges. The use of the enhanced C-element results in the high-impedance state, which occurs when the first or second latch is turned OFF and has no impact on the FF's static properties. The circuit's power consumption decreases at the same time since redundant transitions are removed.

Fig.5 shows the architecture of Wang's DETFF [17]. To obtain the correct functionality when employing pulse generation for DETFFs, a large number of transistors are required. Since several transistors are connected to the CLK signal, the activity factor rises, leading to higher power usage. Therefore, the dual data-path concept is applied in a targeted manner. Additionally, transmission gates (TGs) are employed to avoid the threshold voltage drop that causes pass transistors to produce weak low/high signals. The two data paths are connected in parallel. On the rising edge of the CLK, the upper data path is activated, and on the falling edge of the CLK, the lower data path is activated. When the front TG (TG1 or TG2) is closed, an inverter (INV1 or INV2) and a PMOS transistor (P1 or P2) are used in conjunction to maintain the high/low logic level. The inverter drives its output low when the input D is '1', which causes the PMOS transistor to pull the data up to logic '1'. The inverter drives its output to '1' when the input D is '0', isolating the data from VDD and maintaining the low value. Every transistor in this circuit is minimum-sized, which significantly saves layout area.



Fig.5. Architecture of Wang's FF [17]

Fig.6 depicts the architecture of Khan's FF proposed in [18]. This flip-flop has two data paths and is essentially a master-slave flip-flop structure. Transmission gate TG1, inverter INV1 and transmission gate TG3 make up the upper data path. Transmission gate TG2, inverter INV3 and transmission gate TG4 make up the lower data path. The input D of the flip flop is connected to TG1 and TG2, and the input of inverter INV5 is taken from the outputs of TG3 and TG4. Both data paths have feedback loops that keep the output logic level constant even when the clock is halted, maintaining static functionality. The TGs in the two data paths are clocked in opposite phases so that the upper data path functions as a positive edge-triggered FF and the lower data path functions as a negative edge-triggered FF. Inverter INV2 and pass transistor P1 make up the feedback in the upper data path, and inverter INV4 and pass transistor P2 make up the feedback in the lower data path. The only difference between this design and Fig.5 is the feedback. Since the feedback loops are not on critical paths, pass transistors coupled with inverters were used to increase the flip flop's power efficiency. This also makes the design static, with increased performance, reduced power consumption and fewer transistors among its key benefits.



Fig.6. Architecture of Khan's FF [18]

## 3. SIMULATION RESULTS, FEATURE ANALYSIS AND DISCUSSIONS

The six state-of-the-art DETFFs discussed in Section 2 were extensively simulated and compared to investigate the performance parameters of each flip flop. Simulations were carried out in 32 nm CMOS technology in T-SPICE using standard PTM models. The nominal operating conditions are 25 °C temperature, 200 MHz clock frequency, 0.9 V supply voltage and 50% data activity. The input bit stream is 16-bit binary data.

Fig.7 shows the average power consumed by the flip flops for variations of $\pm 10\%$ in the supply voltage. When average power alone is considered, Huang's FF dissipated the most power at all voltage levels, whereas Khan's FF performed well at nominal and higher voltages. For lower voltages, however, Lee's FF outperforms the rest of the DETFFs. The corresponding delays are shown in Table.1. In terms of speed of operation, Khan's FF and Huang's FF are the fastest flip flops, with sub-picosecond delays (hundreds of femtoseconds) compared with picoseconds for the others.



Fig.7. Avg. power at different voltage levels

Table.1. Delay in ps at variations in voltage

| Voltage (V)  | 0.81  | 0.85  | 0.9   | 0.95  | 0.99  |
|--------------|-------|-------|-------|-------|-------|
| Lee's FF     | 14.53 | 12.94 | 11.34 | 10.01 | 9.08  |
| Lapshev's FF | 66.11 | 58.64 | 49.71 | 29.31 | 41.45 |
| Sabu's FF    | 103.6 | 66.47 | 39.01 | 29.05 | 25.31 |
| Huang's FF   | 0.944 | 0.887 | 0.835 | 0.796 | 0.774 |
| Wang's FF    | 16.65 | 14.55 | 12.93 | 10.89 | 10.15 |
| Khan's FF    | 0.537 | 0.488 | 0.447 | 0.419 | 0.407 |
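The speed advantage of Khan's FF quoted in the abstract can be reproduced from the nominal (0.9 V) column of Table.1. The sketch below assumes the advantage is defined as the relative delay reduction with respect to the slower flip flop; the function and variable names are illustrative.

```python
# Delays in picoseconds at 0.9 V, copied from Table.1.
delay_ps = {
    "Lee": 11.34,
    "Lapshev": 49.71,
    "Sabu": 39.01,
    "Huang": 0.835,
    "Wang": 12.93,
    "Khan": 0.447,
}


def speed_advantage_percent(reference, other):
    """Relative delay reduction of `reference` with respect to `other`."""
    return 100.0 * (delay_ps[other] - delay_ps[reference]) / delay_ps[other]


advantages = {
    name: speed_advantage_percent("Khan", name)
    for name in delay_ps
    if name != "Khan"
}
closest = min(advantages, key=advantages.get)
print(closest, round(advantages[closest], 2))  # Huang 46.47
```

Under this definition the smallest advantage is over Huang's FF, which matches the 46.47% minimum figure stated in the abstract.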

Power delay product (PDP) is one of the most important parameters in digital IC design. Fig.8(a) shows the PDP of all the DETFFs at variations in voltage. Because of its better power efficiency and lower latency, Khan's FF outclasses the other flip flops, followed closely by Huang's FF. Sabu's FF has the worst PDP, followed by Lapshev's C-element FF. A PDP analysis is incomplete unless it is performed over a wide temperature range. This examination was carried out from a minimum temperature of -40 °C to a maximum of 125 °C, and the result is shown in Fig.8(b). Although Khan's FF and Huang's FF performed well at all temperatures, it is worth noting that Lee's FF has a better PDP than the other DETs at temperatures above 75 °C.
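For readers unfamiliar with the metric, PDP is simply the product of average power and delay, which gives an energy per switching event. The sketch below shows the unit handling only: the delay is the Table.1 value for Khan's FF at 0.9 V, while the power figure is a hypothetical placeholder, since the measured powers are reported only graphically in Fig.7.

```python
def power_delay_product(avg_power_w, delay_s):
    """Return the power delay product in joules."""
    return avg_power_w * delay_s


# Delay of Khan's FF at 0.9 V from Table.1, converted from ps to s.
khan_delay_s = 0.447e-12
# Placeholder only: NOT a measured value from this paper.
HYPOTHETICAL_POWER_W = 1e-6

pdp_joules = power_delay_product(HYPOTHETICAL_POWER_W, khan_delay_s)
pdp_attojoules = pdp_joules * 1e18
```

Because the PDP scales linearly with both factors, a flip flop with sub-picosecond delay can afford somewhat higher power and still achieve a lower PDP than a slower design.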



Fig.8. PDP results (a) at different voltages, (b) at different temperatures

Table.2 gives a detailed comparison of the delay and size requirements of the flip flops. Wang's FF is the most compact design, having the minimum sum of widths and the lowest transistor count. Fig.9(a) and Fig.9(b) showcase the area advantage of Wang's FF over the other FFs. It was also observed that Lee's FF and Sabu's FF have the highest transistor count and sum of widths, respectively.

All DET flip flops were also tested across a wide frequency band ranging from 100 MHz to 1 GHz. At frequencies above 500 MHz, Sabu's FF consumed the highest power, whereas at frequencies below 500 MHz Huang's FF dissipated the most power. Khan's FF yielded the best power results at all frequencies except 500 MHz. These power results can be seen in Fig.10.

It is important to know the behaviour of a flip flop at different input data activities. For a conclusive outcome, this test was conducted with seven different patterns at four different frequencies. The test patterns covered data activities ranging from 100% to 0%. For zero data activity, both all-high and all-low logic were considered.
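As an illustration of what a data activity figure means, the sketch below computes the percentage of bit-to-bit transitions in a 16-bit pattern. It assumes that the pattern repeats, so the last bit is compared with the first; this definition and the example patterns are assumptions for illustration, as the paper does not list the seven patterns used.

```python
def data_activity_percent(bits):
    """Percentage of bit positions at which a repeating pattern toggles.

    `bits` is a string of '0'/'1' characters. The pattern is treated as
    periodic, so the last bit is compared with the first.
    """
    n = len(bits)
    if n == 0 or set(bits) - {"0", "1"}:
        raise ValueError("bits must be a non-empty string of 0s and 1s")
    toggles = sum(bits[i] != bits[(i + 1) % n] for i in range(n))
    return 100.0 * toggles / n


# Illustrative 16-bit patterns (hypothetical, not taken from the paper).
examples = {
    "1010101010101010": data_activity_percent("1010101010101010"),
    "1100110011001100": data_activity_percent("1100110011001100"),
    "1111000011110000": data_activity_percent("1111000011110000"),
    "0000000000000000": data_activity_percent("0000000000000000"),
    "1111111111111111": data_activity_percent("1111111111111111"),
}
```

The last two patterns correspond to the all-low and all-high cases of zero data activity mentioned above.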

Table.2. Delay and Area Comparison of Different Flip Flops

| Flip flop                  | Lee's FF | Lapshev's FF | Sabu's FF | Huang's FF | Wang's FF | Khan's FF |
|----------------------------|----------|--------------|-----------|------------|-----------|-----------|
| CLK-Q delay at -40 °C (ps) | 10.67    | 34.01        | 34.53     | 0.459      | 9.02      | 0.232     |
| CLK-Q delay at 0 °C (ps)   | 9.75     | 23.35        | 38.33     | 0.661      | 8.54      | 0.349     |
| CLK-Q delay at 25 °C (ps)  | 11.34    | 49.71        | 39.01     | 0.835      | 12.93     | 0.447     |
| CLK-Q delay at 50 °C (ps)  | 12.66    | 29.22        | 47.41     | 1.05       | 14.7      | 0.581     |
| CLK-Q delay at 75 °C (ps)  | 0.726    | 34.61        | 51.08     | 1.301      | 15.66     | 0.771     |
| CLK-Q delay at 100 °C (ps) | 0.94     | 39.59        | 54.64     | 1.609      | 16.89     | 1.143     |
| CLK-Q delay at 125 °C (ps) | 1.268    | 21.56        | 56.94     | 1.996      | 1.282     | 2.194     |
| No. of transistors         | 36       | 34           | 22        | 32         | 18        | 22        |
| Sum of width (µm)          | 8.08     | 8.16         | 10.04     | 7.68       | 7.52      | 7.76      |
| D-Q delay (ns)             | 1.2989   | 1.3247       | 1.314     | 1.2758     | 1.2879    | 1.2754    |
| Layout area (µm²)          | 0.25856  | 0.26112      | 0.32128   | 0.24576    | 0.24064   | 0.24832   |
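The layout area figures in Table.2 appear to equal the sum of transistor widths multiplied by the 32 nm channel length. This relationship is an observation from the tabulated numbers rather than a method stated in the text, and the short check below is only a sketch of that assumption.

```python
# Assumption: area = (sum of widths) x (channel length of the 32 nm node).
CHANNEL_LENGTH_UM = 0.032

# Values copied from Table.2.
sum_of_width_um = {
    "Lee": 8.08, "Lapshev": 8.16, "Sabu": 10.04,
    "Huang": 7.68, "Wang": 7.52, "Khan": 7.76,
}
layout_area_um2 = {
    "Lee": 0.25856, "Lapshev": 0.26112, "Sabu": 0.32128,
    "Huang": 0.24576, "Wang": 0.24064, "Khan": 0.24832,
}


def estimated_area_um2(total_width_um):
    """Active-area estimate in square micrometres under the assumption above."""
    return total_width_um * CHANNEL_LENGTH_UM


max_abs_error = max(
    abs(estimated_area_um2(sum_of_width_um[name]) - layout_area_um2[name])
    for name in sum_of_width_um
)
smallest = min(layout_area_um2, key=layout_area_um2.get)
```

Under this assumption the estimate agrees with every tabulated area to within floating-point rounding, and Wang's FF has the smallest area.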

The frequencies used are 100 MHz, 200 MHz, 500 MHz and 1 GHz. All DETFFs functioned correctly under these variations, and the results are shown in Fig.11. As expected, Sabu's FF consumed the highest power for all test patterns at frequencies above 500 MHz, whereas Huang's FF consumed the highest power at higher data activities (>50%) at 100 MHz and 200 MHz. Apart from the 500 MHz case, Khan's FF performed better in general, specifically for data activities ranging from 25% to 75%. No conclusive results could be obtained for low data activities, as the DETFFs behaved differently under these variations, as is evident from Fig.11.









Fig.10. Avg. power at different frequencies

DETFF testing is incomplete unless it covers extreme PVT variations. For this reason, simulations were also performed at the process corners. Fig.12 shows the average power at the different corners. Khan's FF performed well at the FF, FS and TT corners, while Lee's FF had the minimum power at the SF and SS corners. The corner-case PDP analysis was performed for both CLK-Q and D-Q delay; for the D-Q PDP, nearly the same trends were observed as in Fig.12. When the CLK-Q delay was taken into account, Khan's FF held up well at FF, FS and TT but was the slowest at SF, where Wang's FF was the fastest. At the SS corner, Lee's FF performed better than all the others. This can be seen in Fig.13.










Fig.11. Avg. power at variations in data activity (a) at 100 MHz, (b) at 200 MHz, (c) at 500 MHz, and (d) at 1 GHz



Fig.12. Avg. power at various corner cases



Fig.13. PDP analysis for corner cases (a) CLK-Q PDP (b) D-Q PDP

## 4. DUAL EDGE TRIGGERED SHIFT REGISTER

The DETFFs studied in this work were implemented as a 4-bit shift register to verify the driving capability of the flip flops. This test was performed at three different voltage levels for both average power and RMS power. Table.3 shows the power results: Lee's FF had the best driving capability and consumed the least average power, followed by Khan's FF. Wang's FF had the worst driving capability and was not able to drive the output nodes to the desired voltage levels. In terms of RMS power, Huang's FF outperformed the others, while Khan's FF had the highest RMS power. The shift register was also tested at frequencies of 100 MHz, 200 MHz and 500 MHz, and the results are shown in Fig.14. At all frequencies, Lee's FF yielded the best power results.
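The logical behaviour of such a shift register can be sketched with a behavioural model in which every stage captures its input on both clock edges. The function name `det_shift_register` is illustrative, and the model says nothing about the electrical driving capability that the test above is designed to probe.

```python
def det_shift_register(clk, serial_in, n_bits=4):
    """Behavioural n-bit shift register built from dual-edge triggered stages.

    clk and serial_in are equal-length sequences of 0/1 samples. On every
    CLK transition (rising or falling) the register shifts by one position
    and the first stage captures the serial input.
    Returns the register state (stage 1 first) after each sample.
    """
    if len(clk) != len(serial_in):
        raise ValueError("clk and serial_in must have the same length")
    state = [0] * n_bits
    prev = clk[0]
    history = []
    for c, d in zip(clk, serial_in):
        if c != prev:                 # any clock edge triggers a shift
            state = [d] + state[:-1]
        prev = c
        history.append(tuple(state))
    return history


clk = [0, 1, 0, 1, 0]
serial_in = [1, 1, 0, 1, 1]
states = det_shift_register(clk, serial_in)
```

In this example four bits are shifted in over two clock periods, whereas a single-edge design would need four periods.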

Table.3. Power results of shift register

| Flip Flop    | Avg. power (µW) at 0.81 V | Avg. power (µW) at 0.9 V | Avg. power (µW) at 0.99 V | RMS power (µW) at 0.81 V | RMS power (µW) at 0.9 V | RMS power (µW) at 0.99 V |
|--------------|------|------|-------|------|------|------|
| Lee's FF     | 2.73 | 3.65 | 4.82  | 33.1 | 43.8 | 52.9 |
| Lapshev's FF | 4.32 | 5.45 | 6.89  | 25.8 | 35.5 | 47.4 |
| Sabu's FF    | 4.02 | 4.99 | 7.24  | 29.9 | 37.9 | 52.7 |
| Huang's FF   | 5.41 | 8.12 | 13.12 | 23.9 | 33.1 | 45.1 |



Fig.14. Average power of shift registers at different frequencies

## 5. CONCLUSION

Six recent DET flip flops have been presented and compared in 32 nm CMOS technology. Extensive simulations were carried out to demonstrate the robustness of these flip flops in terms of power efficiency. If power alone is the deciding factor, Lee's FF is recommended for voltages below 0.9 V and Khan's FF for voltages above 0.9 V. Lee's FF is also recommended for temperature-sensitive circuits operating above 75 °C. The PDP of Khan's FF was found to be better than the rest. Khan's FF also showed better results for data activities of 25% to 75%. For these reasons, Khan's FF is the best candidate for portable devices, followed by Huang's FF. No conclusive results were found for low data activities, although at nominal operating conditions Huang's FF showed better results for 0% activity (all low). If area is a major constraint, Wang's FF is the go-to architecture because of its compactness. The designs were also implemented as a 4-bit shift register, in which Lee's FF was found to have the best driving capability and the lowest power consumption.

## REFERENCES

- [1] J.L. Shin, R. Golla, H. Li, S. Dash, Y. Choi and A. Smith, "The Next Generation 64b SPARC Core in a T4 SoC Processor", *IEEE Journal of Solid-State Circuits*, Vol. 48, No. 1, pp. 82-90, 2013.
- [2] O.A. Shah, I.A. Khan, G. Nijhawan and I. Garg, "Low Transistor Count Storage Elements and their Performance Comparison", *Proceedings of International Conference on Advances in Computing, Communication Control and Networking*, pp. 801-805, 2018.
- [3] P.C. Hsieh, J.S. Jhuang, P.Y. Tsai and T.D. Chiueh, "A Low-Power Delay Buffer using Gated Driver Tree", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 17, No. 9, pp. 1212-1219, 2009.
- [4] M.W. Phyu, K. Fu, W.L. Goh and K.S. Yeo, "Power-Efficient Explicit-Pulsed Dual-Edge Triggered Sense-Amplifier Flip-Flops", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 19, No. 1, pp. 1-9, 2011.

- [5] H. Thapliyal, N. Ranganathan and S. Kotiyal, "Design of Testable Reversible Sequential Circuits", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 21, No. 7, pp. 1201-1209, 2013.
- [6] A. Bonetti, A. Teman and A. Burg, "An Overlap-Contention Free True-Single-Phase Clock Dual-Edge-Triggered Flip-Flop", *Proceedings of IEEE International Symposium on Circuits and Systems*, pp. 1850-1853, 2015.
- [7] M.W. Phyu, W.L. Goh and K. S. Yeo, "A Low-Power Static Dual Edge-Triggered Flip-Flop using an Output-Controlled Discharge Configuration", *Proceedings of IEEE International Symposium on Circuits and Systems*, pp. 2429-2432, 2005.
- [8] L.Y. Chiou and S.C. Luo, "Energy-Efficient Dual-Edge-Triggered Level Converting Flip Flops With Symmetry in Setup Times and Insensitivity to Output Parasitics", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 17, No. 11, pp. 1659-1663, 2009.
- [9] X. Wang and W.H. Robinson, "Asynchronous Data Sampling Within Clock-Gated Double Edge-Triggered Flip-Flops", *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 60, No. 9, pp. 2401-2411, 2013.
- [10] C.Y. Kim and H.C. Lee, "Low-Power, High-Sensitivity Readout Integrated Circuit with Clock-Gating, Double-Edge-Triggered Flip-Flop for Mid-Wavelength Infrared Focal-Plane Arrays", *IEEE Sensors Letters*, Vol. 3, No. 9, pp. 1-4, 2019.
- [11] N. Kawai, S. Takayama, J. Masumi, N. Kikuchi and K. Ogawa, "A Fully Static Topologically-Compressed 21-Transistor Flip-Flop with 75% Power Saving", *IEEE Journal of Solid-State Circuits*, Vol. 49, No. 11, pp. 2526-2533, 2014.
- [12] M. Alioto, E. Consoli and G. Palumbo, "Analysis and Comparison in the Energy-Delay-Area Domain of Nanometer CMOS Flip-Flops: Part I - Methodology and Design Strategies", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 19, No. 5, pp. 725-736, 2011.

- [13] Y. Lee, G. Shin and Y. Lee, "A Fully Static True-Single-Phase-Clocked Dual-Edge-Triggered Flip-Flop for Near-Threshold Voltage Operation in IoT Applications", *IEEE Access*, Vol. 8, pp. 40232-40245, 2020.
- [14] S. Lapshev and S.M.R. Hasan, "New Low Glitch and Low Power DET Flip-Flops using Multiple C-Elements", *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 63, No. 10, pp. 1673-1681, 2016.
- [15] N.A. Sabu and K. Batri, "Design and Analysis of Power Efficient TG based Dual Edge Triggered Flip-Flops with Stacking Technique", *Journal of Circuits, Systems and Computers*, Vol. 29, No. 8, pp. 2050123-2050134, 2019.
- [16] Z. Huang, X. Yang, T. Song, H. Qi, Y. Ouyang and T. Ni, "Anti-Interference Low-Power Double-Edge Triggered Flip-Flop based on C-Elements", *Tsinghua Science and Technology*, Vol. 27, No. 1, pp. 1-12, 2022.
- [17] X. Wang and W.H. Robinson, "A Low-Power Double Edge-Triggered Flip-Flop with Transmission Gates and Clock Gating", *Proceedings of IEEE International Midwest Symposium on Circuits and Systems*, pp. 205-208, 2010.
- [18] I.A. Khan, D. Shaikh and M.T. Beg, "2 GHz Low Power Double Edge Triggered Flip-Flop in 65nm CMOS Technology", *Proceedings of IEEE International Conference on Signal Processing, Computing and Control*, pp. 1-5, 2012.
- [19] A. Gago, R. Escano and J.A. Hidalgo, "Reduced Implementation of D-Type DET Flip-Flops", *IEEE Journal of Solid-State Circuits*, Vol. 28, No. 3, pp. 400-402, 1993.
- [20] D.E. Muller, "Theory of Asynchronous Circuits", Internal Report, Digital Computer Lab., University of Illinois, No. 66, pp. 1-124, 1955.