OWAIS AHMAD SHAH et al.: 100MHZ-2GHZ PULSE TRIGGERED FLIP-FLOPS IN 32NM TECHNOLOGY FOR LOW POWER AND HIGH PERFORMANCE DIGITAL CMOS CIRCUITS DOI: 10.21917/ijme.2022.0236

# 100MHZ-2GHZ PULSE TRIGGERED FLIP-FLOPS IN 32NM TECHNOLOGY FOR LOW POWER AND HIGH PERFORMANCE DIGITAL CMOS CIRCUITS

#### Owais Ahmad Shah<sup>1</sup>, Geeta Nijhawan<sup>2</sup> and Imran Ahmed Khan<sup>3</sup>

<sup>1,2</sup>Department of Electronics and Communication Engineering, Manav Rachna International Institute of Research and Studies, India <sup>3</sup>Department of Electronics and Communication Engineering, Jamia Millia Islamia, India

#### Abstract

This study presents an extensive evaluation of pulse triggered flip flops (P-FFs) in terms of power consumption, area requirements and delay. Six recent state-of-the-art P-FFs are compared on these performance parameters: Conditional Pulse Enhancement P-FF (CPEPFF), Signal Feed-through P-FF (SFTPFF), Karimi's P-FF (KPFF), Conditional Feed-through P-FF (CFTPFF), Dual Dynamic node hybrid FF (DDFF), and Dual-edge Implicit FF with an embedded Clock Gated Scheme (DIFF-CGS). Simulations are carried out in 32nm CMOS technology using T-SPICE at a nominal operating point of 500MHz clock frequency, 25°C temperature and 50% data activity. The results show that CFTPFF consumes the least average power, with reductions ranging from 27.94% to 57.45% relative to the other designs. CFTPFF also outperforms the other FFs in power dissipation at higher frequencies and varying data activities. DDFF is the fastest P-FF, with delay improvements ranging from 82.7% to 94%. In terms of power delay product (PDP), DDFF has the best optimal PDP among all the P-FFs whereas DIFF-CGS has the worst. KPFF and CFTPFF have lower area overhead than the rest of the P-FFs.

#### Keywords:

Flip Flop, CMOS Digital Circuit, Low Power, High-Speed, Pulse Triggered

## 1. INTRODUCTION

Flip-flops (FFs) are pivotal storage components used in a variety of digital CMOS circuits and microprocessors [1]. The industry is calling for low-power architectures to enable the expansion of chip capability. Modern digital architectures use many FF modules, in particular in shift registers, register files, first-in-first-out buffers and heavily pipelined designs. The widespread use of FFs in pipelining makes their power efficiency significant. In integrated circuits, power efficiency is becoming increasingly crucial [2].

Microprocessors and memory cells are not immune to this problem either. Additionally, the clock system, which consists of storage elements and clock distribution networks, is anticipated to account for nearly 50% of the system power [3]. The total power usage and the processor clock cycle are significantly affected by the power consumption and data-to-output (D-to-Q) delay of the FFs [4]. As a result, high-performance central processing units (CPUs) must include energy-efficient storage components. The FF therefore plays a crucial role in the overall system design in terms of chip area and power consumption.

For high-speed operation, the P-FF is regarded as a popular alternative to the traditional master slave FF. In addition to improving speed, its circuit simplicity helps reduce the power consumption of the clock tree system [5]. P-FFs have already been used to improve performance in a number of studies [6]-[24]. A P-FF consists of a clock pulse generator (PG) and a single latch structure.

The master slave FF (MS-FF) architecture requires two latch stages, a master latch and a slave latch, whereas a P-FF requires only one latch, which reduces the complexity of the FF design. If the clock pulse is sufficiently narrow, the P-FF behaves like an MS-FF with minimal timing overhead. With a single latch, time borrowing is possible through negative setup time. P-FFs are also less sensitive to clock jitter. Additionally, their straightforward design minimizes area overhead and power loss.

Depending on how the clock PG is implemented, P-FFs can be categorized as either implicit or explicit [22]. In explicit types, the PG frequently contains a delay chain, which considerably increases clock signal power. Several P-FFs can share the same PG, but pulse width control is trickier in that situation. In implicit types, the PG is formed by controlling the clock discharge path. Longer discharge paths, however, are a common drawback of this design and might compromise performance. For both P-FF types, balancing performance improvement against power reduction is difficult.
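The delay-chain idea behind an explicit PG can be illustrated with a short behavioral sketch. It assumes the common arrangement in which the clock is ANDed with a delayed, inverted copy of itself; the function name `explicit_pulse` and the delay expressed in samples are hypothetical and chosen only for illustration. This is a discrete-time model, not a circuit-level simulation of any of the designs reviewed here.

```python
def explicit_pulse(clk, delay):
    """Return pulse[t] = clk[t] AND NOT clk[t - delay] for a sampled clock.

    clk   -- list of 0/1 samples of the clock
    delay -- latency of the delay chain in samples (hypothetical parameter)
    Samples before t = 0 are assumed to be 0.
    """
    pulse = []
    for t, level in enumerate(clk):
        delayed = clk[t - delay] if t >= delay else 0
        pulse.append(level & (1 - delayed))
    return pulse


# Two clock periods, 8 samples per half-period.
clk = ([0] * 8 + [1] * 8) * 2
pulse = explicit_pulse(clk, delay=3)

# A pulse appears only after each rising edge and lasts `delay` samples.
print(pulse)
```

In this model the pulse width equals the delay, which is why a longer delay chain widens the transparency window while also adding clocked circuitry.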

First, the implicit P-FF (iP-FF) is frequently regarded as more energy-efficient than the explicit P-FF (eP-FF), since it only needs to manage the clock discharge branches whereas the latter must also generate a separate pulse. Second, the pulse generator of an eP-FF can be shared by nearby FFs, which distributes the overhead power of the pulse-generating stage more evenly across all FFs [25].

To avoid pulse distortion, a gated clock technique should be used in explicit types, but the gating tasks of the various latches should be comparable and the generated pulse must be physically close to its latches. The capacitive load of the PG must also be taken into account when the pulse is delivered from the clock signal to the latches [26]. Some of the capabilities of an iP-FF can significantly mitigate these issues. The gated clock blocks unwanted clock transitions as and when required. The generated signal usually consists of pulses at input data transitions, and the technique is known as embedded gated clock [27]. In such cases, the overhead must be kept to a minimum because each flip flop has its own gating logic.

In this work, six recent state-of-the-art pulse triggered flip flops are rigorously studied for power consumption, delay constraints, and area overhead, and an extensive and detailed comparison is carried out. Section 2 discusses the P-FFs in depth. Section 3 presents the simulation results and a discussion of which flip flop is best suited to which use in digital CMOS circuits.

## 2. REVIEW OF STATE-OF-ART P-FF

Over the last decade, researchers have proposed several flip flop designs based on MS-FF and P-FF architectures. This work discusses recent state-of-the-art pulse triggered FF architectures. The first example, shown in Fig.1, is the Conditional Pulse Enhancement P-FF (CPEPFF) proposed by Hwang in [19]. To address the issues with traditional P-FF designs, this design uses two strategies. The first is to place fewer NMOS transistors in the discharge path. The second is a technique that conditionally increases the pull-down strength when the input data is "1". In contrast to the transistor stacking architecture of traditional FFs, transistor N4 is not included in the discharging path. Instead, transistor N4 forms a 2-input pass transistor logic based AND gate with transistor N5 and controls the discharge of transistor N6. The output node X is usually held at "0" since the two inputs to the AND logic are typically complementary (except while the clock is transitioning). During the falling edges (high-to-low) of the clock signal (CLK), when both input signals are equal to "0", node X floats temporarily, which is essentially harmless. At the rising edges of the CLK, transistors N4 and N5 work together to turn ON transistor N6 for a period of time determined by the delay inverter INV2. The smaller voltage swing might lead to a reduction in the switching power at node X. The discharge control signal is driven by two NMOS transistors (N4 and N5) in parallel, which speeds up pulse generation in contrast to conventional designs where a single transistor drives the discharge control signal. This design decision reduces the number of stacked transistors along the discharge path and allows for smaller transistors N1-N5.



Fig.1. Architecture of CPEPFF [19]

The longest discharge path is formed when both the input data and the QB output are "1". Transistor P3 is used to improve the discharge under this condition. Node Y is normally pulled high, which keeps transistor P3 OFF most of the time. P3 intervenes only when node Y discharges below Vdd. The generated pulse is then longer, which strengthens the pull-down of transistor N6. After the rising edge of the clock, the delay inverter INV2 forces node X back to zero through transistor N5 to shut off the discharge path. As the voltage level of node Y rises, transistor P3 eventually turns OFF. The action of P3 thus widens the generated discharging pulse. This means that a large delay inverter, which accounts for the majority of the power dissipation in PG logic, is not necessary to generate a pulse wide enough for correct data capture. It should be emphasized that this conditional pulse enhancement takes effect only when the FF output undergoes a data change from 0 to 1. Compared to other designs that employ an indiscriminate pulse-width enhancement technique, this design offers improved power performance. Another advantage of the conditional pulse enhancement scheme is the decrease in leakage power that results from the smaller transistors in the delay inverter and the critical discharge path.

The Signal Feed-through P-FF (SFTPFF) proposed in [20] and shown in Fig.2 adopts, as its name suggests, a technique known as "signal feed-through" to improve delay. To prevent unnecessary switching at the internal nodes, the design uses a conditional discharge technique and a static latch structure. However, it differs from comparable designs in three key respects that result in a special true single-phase clock (TSPC) latch structure. First, transistor P1 is used as a weak pull-up transistor whose gate is always connected to ground, so it conducts all the time. This results in a pseudo-NMOS logic style, and the keeper circuit at node Y can be dispensed with. This strategy not only simplifies the circuit but also lowers the load capacitance of node X. Second, a pass transistor N2 driven by the pulse clock (CLKP) is included so that the input data can drive node Q of the latch directly. This additional path helps drive the input signal to the output node Q, together with transistor P2, which acts as the pull-up transistor of the TSPC latch's second stage inverter. The node level can thus be pulled up quickly, which reduces the data transition time. Third, the inverter's pull-down network is removed entirely, and the pass transistor N2 instead provides the path through which the output discharges completely. N2 therefore has a dual purpose: it drives node Q more strongly during a "0" to "1" input transition and discharges the output Q during a "1" to "0" transition. An NMOS pass transistor is the additional component included in this design to support signal feed-through. This scheme improves the "0" to "1" delay, thereby minimizing the effects of the fall time and rise time delays.

If there is no data transition with the arrival of a clock pulse, i.e., the output Q and the input signal D are equal, current will flow via the pass transistor N4, which in turn prevents the input stage from being driven. The pull-down path of node Y is not ON because the input signal D and the output feedback signal QF take complementary signal levels. Therefore, none of the internal nodes switch. In contrast, node Y is discharged in response to a "0" to "1" data transition, activating transistor P2, which subsequently pulls node Q high. The discharge path conducting only for the duration of a pulse is the worst-case scenario for FF timing. Nevertheless, the delay can be significantly reduced by the signal feed-through technique, which obtains a boost from transistor N4. It must be noted that the input signal is burdened with charging and discharging, but only for a very short time, as transistor N4 conducts for short durations. For a "1" to "0" data transition, transistor N4 is also activated by the clock pulse, and the output is discharged through this path. Unlike in the "0" to "1" transition, the input signal is solely responsible for discharging. Since N4 is only activated for a brief period, the loading effect on the input signal is minimal. In particular, no changes to the transistor sizes are necessary to speed up this discharge since it does not lie on the critical path. Additionally, because of the keeper circuit at output Q, the input source is relieved of the discharging duty once the state of the keeper logic is reversed.



Fig.2. Architecture of SFTPFF [20]

Another example of a pulse triggered FF is Karimi's P-FF (KPFF), proposed in [21] and shown in Fig.3. The architecture is again built on the signal feed-through method. Transistor P2 is controlled by the pulse generator, the input data, and the input signal together with the transmission gate. The PG employs just four transistors (P1, N2-N4), which lowers its power consumption compared to other PGs. The PG in [20] employs a number of inverters to produce the delay, whereas this design uses an improved inverter. When the CLK is "0", transistor P1 is ON while transistors N2, N3 and N4 are all OFF, so transistor P1 determines the state of node X. At the rising edge of the CLK, transistor N4 turns ON immediately, followed shortly by transistors N2 and N3. Transistors N2 and N3 then act as pull-down transistors that help discharge node X. As a result, CLKP is discharged to "0". However, because of the delay of transistors N2 and N3, a pulse sufficient to drive the other transistors is created at node CLKP. Transistors N2 and N3 should also be sized smaller than N4 to ensure that a minimum clock pulse passes through N4 before node X is discharged. Additionally, if the output of the FF does not change upon arrival of a clock pulse, there has been no transition and the current would just flow through N6. This is because Q and data D have identical values, so the input data needs no power at all to drive node Q. When a "0" to "1" transition takes place, the output Q goes high and node Y is discharged. The leakage power is improved in both sleep and active modes as a result of the transmission gates used at the input. Accordingly, transistor N7 is turned OFF when nodes Q and Y are switched, which helps reduce additional and needless leakage power. These changes have the potential to decrease power loss, increase the speed of operation, and above all reduce leakage power. The improvements made to the input data control signal to reduce leakage power also increase the speed of operation, particularly for a high to low transition, because of the secondary discharge through N1. As soon as the output value changes from "1" to "0", transistor N1 is additionally switched ON, providing node Q with an additional discharge path and speeding up its discharge. The modification to the pulse generator simplifies the PG circuit, which uses fewer transistors, occupies less area, and draws less power. The last change is made to transistor P2. This pull-up transistor is controlled by a number of signals; coupling transistor P2 to the input data shortens its ON-state duration and thereby greatly decreases power consumption.



Fig.3. Architecture of KPFF [21]

Pan in [22] proposed a P-FF known as the Conditional Feed-through P-FF (CFTPFF), shown in Fig.4. Because there is a trade-off between speed and power consumption, traditional P-FFs are primarily designed for either power dissipation or delay constraints, with minimal emphasis on energy efficiency. Unstable pull-down and pull-up paths in typical architectures produce a longer low to high delay, in particular a longer D-to-Q latency, which lowers the circuit's energy efficiency. As already noted for the design in [20], a clock PG that requires a delay chain, and therefore consumes excessive power, is frequently used in such designs. Additionally, needless switching takes place within the architecture when the input changes, which consumes extra dc power. Each of these factors reduces the circuit's efficiency. This design tackles two efficiency problems concurrently. First, the number of stacked transistors in the pull-down path is decreased, which enhances speed. Second, extra pull-down and pull-up paths (P4, N6) are introduced. The discharge path and feedthrough techniques, along with transistor reordering, greatly lower the D-to-Q delay. In contrast to conventional feedthrough methods, a transmission gate is used instead of a pass transistor. Because the transmission gate passes the signal without threshold loss, feedthrough efficiency increases, the output voltage step during a low to high transition is removed, and the delay is decreased. Output feedback is used to control the transmission gate, which prevents needless turn-on and further conserves power. To reduce the charge sharing induced by the feedthrough transistor, a second discharge path is also incorporated into the design. An output feedback keeper and a static latch are used to prevent repeated precharging and wasteful internal node switching.



Fig.4. Architecture of CFTPFF [22]

The TSPC-based P-FF structure offers four key improvements over previous designs. Firstly, in contrast to normal P-FF discharge paths, the clock-controlled discharge transistor (N1) is connected to the data-controlled discharge transistor (N2), which is placed near the ground terminal. The predischarge resulting from this rearrangement of the stacked discharge transistors shortens the D-to-Q time for both 0-to-1 and 1-to-0 output transitions. Secondly, an output-controlled transmission gate (TG, N3 and N4) is added so that the input signal can be sent straight to the output. Thirdly, transistors N5 and N6 are introduced to improve the driving capability of the design. Lastly, a shared width-programmable PG and a clock mesh architecture are adopted to reduce the clock power and area.

Fig.5 shows another example of a P-FF, the dual dynamic node hybrid FF (DDFF) proposed by Absel in [23]. Node X is pseudo dynamic whereas node Y is purely dynamic, with a weakly driven inverter acting as the keeper circuit. A mechanism for unconditional cutoff is provided. The flip flop operates in two distinct phases: the precharge phase, when CLK is "0", and the evaluation phase, when CLK is "1". In the evaluation phase, the actual latching takes place during the 1-1 overlap of CLKB and CLK. If D is high prior to this overlap period, node X is discharged through transistors N1-N3. The cross-coupled inverter pair INV2 and INV3 changes state as a result, causing node XB to go high and output QB to discharge through N5. For the remainder of the evaluation period, during which no latching occurs, the inverter pair INV2 and INV3 maintains the low level at node X. Thus, the PMOS transistor P2 keeps node Y high during the evaluation period.



Fig.5. Architecture of DDFF [23]

In the precharge phase, when the CLK goes low, node X is pulled high through P1, changing the state of INV2 and INV1. Node Y stores its charge dynamically during this time because no transistor is driving it. The outputs at nodes QB and Q keep their voltage levels through INV4 and INV5. If D is low just before the overlap period, node X remains at "1" and node Y is pulled to "0" by N4 as the CLK goes to "1". As a result, N5 is held OFF and node QB is charged high by P3. As the CLK goes low at the end of the evaluation phase, node X remains high while node Y stores the charge dynamically. The design exhibits negative setup time: because of the brief transparency window defined by the 1-1 overlap of CLKB and CLK, the data can be sampled even after the clock changes from low to high, as long as CLKB has not yet gone low. Charge sharing occurs at node X when the clock transitions from "0" to "1" while D is "0". This causes a brief voltage drop at node X, but because the inverter pair INV2 and INV3 is skewed appropriately, its switching threshold lies far below the lowest voltage that node X reaches under worst-case charge sharing. Node Y maintains its charge level, as demonstrated in the timing diagram, even though no transistor is driving it during the precharge period. It should be noted that the delay between nodes X and XB causes the brief pull down at node Y while sampling a "1". The setup time and hold time of an FF are the minimum times before and after the CLK edge, respectively, during which the input must be stable for correct sampling. Here, the CLK overlap duration affects the setup and hold times. The conditional shutoff method is reliable in general. By skewing the NAND gate and the inverters in the conditional shutoff circuit, it is easy to produce narrower sampling windows. Although this approach can reduce hold time requirements, it results in a larger precharge node capacitance and may therefore lead to high power dissipation. As a result, the unconditional cutoff employed in this architecture offers a simple and power saving solution at the expense of a marginally more complex design procedure. The switching threshold of INV2 and INV3 determines the worst-case hold time. A shorter overlap period and a higher switching threshold lead to a lower hold time requirement.

In [24], Geng proposed an implicit P-FF, shown in Fig.6, known as "Dual-edge iP-FF with an embedded clock-gating scheme (DIFF-CGS)". Two components make up the DIFF-CGS schematic: a static latch and the implicit pulse generation stage with a built-in gated clock mechanism. The adaptive clocking inverter chain of the DIFF-CGS has a control circuit that has the capacity to evaluate and suppress redundant delayed clock signals. This control circuit is used to perform the clock-gating scheme. This design uses a transmission gate logic (TGL) based comparator to achieve the clock-gating technique, considerably enhancing the device's resilience.



Fig.6. Architecture of DIFF-CGS [24]

It should be noted that the implicit pulse feature of the design places the latch physically close to its pulse, so pulse distortion can be avoided. This makes it easier to preserve the clock's shape when the signal is delivered to the flip flop. *CLK*, *CLKc*, *CLKa* and *CLKd* are shared at the latch end to ensure the effectiveness of "double-edge clock triggering" in an implicit context. The benefit of this sharing arrangement is that fewer clocked transistors are used, which significantly reduces power consumption. Since all unneeded pulses are suppressed, there are no redundant transitions at internal node X, unlike other P-FF latches, which require the conditional discharge approach. Instead of a keeper circuit, a weak PMOS transistor P3 with its gate connected to ground is used. During the evaluation period, when the input changes from low to high, only a small short circuit current flows because the discharge path stays ON for a very short time, so very little short circuit power is dissipated. The discharge path is then turned off by *CLKc* or *CLKd*. Additionally, the output keeper (cross-coupled inverters) provides feedback to the implicit pulse generation stage as well as protection against direct coupling noise.

The operating principle of the FF is as follows. When the input and output differ, the output Y of the TGL based comparator goes high, which switches PG transistors N2 and N3 ON and transistor P2 OFF. The clocked transistors N6 and N7 are then controlled by the required delayed and inverted clock signals (CLKc and CLKd). On the 0-1 transition of the CLK, CLKc briefly goes to logic "1", turning ON transistors N6 and N8 (as CLK is also high). On the 1-0 transition of the CLK, transistors N7 and N9 briefly switch ON. As a result, whenever either clock branch conducts, the FF is in an evaluation phase. If the input signal D is high or changes from 0-1, node X is pulled to ground potential via either the N6 and N8 branch or the N7 and N9 branch, which subsequently drives node Q high via transistor P4. If the input data signal D is low or changes from 1-0, P4 is OFF and node Q is driven to ground potential via transistor N5 and either one of the clock branches. If the input and output are the same, the comparator output is zero, turning transistors N2 and N3 OFF. P2 is turned ON, node Z goes high and CLKd goes low. As a result, a significant amount of power is saved since the clocked transistors N6 and N7 are switched off by CLKc and CLKd, and the state of the flip flop stays constant until the input changes again. If the input D remains constant, the clock PG is deactivated and the delayed clock signals are suppressed, which results in less needless charging and discharging of the clocked transistors. Node X is kept at logic "1" in this situation because both clock branches are turned off, preventing redundant transitions. Additionally, the embedded gated clock approach reduces the size of the PG chain, improving the design's power and delay performance and saving layout area. As a result, DIFF-CGS exhibits a low-power characteristic when data activity is low.
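The gating behaviour described above can be summarised in a small behavioural sketch. It assumes one data sample is presented at every clock edge (dual-edge triggering) and models the comparator as an XOR of input and output; the function name `simulate_gated_dual_edge` is hypothetical, and the model ignores all transistor-level timing. It only illustrates that internal pulses are produced on edges where D and Q differ.

```python
def simulate_gated_dual_edge(data_per_edge, q_init=0):
    """Behavioural model of an embedded clock-gating scheme.

    data_per_edge -- 0/1 value of D at each successive clock edge
    Returns the captured outputs and the number of edges on which the
    comparator enabled an internal pulse.
    """
    q = q_init
    outputs = []
    pulses = 0
    for d in data_per_edge:
        y = d ^ q          # comparator: high only when input and output differ
        if y:              # pulse generator enabled, latch evaluates
            pulses += 1
            q = d
        outputs.append(q)  # otherwise the pulse is suppressed and Q holds
    return outputs, pulses


data = [1, 1, 1, 0, 0, 1, 1, 1]
outputs, pulses = simulate_gated_dual_edge(data)
print(outputs, pulses)
```

For this eight-edge input the output follows the data while only three internal pulses are generated, which is the sense in which low data activity translates into fewer clocked transitions.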

## 3. SIMULATION RESULTS, PERFORMANCE ANALYSIS AND DISCUSSIONS

The pulse triggered flip flops discussed in Section 2 are simulated using T-Spice at a 32nm CMOS technology node. The nominal operating conditions are a 500MHz clock frequency and a temperature of 25°C with 50% data activity. The data word length used is 16 bits and the operating supply voltage is 1.3V.

| Flip Flop                              | CPEPFF [19] | SFTPFF [20] | KPFF [21] | CFTPFF [22] | DDFF [23] | DIFF-CGS [24] |
|----------------------------------------|-------------|-------------|-----------|-------------|-----------|---------------|
| Avg. power at nominal conditions (µW)  | 7.51        | 7.72        | 6.3       | 4.54        | 7.6       | 10.67         |
| RMS power at nominal conditions (µW)   | 51.4        | 80.9        | 43.98     | 45.74       | 79.48     | 81.69         |
| CLK-Q delay at nominal conditions (ps) | 9.96        | 18.44       | 20.36     | 29.65       | 1.72      | 18.33         |
| Optimal PDP (aJ)                       | 74.8        | 142.4       | 128.8     | 134.6       | 13.07     | 195.6         |
| No. of transistors                     | 19          | 24          | 17        | 18          | 18        | 31            |
| Sum of widths (µm)                     | 7.77        | 8.8         | 4.98      | 4.16        | 8.4       | 10.92         |

Table.1. Performance comparison of various FF designs

Table.1 shows the power, delay, PDP and area results of all the FFs considered in this study at nominal operating conditions. Pan's CFTPFF [22] consumes the least average power, followed by KPFF [21], whereas DIFF-CGS [24] is the least power efficient. In terms of RMS power, KPFF beats CFTPFF marginally. Among all the flip flops, DDFF [23] has the fastest speed of operation whereas CFTPFF is the slowest. KPFF and CFTPFF are also the two most area efficient flip flops.
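The percentage improvements quoted in the abstract can be checked directly against the Table.1 values. The sketch below assumes each percentage is computed relative to the value of the design being compared against; the helper name `reductions` is hypothetical.

```python
# Values copied from Table.1 (nominal conditions).
avg_power_uw = {
    "CPEPFF": 7.51, "SFTPFF": 7.72, "KPFF": 6.3,
    "CFTPFF": 4.54, "DDFF": 7.6, "DIFF-CGS": 10.67,
}
clk_q_delay_ps = {
    "CPEPFF": 9.96, "SFTPFF": 18.44, "KPFF": 20.36,
    "CFTPFF": 29.65, "DDFF": 1.72, "DIFF-CGS": 18.33,
}


def reductions(values, best):
    """Percentage by which `best` improves on every other design."""
    ref = values[best]
    return {
        name: 100.0 * (value - ref) / value
        for name, value in values.items()
        if name != best
    }


power_gain = reductions(avg_power_uw, "CFTPFF")
delay_gain = reductions(clk_q_delay_ps, "DDFF")

for label, gain in (("power", power_gain), ("delay", delay_gain)):
    print(label, round(min(gain.values()), 2), round(max(gain.values()), 2))
```

Under this assumption the smallest and largest power reductions of CFTPFF come out at about 27.94% (versus KPFF) and 57.45% (versus DIFF-CGS), and the delay improvements of DDFF at about 82.7% (versus CPEPFF) and 94.2% (versus CFTPFF).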

Table.2. Average power at variations in frequency




Fig.7. Average power at variations in data activity (a) at 100MHz, (b) at 500MHz, (c) at 1GHz, and (d) at 2GHz

To assess the power efficiency of these flip flops, four test patterns are used to simulate different conditions, one at each clock frequency of 100MHz, 500MHz, 1GHz and 2GHz. At each of these frequencies, the data activity is set to 100%, 75%, 50%, 25%, 12.5%, 0% (all data high) and 0% (all data low).
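The paper does not state how the input stimulus for each data activity was constructed. As one possible interpretation, the sketch below builds a 16-bit data sequence in which the fraction of clock cycles with a data toggle equals the requested activity; the function names `data_pattern` and `measured_activity` are hypothetical.

```python
from fractions import Fraction


def data_pattern(activity, length=16, start=0):
    """Build a 0/1 sequence whose toggle rate per cycle equals `activity`.

    An accumulator spreads the toggles evenly over the sequence.
    `start` is the level held before the first cycle (0 or 1).
    """
    step = Fraction(activity)
    acc = Fraction(0)
    level = start
    bits = []
    for _ in range(length):
        acc += step
        if acc >= 1:
            level ^= 1
            acc -= 1
        bits.append(level)
    return bits


def measured_activity(bits, start=0):
    """Fraction of cycles in which the data differs from the previous cycle."""
    prev = start
    toggles = 0
    for bit in bits:
        toggles += int(bit != prev)
        prev = bit
    return Fraction(toggles, len(bits))


for activity in (1, 0.75, 0.5, 0.25, 0.125, 0):
    bits = data_pattern(activity)
    print(activity, "".join(map(str, bits)), float(measured_activity(bits)))
```

The two 0% cases in the text correspond to `data_pattern(0, start=1)` (all data high) and `data_pattern(0, start=0)` (all data low).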

Table.2 shows the power results of all the flip flops at frequencies from 100MHz to 2GHz. P-FFs generally operate correctly at higher frequencies, and only some remain functional at lower frequencies: KPFF and CFTPFF are found to be non-functional at 100MHz whereas the other flip flops perform normally. Fig.7 gives the detailed power analysis at different data activities for the aforementioned clock frequencies. KPFF and CFTPFF are not tested at 100MHz since both are non-functional at this frequency. At the frequencies where it is functional, CFTPFF has a clear advantage in power dissipation over all the other FFs at the different data activities. At 100MHz CLK frequency, CPEPFF has better power dissipation than the others.

To assess the performance of the flip flops in terms of speed of operation, two test patterns are used. First, the CLK-Q delay is measured at five temperatures: 0°C, 25°C, 50°C, 75°C and 100°C. Second, the CLK-Q delay is measured over a clock frequency range of 100MHz-2GHz. Table.3 shows the CLK-Q delay of all FFs at the different temperatures. DDFF is the fastest pulse triggered flip flop at temperatures from 0°C to 75°C, whereas KPFF has slightly better speed at 100°C. CPEPFF is the second fastest pulse triggered flip flop. Fig.8 shows the speed of operation of the FFs at different frequencies under nominal operating conditions and showcases the speed advantage of DDFF over all the other P-FFs. The delay measurements over frequency also show CPEPFF to be the second fastest, although it is on par with SFTPFF at the maximum frequency.

Table.3. CLK-Q delay (in ps) at variations in temperature

| Temperature (°C) | 0     | 25    | 50    | 75    | 100   |
|------------------|-------|-------|-------|-------|-------|
| CPEPFF           | 8.81  | 9.96  | 10.3  | 11.93 | 13.12 |
| SFTPFF           | 18.08 | 18.44 | 11.36 | 12.07 | 23.5  |
| KPFF             | 15.7  | 20.36 | 21.07 | 21.96 | 10.4  |
| CFTPFF           | 23.19 | 29.65 | 29.6  | 27.73 | 28.65 |
| DDFF             | 0.62  | 1.72  | 3.18  | 3.93  | 13.6  |
| DIFF-CGS         | 16.21 | 18.33 | 20.25 | 21.65 | 24.06 |
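The ranking statements made about Table.3 can be verified with a few lines of code. The sketch below copies the delay values as printed in the table; the helper name `ranking_at` is hypothetical.

```python
temps_c = [0, 25, 50, 75, 100]

# CLK-Q delay in ps, copied as printed in Table.3.
delay_ps = {
    "CPEPFF":   [8.81, 9.96, 10.3, 11.93, 13.12],
    "SFTPFF":   [18.08, 18.44, 11.36, 12.07, 23.5],
    "KPFF":     [15.7, 20.36, 21.07, 21.96, 10.4],
    "CFTPFF":   [23.19, 29.65, 29.6, 27.73, 28.65],
    "DDFF":     [0.62, 1.72, 3.18, 3.93, 13.6],
    "DIFF-CGS": [16.21, 18.33, 20.25, 21.65, 24.06],
}


def ranking_at(index):
    """Design names sorted from fastest to slowest at one temperature."""
    return sorted(delay_ps, key=lambda name: delay_ps[name][index])


fastest = {t: ranking_at(i)[0] for i, t in enumerate(temps_c)}
runner_up = {t: ranking_at(i)[1] for i, t in enumerate(temps_c)}

print(fastest)
print(runner_up)
```

With the tabulated values, DDFF is fastest from 0°C to 75°C, KPFF is fastest at 100°C, and CPEPFF is second at every temperature.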



Fig.9. PDP (a) at variations in temperature (b) at variations in frequency

The overall performance of the flip flops can be assessed using the PDP. The lower the PDP, the better suited the flip flop is for use in a digital CMOS circuit. The same two test patterns used for speed of operation were used to calculate the PDP. Fig.9 shows the results of these tests and showcases the advantage of DDFF over all the other flip flops.
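How the "optimal PDP" in Table.1 was obtained is not stated. As a rough cross-check, the sketch below assumes the PDP is simply the product of the average power and the CLK-Q delay at nominal conditions; a power in µW multiplied by a delay in ps gives an energy in aJ. The function name `pdp_aj` is hypothetical.

```python
# Values copied from Table.1 (nominal conditions).
avg_power_uw = {
    "CPEPFF": 7.51, "SFTPFF": 7.72, "KPFF": 6.3,
    "CFTPFF": 4.54, "DDFF": 7.6, "DIFF-CGS": 10.67,
}
clk_q_delay_ps = {
    "CPEPFF": 9.96, "SFTPFF": 18.44, "KPFF": 20.36,
    "CFTPFF": 29.65, "DDFF": 1.72, "DIFF-CGS": 18.33,
}


def pdp_aj(power_uw, delay_ps):
    """Power delay product: 1 uW x 1 ps = 1e-18 J = 1 aJ."""
    return power_uw * delay_ps


pdp = {
    name: pdp_aj(avg_power_uw[name], clk_q_delay_ps[name])
    for name in avg_power_uw
}
best = min(pdp, key=pdp.get)
worst = max(pdp, key=pdp.get)

for name, value in sorted(pdp.items(), key=lambda item: item[1]):
    print(f"{name:9s} {value:7.2f} aJ")
```

Under this assumption the products agree with the tabulated optimal PDP to within rounding for most designs (the KPFF product is about 128.3 aJ against the 128.8 aJ reported), and the ordering matches the text: DDFF has the lowest PDP and DIFF-CGS the highest.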

## 4. CONCLUSION

The in-depth analysis of P-FFs shows that DDFF consumes more power when data is high and less when data is low. At 100MHz clock frequency, CPEPFF is recommended for low power digital circuits, whereas at all other frequencies CFTPFF consumes the least power. DIFF-CGS is not recommended for power-constrained circuits. CFTPFF, which consumes the least power, has the highest delay and is therefore recommended only for low power circuits and not for circuits where performance is a key parameter. For high speed applications and optimal designs, DDFF outperforms all other P-FFs. In terms of transistor count, the area requirement of DDFF is also on par with KPFF, if not the least.

## REFERENCES

- [1] I.A. Khan, O.A. Shah and M.T. Beg, "Analysis of Different Techniques for Low Power Single Edge Triggered Flip Flops", *Proceedings of World Congress on Information and Communication Technologies*, pp. 1363-1367, 2011.
- [2] O.A. Shah, I. Ahmed Khan, G. Nijhawan and I. Garg, "Low Transistor Count Storage Elements and their Performance Comparison", *Proceedings of International Conference on Advances in Computing, Communication Control and Networking*, pp. 801-805, 2018.
- [3] C.Y. Kim and H.C. Lee, "Low-Power, High-Sensitivity Readout Integrated Circuit with Clock-Gating, Double-Edge-Triggered Flip-Flop for Mid-Wavelength Infrared Focal-Plane Arrays", *IEEE Sensors Letters*, Vol. 3, No. 9, pp. 1-4, 2019.
- [4] S.K. Kim, T.W. Oh, S. Lim, D.H. Ko and S.O. Jung, "High-Performance and Area-Efficient Ferroelectric FET-Based Nonvolatile Flip-Flops", *IEEE Access*, Vol. 9, pp. 35549-35561, 2021.
- [5] A. Amirany, K. Jafari and M.H. Moaiyeri, "High-Performance Radiation-Hardened Spintronic Retention Latch and Flip-Flop for Highly Reliable Processors", *IEEE Transactions on Device and Materials Reliability*, Vol. 21, No. 2, pp. 215-223, 2021.
- [6] A. Karimi, A. Rezai and M.M. Hajhashemkhani, "Ultra-Low Power Pulse-Triggered CNTFET-Based Flip-Flop", *IEEE Transactions on Nanotechnology*, Vol. 18, pp. 756-761, 2019.
- [7] P.A. Meinerzhagen, "Min-Delay Margin/Error Detection and Correction for Flip-Flops and Pulsed Latches in 10-nm CMOS", *IEEE Solid-State Circuits Letters*, Vol. 2, No. 9, pp. 147-150, 2019.
- [8] S. Luo, C. Huang and Y. Chu, "An Adaptive Pulse-Triggered Flip-Flop for a High-Speed and Voltage-Scalable Standard Cell Library", *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 60, No. 10, pp. 677-681, 2013.

- [9] M.W. Phyu, K. Fu, W.L. Goh and K. Yeo, "Power-Efficient Explicit-Pulsed Dual-Edge Triggered Sense-Amplifier Flip-Flops", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 19, No. 1, pp. 1-9, 2011.
- [10] R. Islam and M.R. Guthaus, "Low-Power Clock Distribution using a Current-Pulsed Clocked Flip-Flop", *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 62, No. 4, pp. 1156-1164, 2015.
- [11] E. Consoli, G. Palumbo, J.M. Rabaey and M. Alioto, "Novel Class of Energy-Efficient Very High-Speed Conditional Push–Pull Pulsed Latches", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 22, No. 7, pp. 1593-1605, 2014.
- [12] Y. Chuang, S. Kim, Y. Shin and Y. Chang, "Pulsed-Latch Aware Placement for Timing-Integrity Optimization", *IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems*, Vol. 30, No. 12, pp. 1856-1869, 2011.
- [13] K.C. Woo, H.J. Kang and B.D. Yang, "Area-Efficient Bidirectional Shift-Register using Bidirectional Pulsed-Latches", *IEEE Transactions on Circuits and Systems II: Express Briefs*, Vol. 66, No. 8, pp. 1386-1390, 2019.
- [14] B. Yang, "Low-Power and Area-Efficient Shift Register using Pulsed Latches", *IEEE Transactions on Circuits and Systems I: Regular Papers*, Vol. 62, No. 6, pp. 1564-1571, 2015.
- [15] H. Jeong, J. Park, S.C. Song and S.O. Jung, "Self-Timed Pulsed Latch for Low-Voltage Operation with Reduced Hold Time", *IEEE Journal of Solid-State Circuits*, Vol. 54, No. 8, pp. 2304-2315, 2019.
- [16] H. Lin, Y. Chuang, Z. Yang and T. Ho, "Pulsed-Latch Utilization for Clock-Tree Power Optimization", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 22, No. 4, pp. 721-733, 2014.
- [17] W. Jin, S. Kim, W. He, Z. Mao and M. Seok, "Near- and Sub-$V_t$ Pipelines Based on Wide-Pulsed-Latch Design Techniques", *IEEE Journal of Solid-State Circuits*, Vol. 52, No. 9, pp. 2475-2487, 2017.
- [18] A. Yan, "Design of Double-Upset Recoverable and Transient-Pulse Filterable Latches for Low-Power and Low-Orbit Aerospace Applications", *IEEE Transactions on Aerospace and Electronic Systems*, Vol. 56, No. 5, pp. 3931-3940, 2020.

- [19] Y.T. Hwang, J.F. Lin and M.H. Sheu, "Low-Power Pulse-Triggered Flip-Flop Design with Conditional Pulse-Enhancement Scheme", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 20, No. 2, pp. 361-366, 2012.
- [20] J. Lin, "Low-Power Pulse-Triggered Flip-Flop Design Based on a Signal Feed-Through", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 22, No. 1, pp. 181-185, 2014.
- [21] A. Karimi, A. Rezai and M.M. Hajhashemkhani, "A Novel Design for Ultra-Low Power Pulse-Triggered D-flip-Flop with Optimized Leakage Power", *Integration The VLSI Journal*, Vol. 60, pp. 1-9, 2017.
- [22] D. Pan, C. Ma, L. Cheng and H. Min, "A Highly Efficient Conditional Feedthrough Pulsed Flip-Flop for High-Speed Applications", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 28, No. 1, pp. 243-251, 2020.
- [23] K. Absel, L. Manuel and R.K. Kavitha, "Low-Power Dual Dynamic Node Pulsed Hybrid Flip-Flop Featuring Efficient Embedded Logic", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 21, No. 9, pp. 1693-1704, 2013.
- [24] L. Geng, J.Z. Shen and C.Y. Xu, "Power-Efficient Dual-Edge Implicit Pulse-Triggered Flip-Flop with an Embedded Clock-Gating Scheme", *Frontiers of Information Technology and Electronic Engineering*, Vol. 17, pp. 962-972, 2016.
- [25] C.K. Teh, M. Hamada, T. Fujita, H. Hara, N. Ikumi and Y. Oowaki, "Conditional Data Mapping Flip-Flops for Low-Power and High-Performance Systems", *IEEE Transactions on Very Large Scale Integration (VLSI) Systems*, Vol. 14, No. 12, pp. 1379-1383, 2006.
- [26] S. Kim, I. Han, S. Paik and Y. Shin, "Pulser Gating: A Clock Gating of Pulsed-Latch Circuits", *Proceedings of Asia and South Pacific Conference on Design Automation*, pp. 190-195, 2011.
- [27] J. Shen, L. Geng and X. Wu, "Low Power Pulse-Triggered Flip-Flop based on Clock Triggering Edge Control Technique", *Journal of Circuits, Systems and Computers*, Vol. 24, No. 7, pp. 1-15, 2015.