# Near-threshold adiabatic SRAM based on CPAL circuits with DTCMOS technique

# Beibei Qi, Jianping Hu<sup>\*\*</sup>, Chenghao Han, Yeliang Geng

Faculty of Information Science and Technology, Ningbo University, Ningbo, 315211, China

Correspondence to this author at 818 Fenghua Road, Ningbo, China

Received 25 October 2014, www.cmnt.lv

# Abstract

An adiabatic SRAM (static random access memory) operating in the near-threshold region is realized for low-energy applications. It is based on CPAL (complementary pass-transistor adiabatic logic) circuits with the DTCMOS (dual-threshold CMOS) technique. The CPAL circuits recover the energy of the read driver, write driver, word-line decoder, and sense amplifier in a fully adiabatic manner. The DTCMOS technique effectively reduces the leakage energy consumption of the SRAM. In addition, the near-threshold technique not only greatly reduces dynamic energy consumption but also satisfies the requirements of mid-performance systems. The adiabatic storage cells are modelled and sized. The function and energy consumption of the SRAM are simulated with HSPICE in a SMIC 130 nm CMOS process. The simulation results show that the SRAM has correct logic function and low energy consumption.

Keywords: static random access memory, complementary pass-transistor adiabatic logic, dual-threshold CMOS, near-threshold, low-power design

#### **1** Introduction

The development of very large scale integrated (VLSI) circuits leads to smaller devices and higher transistor density, and power consumption has become a key problem in the development of integrated circuits. Therefore, limiting the energy consumption and raising the energy efficiency of a circuit need to be considered [1]. In many application systems, charging and discharging highly capacitive buses consumes a large amount of energy. The demand for low-power SRAM circuits is growing steadily [2, 3]. The low-power design of an SRAM mainly aims to lower the dynamic and leakage energy consumption.

Conventional CMOS integrated circuits employ a DC power source, so the electric energy is always converted irreversibly into heat. The energy consumption of conventional CMOS circuits is composed of the dynamic energy consumption caused by the charging and discharging of node capacitances, the short-circuit consumption caused by the switching of the gates, and the leakage consumption caused by the leakage current of MOS transistors. In conventional circuit design, the methods to reduce power dissipation include lowering the supply voltage, reducing the load capacitance, and reducing the switching activity of the gates.

Adiabatic circuits use an AC power source (power-clock) to return the electric charge of the node capacitances to the power source, and thus reuse the energy. The AC power source changes slowly between the low and high voltage levels to charge and discharge the node capacitances, so that the energy dissipated as heat is greatly reduced. An adiabatic logic family named CPAL (complementary pass-transistor adiabatic logic) works in a fully adiabatic manner on the output loads, since there is no non-adiabatic energy consumption on the output loads [4, 5].

The DTCMOS (dual-threshold CMOS) technique uses high-threshold transistors in the non-critical paths of the circuit to reduce the sub-threshold leakage current [6-8], and low-threshold transistors in the critical paths to guarantee the performance of the whole circuit. In addition, a simple and effective way to reduce energy consumption is to reduce the supply voltage. In recent years, the near-threshold technique [9-11] has been proposed. When the supply voltage is scaled from the nominal VDD down toward the threshold voltage (Vth), energy consumption is significantly reduced while the delay increases only moderately.

In this work, an adiabatic 32×32 SRAM based on CPAL circuits with the DTCMOS technique is realized. The proposed SRAM uses a non-overlapping four-phase sinusoidal power-clock. Energy recovery and the near-threshold technique reduce the dynamic energy consumption, and the DTCMOS technique reduces the leakage consumption. Except for the storage cell array, the blocks of the SRAM (the read driver circuit, the write driver circuit and the address decoder) use CPAL circuits. In the adiabatic SRAM, the word lines and bit lines of the storage cells are driven by adiabatic signals, which affects the read and write operations of the storage cell. The adiabatic storage cells are therefore modelled and sized [12].

The simulation results verify that the CPAL SRAM consumes less energy than the conventional SRAM, and that the adiabatic SRAM with the DTCMOS technique consumes less energy than the basic CPAL SRAM. The simulations also show that, for operating voltages from 0.9 V to 1.2 V, the SRAM has correct functionality and an appropriate operating frequency.

<sup>\*</sup> Corresponding author email: hujiangping2@nbu.edu.cn

#### 2 CPAL Circuits and Design of SRAM

# 2.1 CPAL CIRCUITS

The CPAL buffer/inverter is shown in Figure 1 (a). It has two main components: the logic assignment circuit and the energy recovery circuit. The energy recovery circuit is composed of a pair of transmission gates (N1, P1 and N2, P2). The logic assignment circuit is composed of N5-N8 in the form of CPL (complementary pass-transistor logic). Transistors N3 and N4 clamp the output node voltages. The CPAL buffer/inverter is driven by a single-phase power-clock with a sinusoidal waveform.

Cascaded CPAL gates are driven by the four-phase power-clocks shown in Figure 1 (c). To form a chain of logic circuits, a clocking rule must be followed: each power-clock lags the previous one by a 90° phase shift, which gives a fully pipelined operation. The buffer chain and its simulation waveform are shown in Figure 1 (b) and Figure 1 (d). The frequency of the power-clock is 100 MHz, and the peak voltage is 1.2 V. Complex gates can be realized by replacing N5-N8 with the corresponding CPL network. All basic gates have the same topology, which makes the circuits simple to design and to model. The energy consumption of the CPAL buffer circuit is:

$$E_{CPAL} = C_X (V_{DD} - V_{Tn}) V_{Tn} + \frac{1}{2} C_X (V_{DD} - V_{Tn})^2 + \frac{2RC_L}{T} C_L V_{DD}^2, \tag{1}$$

where ECPAL is the energy consumption of the CPAL buffer/inverter circuit, CX is the capacitance of node X, VDD is the DC source voltage, VTn is the threshold voltage of the NMOS transistor, R is the resistance of the transmission gate, CL is the load capacitance, and T is the operation cycle.
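
As a rough numerical illustration of Equation (1), the sketch below evaluates it for one CPAL buffer and compares the result with the $C_L V_{DD}^2$ that a conventional CMOS gate dissipates per cycle on the same load. It assumes that the last term of Equation (1) is an additive adiabatic loss $(2RC_L/T)\,C_L V_{DD}^2$. The component values (CX = 1 fF, CL = 10 fF, R = 5 kΩ) are hypothetical and are not taken from the SMIC 130 nm process; only T = 10 ns (100 MHz), VDD = 1.2 V and VTn of about 0.3 V come from the text.

```python
def cpal_buffer_energy(c_x, c_l, r, t, v_dd, v_tn):
    """Energy per cycle of a CPAL buffer, following Equation (1).

    Assumption: the three terms of Equation (1) are summed.
    """
    # Loss associated with the internal node X (independent of the period T).
    node_x_loss = c_x * (v_dd - v_tn) * v_tn + 0.5 * c_x * (v_dd - v_tn) ** 2
    # Adiabatic loss on the output load; it shrinks as T grows.
    load_loss = (2.0 * r * c_l / t) * c_l * v_dd ** 2
    return node_x_loss + load_loss


def conventional_cmos_energy(c_l, v_dd):
    """Energy dissipated per full charge/discharge cycle of a CMOS gate."""
    return c_l * v_dd ** 2


if __name__ == "__main__":
    # Hypothetical component values, chosen only for illustration.
    e_cpal = cpal_buffer_energy(c_x=1e-15, c_l=10e-15, r=5e3,
                                t=10e-9, v_dd=1.2, v_tn=0.3)
    e_cmos = conventional_cmos_energy(c_l=10e-15, v_dd=1.2)
    print(f"CPAL buffer: {e_cpal * 1e15:.3f} fJ per cycle")
    print(f"CMOS gate:   {e_cmos * 1e15:.3f} fJ per cycle")
```

Under these assumptions the load term scales with 1/T, whereas the node-X terms do not depend on the clock period.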

During t1-t2, clk1 rises from 0 to 1.2 V. N1 is on, so out1 rises following clk1. When the voltage of out1 exceeds VTn, N4 turns on and clamps out1b to zero. When the voltage difference between out1 and clk1 is less than VTp, P1 turns on, and the output out1 is charged to VDD. During t1-t2, the capacitance of node X makes the response of node X a pulse with the same phase as the power-clock clk1.

During t3-t4, clk1 falls from 1.2 V to 0. The electric charge stored on the output node is returned to the power source through the conducting transmission gate, which achieves energy recovery.
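
The four-phase clocking rule can be sketched numerically as follows. This is a minimal model that assumes each power-clock is a raised cosine swinging between 0 and the peak voltage. The text only states that the power-clocks are sinusoidal, run at 100 MHz with a 1.2 V peak, and are shifted by 90° from one phase to the next, so the exact waveform and the function name `power_clock` are illustrative.

```python
import math


def power_clock(t, phase_index, freq=100e6, v_peak=1.2):
    """Voltage of power-clock clk1..clk4 (phase_index 0..3) at time t.

    Assumed waveform: raised cosine between 0 V and v_peak, with each
    phase lagging the previous one by 90 degrees.
    """
    theta = 2.0 * math.pi * freq * t - phase_index * math.pi / 2.0
    return 0.5 * v_peak * (1.0 - math.cos(theta))


if __name__ == "__main__":
    period = 1.0 / 100e6
    # Sample the four clocks at eight points of one period.
    for step in range(8):
        t = step * period / 8
        values = [power_clock(t, k) for k in range(4)]
        print(f"t = {t * 1e9:5.2f} ns  " +
              "  ".join(f"clk{k + 1} = {v:.2f} V" for k, v in enumerate(values)))
```

Each gate in a chain is driven by the clock one phase later than its predecessor, so a stage evaluates its inputs while the preceding stage holds a valid output.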

Adiabatic circuits fall into two types: fully adiabatic logic and quasi-adiabatic logic. CPAL circuits work in a fully adiabatic manner, whereas PAL-2N and 2N-2N2P circuits are quasi-adiabatic. A fully adiabatic circuit consumes less energy than a quasi-adiabatic circuit because it has no non-adiabatic energy loss. The energy consumption of the different buffer chains is shown in Figure 2: the CPAL buffer chain, working in a fully adiabatic manner, consumes less energy than the PAL-2N, 2N-2N2P, and conventional CMOS buffer chains.





FIGURE 1 CPAL circuits: (a) buffer/inverter circuit, (b) buffer chain circuit, (c) non-overlapping four-phase power-clocks, (d) functional simulation waveform of the buffer chain circuit



FIGURE 2 Energy consumption of the CPAL, PAL-2N, 2N-2N2P, and conventional CMOS buffer chain circuits

# 2.2 DESIGN OF SRAM

The structure of the SRAM circuit is shown in Figure 3.




Several types of storage cell exist, such as 6T, 7T, 8T and 10T cells. An 8T storage cell, shown in Figure 4, is used for this SRAM design. The 8T cell eliminates the conflict between cell current and read stability found in the 6T cell. RWL is the read word line, WWL is the write word line, RBL is the read bit line, and WBL is the write bit line. Because the adiabatic driver circuits are dual-rail, the storage cell is also dual-rail. The cell consists of two back-to-back inverters and four transistors that connect it to the read/write bit lines under the control of the read/write word lines.



The adiabatic circuit uses an AC power-clock instead of a DC power source, which makes a great difference for the storage cell. The width-to-length ratios of the storage cell transistors are derived as follows.

#### 2.3 READ AND WRITE OPERATION

#### 2.3.1 Read operation

In the read process, RBL and RBLb are pre-charged high. The equivalent circuit during the read operation is shown in Figure 5. Suppose that 1.2 V is stored in the storage cell, so that VQ = 1.2 V and VQb = 0. From Figure 5, it is clear that when VQ = 1.2 V, P1, N2, N3 and N4 are on. RBLb discharges through N4 and N2, while RBL stays high. The relation between CR and $\Delta V$ is shown in Figure 6. The potential difference (PD) between RBL and RBLb is:

$$\Delta V = V_{RBL} - V_{RBLb} \,, \tag{2}$$

where $\Delta V$ is the potential difference between RBL and RBLb, VRBL is the voltage of RBL, and VRBLb is the voltage of RBLb.

RBLb has a large bit-line capacitance, in the pF range. The transistors should be close to the minimum size, but this makes the bit-line capacitance discharge slowly. As soon as a potential difference develops between RBL and RBLb, the sense amplifier starts to work. When RWL rises, the voltage VQb of the node between N4 and N2 is pulled up toward VDD. VQb must not rise too high; otherwise current flows through the P1-N1 inverter and can even flip the storage cell. To avoid this destructive read, the resistance of N4 must be larger than that of N2.



FIGURE 5 Simplified model of CPAL SRAM storage cell during read process, VQ=1.2, Vprecharge =VDD



FIGURE 6 Voltage rise $\Delta V$ as a function of the cell ratio CR

The boundary constraints on the device sizes can be derived by solving the current equation at the maximum allowed value of the voltage ripple $\Delta V$. Ignoring the body effect of N4 for simplicity, the current equation is:

$$K_{n,N4}\left[(clk - \Delta V - V_{Tn})V_{DSATn} - \frac{V_{DSATn}^2}{2}\right] = K_{n,N2}\left[(V_{DD} - V_{Tn})\Delta V - \frac{\Delta V^2}{2}\right], \tag{3}$$

where $CR = \frac{W_2 / L_2}{W_4 / L_4}$ is the cell ratio, clk is the power-clock of the CPAL circuit, VDSATn is the saturation voltage of the NMOS transistor, un is the carrier mobility of the NMOS transistor, Cox is the gate capacitance per unit area, Wn and Ln are the channel width and channel length of the NMOS transistor, W2/L2 is the width-to-length ratio of N2 in Figure 5, and W4/L4 is the width-to-length ratio of N4 in Figure 5. With the transconductance parameter $K_n = u_n C_{ox} W_n / L_n$, the ratio $K_{n,N2}/K_{n,N4}$ equals CR.
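
The read constraint can be explored numerically with the sketch below, which solves the current balance of Equation (3) for $\Delta V$ at a given cell ratio. It assumes the usual first-order device expressions, with a minus sign in front of both quadratic terms, and it evaluates the power-clock at its peak (clk = VDD) unless another value is passed. VDD = 1.2 V and VTn of about 0.3 V follow the text; VDSATn = 0.4 V is a hypothetical value rather than a SMIC 130 nm parameter.

```python
import math


def read_voltage_rise(cr, v_dd=1.2, v_tn=0.3, v_dsatn=0.4, clk=None):
    """Solve the read current balance (Equation (3)) for the ripple dV.

    Dividing by K_n,N4 and using K_n,N2 / K_n,N4 = CR gives the quadratic
        (CR/2) dV^2 - (CR (VDD - VTn) + VDSATn) dV
            + (clk - VTn) VDSATn - VDSATn^2 / 2 = 0,
    whose smaller root is the physical solution.
    """
    if clk is None:
        clk = v_dd  # assumption: power-clock at its peak value
    a = 0.5 * cr
    b = -(cr * (v_dd - v_tn) + v_dsatn)
    c = (clk - v_tn) * v_dsatn - 0.5 * v_dsatn ** 2
    disc = b * b - 4.0 * a * c
    return (-b - math.sqrt(disc)) / (2.0 * a)


if __name__ == "__main__":
    for cr in (1.0, 4.0 / 3.0, 2.0, 3.0):
        print(f"CR = {cr:.2f}  ->  dV = {read_voltage_rise(cr):.3f} V")
```

In this model a larger cell ratio gives a smaller voltage rise, which is the trend that the sizing rule for N2 and N4 relies on.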

#### 2.3.2 Write operation

In the write process, the equivalent circuit is shown in Figure 7. When VQ = 1.2 V, P1, N2, N5 and N6 are on. The relation between the cell voltage (VQ) and PR is shown in Figure 8. The external write data are converted into a complementary dual-rail signal and applied to WBL and WBLb.

When 0 is written to the storage cell, WBL is set to a high voltage and WBLb to a low voltage. If VQ is pulled below Vth (about 0.3 V) while WWL rises to VDD, the write reliability of the storage cell is guaranteed. Equation (5) expresses this condition:

$$K_{n,N5}\left[(clk - V_{Tn})V_{Q} - \frac{V_{Q}^{2}}{2}\right] = K_{p,P1}\left[(V_{DD} - V_{Tp})V_{DSATp} - \frac{V_{DSATp}^{2}}{2}\right], \tag{5}$$

$$V_{Q} = V_{DD} - V_{Tn} - \sqrt{(clk - V_{Tn})^2 - 2 \frac{u_{p}}{u_{n}} PR \left[(V_{DD} - V_{Tp})V_{DSATp} - \frac{V_{DSATp}^{2}}{2}\right]}, \tag{6}$$

where VQ is the cell voltage in Figure 7, VDSATp is the saturation voltage of the PMOS transistor, and up is the carrier mobility of the PMOS transistor.

PR (pull-up ratio) is defined as the size ratio between the PMOS pull-up and the NMOS pass transistor:

$$PR = \frac{W_1 / L_1}{W_5 / L_5}, \tag{7}$$

where W1/L1 is the width-to-length ratio of P1 in Figure 7, and W5/L5 is the width-to-length ratio of N5 in Figure 7.

When the PMOS pull-up and the NMOS pass transistor both use the minimum size, the above constraint is satisfied. The dependence of the cell voltage (VQ) on PR is plotted in Figure 8, from which it is concluded that the pull-up ratio must be less than 1.6.
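
The dependence of VQ on PR can be sketched by evaluating Equation (6) at the peak of the power-clock (clk = VDD). The device parameters below are hypothetical: the PMOS threshold and saturation voltages are entered as magnitudes (0.3 V and 0.5 V), and the mobility ratio up/un is taken as 0.5. They are not SMIC 130 nm values, so the bound computed here is illustrative and is not expected to reproduce the limit of 1.6 read from Figure 8.

```python
import math


def write_cell_voltage(pr, v_dd=1.2, v_tn=0.3, v_tp=0.3, v_dsatp=0.5,
                       mobility_ratio=0.5):
    """Cell voltage VQ during a write, from Equation (6) with clk = VDD.

    v_tp and v_dsatp are magnitudes (assumed values). Returns None when
    the pass transistor cannot balance the pull-up current.
    """
    drive = v_dd - v_tn
    pull_up = (v_dd - v_tp) * v_dsatp - 0.5 * v_dsatp ** 2
    disc = drive ** 2 - 2.0 * mobility_ratio * pr * pull_up
    if disc < 0.0:
        return None
    return drive - math.sqrt(disc)


def max_pull_up_ratio(v_limit=0.3, v_dd=1.2, v_tn=0.3, v_tp=0.3,
                      v_dsatp=0.5, mobility_ratio=0.5):
    """Largest PR for which VQ stays at or below v_limit."""
    drive = v_dd - v_tn
    pull_up = (v_dd - v_tp) * v_dsatp - 0.5 * v_dsatp ** 2
    return (drive ** 2 - (drive - v_limit) ** 2) / (2.0 * mobility_ratio * pull_up)


if __name__ == "__main__":
    for pr in (0.5, 1.0, 1.5):
        print(f"PR = {pr:.1f}  ->  VQ = {write_cell_voltage(pr):.3f} V")
    print(f"PR limit for VQ <= 0.3 V: {max_pull_up_ratio():.2f}")
```

The sketch shows the qualitative behaviour only: VQ grows with PR, so a reliable write sets an upper bound on the pull-up ratio.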





FIGURE 7 Simplified model of CPAL SRAM storage cell during write process, VQ=1.2



FIGURE 8 Cell voltage VQ as a function of the pull-up ratio PR

Based on the above analysis, the width-to-length ratios of the storage cell transistors are chosen as listed in Table 1:

TABLE 1 Transistor sizes of the storage cell in the 130 nm technology

| Transistor | W/L           |
|------------|---------------|
| P1         | 210 nm/130 nm |
| P2         | 210 nm/130 nm |
| N1         | 280 nm/130 nm |
| N2         | 280 nm/130 nm |
| N3         | 210 nm/130 nm |
| N4         | 210 nm/130 nm |
| N5         | 210 nm/130 nm |
| N6         | 210 nm/130 nm |
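
As a quick consistency check, the cell ratio and pull-up ratio implied by Table 1 can be computed directly from the listed sizes, using the definitions of CR (N2 over N4) and PR (P1 over N5) given earlier. The dictionary and helper names are illustrative.

```python
# Transistor sizes from Table 1 as (width_nm, length_nm).
SIZES_NM = {
    "P1": (210, 130), "P2": (210, 130),
    "N1": (280, 130), "N2": (280, 130),
    "N3": (210, 130), "N4": (210, 130),
    "N5": (210, 130), "N6": (210, 130),
}


def aspect_ratio(name):
    """Width-to-length ratio W/L of the named transistor."""
    width, length = SIZES_NM[name]
    return width / length


# CR = (W2/L2) / (W4/L4) and PR = (W1/L1) / (W5/L5).
cell_ratio = aspect_ratio("N2") / aspect_ratio("N4")
pull_up_ratio = aspect_ratio("P1") / aspect_ratio("N5")

if __name__ == "__main__":
    print(f"CR = {cell_ratio:.3f}")
    print(f"PR = {pull_up_ratio:.3f}")
```

The chosen sizes give a cell ratio of 4/3 and a pull-up ratio of 1, which lies below the bound of 1.6 stated for Figure 8.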

# 2.4 WORD-LINE DECODER AND READ/WRITE DRIVER CIRCUIT

Figure 9 shows the word-line decoder circuit. It is a two-stage decoder similar to the conventional word-line decoder, and it generates 32 word lines that control the timing of the read and write operations. The pre-decoder is a 5-bit address decoder that generates WL0...WL31. The second stage is composed of 32 two-input AND gates. The read word lines RWL0...RWL31 and the write word lines WWL0...WWL31 are generated by these AND gates from the read/write enable signals (RE, WE) and WL0...WL31. The write driver circuit is similar to the CPAL buffer shown in Figure 1 (a). The read driver circuit is composed of the sense amplifier circuit and a CPAL buffer circuit. The CPAL word-line decoder with the DTCMOS technique has the same structure as the basic CPAL word-line decoder.
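
The logical behaviour of the two-stage decoder can be sketched as below. This is a purely functional model that assumes each word line WLi is ANDed with RE to form RWLi and with WE to form WWLi. It ignores the dual-rail CPAL implementation and the power-clock timing, and the function name `decode_word_lines` is hypothetical.

```python
def decode_word_lines(address, read_enable, write_enable, n_bits=5):
    """Return (RWL, WWL) as lists of 0/1 for a given address.

    Stage 1: a one-hot pre-decoder generates WL0..WL(2**n_bits - 1).
    Stage 2: each WLi is gated with RE and WE (assumed interpretation).
    """
    n_words = 1 << n_bits
    if not 0 <= address < n_words:
        raise ValueError("address out of range")
    wl = [1 if i == address else 0 for i in range(n_words)]
    rwl = [w & int(bool(read_enable)) for w in wl]
    wwl = [w & int(bool(write_enable)) for w in wl]
    return rwl, wwl


if __name__ == "__main__":
    rwl, wwl = decode_word_lines(address=5, read_enable=True, write_enable=False)
    print("active read word lines: ", [i for i, v in enumerate(rwl) if v])
    print("active write word lines:", [i for i, v in enumerate(wwl) if v])
```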

The energy consumption of the CPAL word-line decoder and of the conventional CMOS word-line decoder is compared in Figure 10. Figure 10 shows that the CPAL word-line decoder consumes less energy than the conventional word-line decoder.

#### 3 Simulation result and near-threshold characteristics

An ideal four-phase power-clock is applied in the HSPICE simulation of the SRAM circuit. The functional simulation waveform is shown in Figure 11. When the word-line decoder is prepared for writing data, WWL goes high, and WD starts writing to the storage cell, generating WBL. When the word-line decoder is prepared for reading data, RWL goes high, and the sense amplifier reads the data from the potential difference between RBL and RBLb, generating RD. CELL is the voltage of the storage cell.

A low-power CPAL SRAM with the DTCMOS technique working in the near-threshold region is designed. Its near-threshold characteristics are analysed in Figure 12, which shows the energy consumption per cycle at the maximum frequency for different operating voltages.

From the analysis of the near-threshold characteristics of the SRAM circuit, it is concluded that for operating voltages of 0.9 V-1.1 V the SRAM operates at 75 MHz-200 MHz. The CPAL SRAM with the DTCMOS technique not only keeps correct timing but also reduces the energy consumption. The energy consumption is reduced by 62% at an operating voltage of 1.0 V.



FIGURE 10 Energy consumption of CPAL word-line decoder and conventional CMOS word-line decoder





FIGURE 11 Simulation waveform of SRAM circuit



FIGURE 12 Energy consumption per cycle at max frequency with different operation voltage of CPAL SRAM with DTCMOS technique

#### 4 Adiabatic SRAM Circuit with DTCMOS Technique

The DTCMOS technique uses high-threshold transistors on the non-critical paths of the circuit and low-threshold transistors on the critical paths. It reduces the sub-threshold leakage current, and hence the leakage consumption, by raising the threshold voltage of N3 and N4 in Figure 13. Apart from N3 and N4, which use high-threshold transistors, the CPAL buffer with the DTCMOS technique is similar to the basic CPAL buffer circuit.
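
The effect of a higher threshold voltage on sub-threshold leakage can be estimated with the first-order exponential model sketched below, in which the leakage current falls by one decade for every sub-threshold swing S of additional threshold voltage. The swing of 85 mV/decade and the threshold increments are hypothetical values for illustration, not parameters of the SMIC 130 nm process, and second-order effects such as drain-induced barrier lowering are ignored.

```python
def leakage_ratio(delta_vth, swing=0.085):
    """Ratio I_leak(high-Vth) / I_leak(low-Vth).

    delta_vth: threshold voltage increase in volts.
    swing: sub-threshold swing in volts per decade (assumed value).
    """
    return 10.0 ** (-delta_vth / swing)


if __name__ == "__main__":
    for delta_mv in (50, 100, 150):
        ratio = leakage_ratio(delta_mv / 1000.0)
        print(f"Vth raised by {delta_mv} mV: leakage scaled by {ratio:.3f}")
```

Because the dependence is exponential, even a modest threshold increase on N3 and N4 can remove a large share of their sub-threshold leakage in this model.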



FIGURE 13 CPAL buffer with DTCMOS technique

Applying the DTCMOS technique effectively reduces the leakage current and therefore the leakage energy consumption. The structure of the CPAL SRAM with the DTCMOS technique is similar to that of the basic CPAL SRAM circuit. As shown in Figure 14, the CPAL SRAM with DTCMOS consumes 14% less energy than the basic CPAL SRAM at a frequency of 100 MHz.



FIGURE 14 Energy consumption of the basic CPAL SRAM circuit, the conventional CMOS SRAM circuit, the conventional CMOS SRAM circuit with DTCMOS technique, and the CPAL SRAM circuit with DTCMOS technique

#### **5** Conclusions

A low-power SRAM circuit is designed with CPAL circuits, which work in a fully adiabatic manner. Applying the near-threshold technique to the SRAM circuit reduces its energy consumption. The design of the SRAM circuit based on CPAL circuits is simple and easy to realize because of the regular topology of the CPAL gates. The DTCMOS technique is applied to the SRAM circuit to reduce leakage consumption. The width-to-length ratios of the storage cell transistors are derived for driver circuits that use an AC power supply. HSPICE simulations of the conventional CMOS SRAM, the conventional CMOS SRAM with the DTCMOS technique, the CPAL SRAM, and the CPAL SRAM with the DTCMOS technique demonstrate the advantage of the designed circuits.

#### Acknowledgments

This work was supported by the Key Program of National Natural Science of China (No. 61131001), and National Natural Science Foundation of China (No. 61271137).

#### References

- [1] Auth C, Allen C, Blattner A, Bergstrom D, et al Proceedings of 2012 Symposium on VLSI Technology (VLSIT) 2012 131-132
- [2] Bellerimath P S, Banakar R M International Journal of Current Engineering and Technology 2013 288-292
- [3] Yamaoka M New York: Springer 2013 59-85
- [4] Jianping H, Tiefeng X, Hong L IEICE Transactions on Information and Systems 2005 E88-D(7) 1479-1485
- [5] Patatia B, Arora N, Singh B P Proceedings of International Conference on Emerging Trends in Networks and Computer Communications (ETNCC) 2011 244-247
- [6] Yangbo W, Jiaguo Zh, Jianping H Lecture Notes in Electrical Engineering 2011 87(2) 209-215
- [7] Anderson B A, Bryant A, Clark Jr W F U.S. Patent 7937675 May 2011
- [8] Wenpin T, Shihwei W, Shihhsu H Proceedings of IEEE International Symposium on Circuits and Systems (ISCAS) 2012 349-352
- [9] Markovic D, Wang C C, Alarcon L P Proceedings of the IEEE 2010 98(2) 237-252
- [10] Yangbo W, Jianping H Journal of Low Power Electronics 2011 7(3) 393-402
- [11] Chen G, Sylvester D, Blaauw D IEEE Transactions on Very Large Scale Integration (VLSI) Systems 2010 18(11) 1590-1598
- [12] Rabaey J M, Chandrakasan A, Nikolic B Prentice Hall 2002

