2021 2nd International Conference for Emerging Technology (INCET)

Belgaum, India. May 21-23, 2021

Power optimization in configurable ALU using blend of techniques
Mallikarjuna Vatte
Department of Electronics, NIT Surat
2021 2nd International Conference for Emerging Technology (INCET) | 978-1-7281-7029-9/20/$31.00 ©2021 IEEE | DOI: 10.1109/INCET51464.2021.9456346

[email protected]

Abstract—Reduction in power dissipation is an essential design issue in VLSI circuits. One of the important blocks in any processor is the Arithmetic Logic Unit (ALU), which performs arithmetic and logical operations. The more numerous and complex the operations, the higher the power dissipation. The clock network is a major source of power dissipation, so a significant amount of power can be saved by gating the clock whenever it is not required. From the literature, we noticed that several techniques have been used to reduce power within the ALU; the reported reductions are moderate, and there is still scope to reduce power using a blend of techniques. A low power ALU is therefore designed using clock gating techniques together with PIPO registers and Booth's algorithm. A specific opcode enables a specific operation while the other operations remain inactive, which lowers power dissipation in the ALU. The low power ALU has two 8-bit data inputs with cin, bin, enable and 2-bit shift data, and a 4:16 decoder that selects among the 16 operations from a 4-bit opcode input with a start enable function. At each iteration the proposed design is implemented with one of the following clock gating techniques: latch free, latch based, flip-flop based, and synthesis based clock gating, each with parallel in parallel out (PIPO) shift registers. All of these techniques are evaluated with the operation selection feature and PIPO shift registers at operating frequencies of 100 MHz, 200 MHz, 400 MHz, 500 MHz and 1 GHz on a Virtex-6 FPGA (40 nm technology, 1 V) using the Xilinx ISE 14.4 tool. This paper mainly analyzes the dynamic power dissipation of the ALU at various frequencies, with and without clock gating techniques combined with PIPO and Booth's algorithm.

Index Terms—Dynamic power, Register Transfer Level, clock gating, PIPO.

I. INTRODUCTION

Reduction in power dissipation is an essential design issue in VLSI circuits. A few decades ago, designers mostly focused on optimizing area, delay and testability. As technology scales down, leakage and power dissipation in the chip increase. To reduce power dissipation and leakage power while scaling, we need to adopt optimization techniques such as clock gating and voltage scaling.

Nowadays designers focus on four dimensions when building any application: area, delay, testability and power. For any application, consumers expect light weight, fast response and no overheating. For example, consumers expect a mobile phone to be light, to support multiple operations and to respond quickly. Supporting multiple operations requires integrating multiple ICs into one chip. This causes more power consumption, makes the phone hot, and increases area. To reduce area we scale down the technology, but power dissipation then increases and the chip may get hot. To keep the system cool and responsive in a small area, we need low power techniques.

Fig. 1. Four dimensions to Optimize VLSI chip

Power is the combination of static power dissipation and dynamic power dissipation. Static power dissipation occurs when the circuit is in the off condition [1]. As technology scales down, static power dissipation becomes more and more important in terms of leakage power. Leakage power is due to drain induced barrier lowering (DIBL), channel punch-through, the hot electron effect, reverse-biased source/drain junction leakage, etc. Dynamic power dissipation occurs when the output capacitance at a particular node charges and discharges, and it depends on the operating frequency. Dynamic power is dissipated only when signal transitions happen at that node [2]. Switching activity, which is related to the clock signal, is mainly responsible for dynamic power consumption.

In our proposed work, we mainly focus on dynamic power dissipation, which is reduced by lowering signal activity in the proposed design. The clock network is a major source of power dissipation, so a significant amount of power can be saved by gating the clock whenever it is not required. By using operation selection, only the specific operation is made active while the other operations remain inactive; no signal transitions occur in the inactive operations, so dynamic power dissipation is lower when the operation selection instruction is used.

II. RELATED WORK

Dynamic power reduction can be done using different techniques at different levels. According to reference [3], clock gating is applied to a D latch. The D latch is active only when its input and output differ; otherwise the D latch is inactive and the output maintains its previous value.

TABLE I
SUMMARY OF VARIOUS CLOCK GATING TECHNIQUES

Techniques of clock gating   Basic Building Block                  Advantage                             Disadvantages
Latch free                   Basic AND gate                        Simple gated clock logic              Glitches
Latch based clock gating     Level sensitive latch with AND gate   Glitch free, easy to implement        Not testable
Flip flop based              Flip flop with AND gate               Testability is easily implemented     More power
Synthesis based              Combination of gates with latch       Minimum power consumption among all   Area constraints
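The glitch entries in Table I can be illustrated with a small behavioural model. The sketch below is not the paper's circuit: it assumes a discretized clock (4 time steps per half period), a hypothetical enable signal that changes in the middle of a high clock phase, and a common textbook latch arrangement in which the latch is transparent while the clock is low. All function and variable names are illustrative.

```python
HALF_PERIOD = 4  # assumed number of time steps per clock half period


def clock(n_steps, half_period=HALF_PERIOD):
    """Square wave: low for half_period steps, then high for half_period steps."""
    return [(t // half_period) % 2 for t in range(n_steps)]


def latch_free_gate(clk, en):
    """Latch free gating: the gated clock is simply clk AND enable."""
    return [c & e for c, e in zip(clk, en)]


def latch_based_gate(clk, en):
    """Latch based gating (assumed arrangement): the enable passes through a
    latch that is transparent while clk is low and holds while clk is high;
    the latch output is then ANDed with clk."""
    gated, held = [], 0
    for c, e in zip(clk, en):
        if c == 0:          # latch transparent: follow the enable
            held = e
        gated.append(c & held)
    return gated


def high_pulse_widths(signal):
    """Lengths of consecutive runs of 1s, in time steps."""
    widths, run = [], 0
    for s in signal:
        if s:
            run += 1
        elif run:
            widths.append(run)
            run = 0
    if run:
        widths.append(run)
    return widths


CLK = clock(32)
# Hypothetical enable: rises at t=6 and falls at t=22, both inside a high phase.
EN = [1 if 6 <= t < 22 else 0 for t in range(32)]

print(high_pulse_widths(CLK))                        # [4, 4, 4, 4]
print(high_pulse_widths(latch_free_gate(CLK, EN)))   # [2, 4, 2]
print(high_pulse_widths(latch_based_gate(CLK, EN)))  # [4, 4]
```

In this model the latch free gate produces two truncated pulses (width 2 instead of 4), which is the kind of glitch the table refers to, while the latch based gate only ever passes full-width clock pulses.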

Fig. 2. Low power D latch

According to reference [4], the authors analyzed a clock gated D flip-flop, designed a 16-bit register, and observed the dynamic power dissipation at various frequencies on a Virtex-6 FPGA.

According to reference [5], the authors focused on dynamic power reduction with two techniques: a flip-flop based clock gated ALU module and a negative latch based clock gated ALU module. Both techniques restrict clock signal transitions and therefore consume less power. Their ALU performed AND, OR, NAND, NOR, XNOR, 1's complement, 2's complement, addition, subtraction, multiplication and division operations. They observed less dynamic power dissipation in the negative latch based clock gated ALU compared to the flip-flop based clock gated ALU.

According to reference [6], the authors focused on reducing both dynamic and static power dissipation. To reduce dynamic power dissipation, a D flip-flop based clock gating technique is used; to reduce static power dissipation, the leakage control transistor (LECTOR) technique is applied to the AND gate that follows the D flip-flop in the clock gating circuit. The clock gating and LECTOR techniques are applied to a PIPO register in that paper.

According to reference [7], the authors mainly focused on dynamic power dissipation using various clock gating strategies: latch free, latch based, flip-flop based and synthesis based clock gating in a 16-bit ALU. Their ALU performed addition, subtraction, increment, decrement, NOR, XOR, XNOR, AND, NAND, OR, complement, shift left and shift right operations, and they analyzed the dynamic power at various frequencies for each clock gating technique. They observed lower dynamic power dissipation when the synthesis based clock gating technique was used with the ALU.

A. Latch free clock gating technique:

In this technique, a single two-input basic gate, such as AND or NOR, is used for clock gating.

Fig. 3. Latch free clock gating technique

The main drawback of this technique is that if the enable signal is not synchronized with the clock, glitches appear in the gated clock signal.

B. Latch based clock gating technique:

It is implemented to solve the glitches that arise in the latch free design when the enable signal is not synchronized with the clock. Here the enable signal is controlled by the latch.

Fig. 4. Latch based clock gating technique

If a positive latch is used, the enable is latched in the high half cycle of the system clock. The gated clock goes high when the system clock and the latch output are both high; otherwise it goes to zero. That means the gated clock cannot be generated during the sleep period of the system clock.

C. Flip flop based clock gating technique:

In many applications, latch based designs are moved to
flip flop based designs. By splitting a flip-flop, we obtain two latches according to the master-slave principle. In this technique, a D flip-flop is used with an AND gate.

Fig. 5. Flip flop based clock gating technique

As shown in Fig. 5, the gated clock goes high when the flip-flop output and the clock are both high; otherwise the gated clock goes to zero. That means when the clock is in sleep mode, the gated clock is also zero.

D. Synthesis based clock gating technique:

In this technique, the gated clock is generated using either a positive or a negative latch in combination with AND, OR, EXOR or EXNOR logic gates, as shown in Fig. 6.

Fig. 6. Synthesis based clock gating technique using negative latch

In synthesis based clock gating using a negative latch, when the enable signal is constant the X signal is one, and as a result the controlled clock is high, so the negative latch does not operate and holds its previous value at the output. When the enable signal changes, X goes to zero, the controlled clock operates the negative latch, and the latch provides a different value at the output. The gated clock signal is generated by ANDing the clock signal with the output of the negative latch.

III. PROPOSED ARCHITECTURE

The proposed work presents an 8-bit Arithmetic and Logic Unit that performs various arithmetic and logical operations. The arithmetic operations are addition, subtraction, multiplication and division. The multiplication and division operations are designed using Booth's algorithm, and the logical operations are AND, NAND, OR, NOR, XOR, XNOR, arithmetic left and right shift, logical left and right shift, 1's complement and 2's complement. The shift operations are designed to shift by up to three bits.

The PIPO register is designed using a low power D flip-flop, which in turn is built from the low power D latch using the master-slave concept. Data is sent to the operations through the PIPO register. The ALU designs are simulated using the Xilinx ISE 14.4 simulator and implemented on Virtex-6, and power analysis is performed using XPower Analyzer. The detailed implementation results are reported in the results section.

Table II lists the opcodes that enable each instruction so that the respective operation is performed in the design.

Fig. 7. ALU with clock gating and operation selection

TABLE II
OPCODES FOR ACTIVATING VARIOUS INSTRUCTIONS

Opcode   Operation                          Active Functional Unit
0000     Addition                           Arithmetic
0001     Subtraction                        Arithmetic
0010     Multiplication                     Arithmetic
0011     Division                           Arithmetic
0100     AND operation                      Logical
0101     NAND operation                     Logical
0110     OR operation                       Logical
0111     NOR operation                      Logical
1000     XOR operation                      Logical
1001     XNOR operation                     Logical
1010     Arithmetic Left shift operation    Logical
1011     Arithmetic Right shift operation   Logical
1100     Logical Left shift operation       Logical
1101     Logical Right shift operation      Logical
1110     1's complement                     Logical
1111     2's complement                     Logical
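The operation selection idea behind Table II can be sketched as a behavioural model in which a 4-bit opcode picks exactly one unit and the other fifteen are never evaluated. This is only a software analogy, not the paper's Verilog design. Several details are assumptions made for illustration: the cin and bin inputs are omitted, results are truncated to 8 bits (16 bits for multiplication), division by zero returns 0, arithmetic left shift is treated the same as logical left shift, and all names (`OPERATIONS`, `alu`, `_asr8`) are hypothetical.

```python
MASK8 = 0xFF


def _asr8(a, n):
    """Arithmetic right shift of an 8-bit value: the sign bit is replicated."""
    signed = a - 0x100 if a & 0x80 else a
    return (signed >> n) & MASK8


# Opcode -> (name, function of a, b, shift amount). Opcodes follow Table II.
OPERATIONS = {
    0b0000: ("ADD",  lambda a, b, s: a + b),
    0b0001: ("SUB",  lambda a, b, s: a - b),
    0b0010: ("MUL",  lambda a, b, s: a * b),
    0b0011: ("DIV",  lambda a, b, s: a // b if b else 0),  # assumed: x/0 -> 0
    0b0100: ("AND",  lambda a, b, s: a & b),
    0b0101: ("NAND", lambda a, b, s: ~(a & b)),
    0b0110: ("OR",   lambda a, b, s: a | b),
    0b0111: ("NOR",  lambda a, b, s: ~(a | b)),
    0b1000: ("XOR",  lambda a, b, s: a ^ b),
    0b1001: ("XNOR", lambda a, b, s: ~(a ^ b)),
    0b1010: ("ASL",  lambda a, b, s: a << s),
    0b1011: ("ASR",  lambda a, b, s: _asr8(a, s)),
    0b1100: ("LSL",  lambda a, b, s: a << s),
    0b1101: ("LSR",  lambda a, b, s: a >> s),
    0b1110: ("NOT",  lambda a, b, s: ~a),   # 1's complement
    0b1111: ("NEG",  lambda a, b, s: -a),   # 2's complement
}


def alu(opcode, a, b=0, shift=0, enable=True):
    """Evaluate only the unit selected by the opcode; with enable low,
    no unit is evaluated at all."""
    if not enable:
        return None
    name, unit = OPERATIONS[opcode & 0xF]
    width_mask = 0xFFFF if name == "MUL" else MASK8
    # Operands are 8 bits wide; the shift amount is 2 bits (0 to 3).
    return unit(a & MASK8, b & MASK8, shift & 0x3) & width_mask


print(alu(0b0000, 200, 100))        # 44 (300 truncated to 8 bits)
print(alu(0b0101, 0xF0, 0x3C))      # 207 (0xCF)
print(alu(0b1011, 0x80, shift=1))   # 192 (0xC0)
```

In hardware the analogous effect is that the inputs and clocks of the unselected units do not toggle, which is where the dynamic power saving described in the text would come from.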

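The paper states that multiplication is designed using Booth's algorithm but does not describe the implementation. As a rough illustration only, the sketch below shows generic radix-2 Booth recoding for signed 8-bit operands: the multiplier is rewritten with digits -1, 0 and +1, and only the nonzero digits contribute a partial product. The function names are hypothetical, and division is not covered.

```python
def booth_recode(multiplier, bits=8):
    """Return the radix-2 Booth digits (-1, 0, +1), least significant first."""
    q = multiplier & ((1 << bits) - 1)   # two's complement bit pattern
    digits, prev = [], 0                 # prev models the appended bit q[-1] = 0
    for i in range(bits):
        cur = (q >> i) & 1
        digits.append(prev - cur)        # pair 01 -> +1, 10 -> -1, 00/11 -> 0
        prev = cur
    return digits


def booth_multiply(multiplicand, multiplier, bits=8):
    """Multiply two signed integers by summing the nonzero Booth partial
    products. Returns (product, number of nonzero partial products)."""
    digits = booth_recode(multiplier, bits)
    product = sum(d * (multiplicand << i) for i, d in enumerate(digits))
    nonzero = sum(1 for d in digits if d)
    return product, nonzero


print(booth_recode(30))          # [0, -1, 0, 0, 0, 1, 0, 0]
print(booth_multiply(7, 30))     # (210, 2)
print(booth_multiply(-5, -3))    # (15, 3)
```

The multiplier 30 (binary 00011110) has four 1 bits, so a plain shift-and-add multiplier would add four partial products, whereas the recoded form needs two. The saving depends on the operand: long runs of 1s benefit most, while alternating bit patterns can need more additions than the plain form.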
IV. FLOW CHART FOR DESIGN FLOW

Fig. 8. Design Flow

V. SIMULATION RESULTS AND DISCUSSION

The ALU is designed with and without clock gating techniques. These designs are implemented in Verilog and simulated using the Xilinx ISE 14.4 simulator. The simulated waveforms of the ALU variants are shown in Fig. 9 and Fig. 10.

Fig. 9. ALU with clock gating technique

Fig. 10. ALU without clock gating

The above simulations are performed at various frequencies on Virtex-6, and the power dissipation of the ALU is calculated using XPower Analyzer. The results are shown in the tables below.

TABLE III
ALU WITHOUT CLOCK GATING

Frequency (Hz)   Static power dissipation (mW)   Dynamic power dissipation (mW)   Total power dissipation (mW)
100M             1293.78                         46.32                            1340.10
200M             1294.81                         92.64                            1387.45
400M             1296.87                         185.28                           1482.15
500M             1297.91                         231.60                           1529.51
1G               1302.71                         445.54                           1748.25

TABLE IV
ALU WITH LATCH FREE CLOCK GATING

Frequency (Hz)   Static power dissipation (mW)   Dynamic power dissipation (mW)   Total power dissipation (mW)
100M             1293.70                         42.88                            1336.88
200M             1294.65                         85.76                            1380.42
400M             1296.56                         171.53                           1468.09
500M             1297.52                         214.41                           1511.93
1G               1301.85                         407.36                           1709.21

TABLE V
ALU WITH LATCH BASED CLOCK GATING

Frequency (Hz)   Static power dissipation (mW)   Dynamic power dissipation (mW)   Total power dissipation (mW)
100M             1293.77                         45.84                            1339.61
200M             1294.78                         91.68                            1386.47
400M             1296.83                         183.36                           1480.19
500M             1297.85                         229.20                           1527.06
1G               1302.53                         437.70                           1740.23
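The trends in the reported numbers can be checked with simple arithmetic. The sketch below copies the dynamic power columns from Tables III to VII and computes the percentage saving of each gating technique relative to the ungated ALU, as well as how dynamic power scales relative to its 100 MHz value. The function names are illustrative. For context, the standard CMOS switching power relation (P_dyn proportional to activity, capacitance, supply voltage squared and frequency) predicts linear scaling with frequency; that relation is general background, not something derived in the paper.

```python
FREQS_MHZ = [100, 200, 400, 500, 1000]

# Dynamic power in mW, copied from Tables III to VII.
DYNAMIC_MW = {
    "no gating":       [46.32, 92.64, 185.28, 231.60, 445.54],
    "latch free":      [42.88, 85.76, 171.53, 214.41, 407.36],
    "latch based":     [45.84, 91.68, 183.36, 229.20, 437.70],
    "flip-flop based": [42.92, 85.85, 171.70, 214.62, 407.57],
    "synthesis based": [42.85, 85.70, 171.40, 214.24, 406.89],
}


def saving_percent(technique):
    """Dynamic power saving versus the ungated ALU, in percent per frequency."""
    base = DYNAMIC_MW["no gating"]
    return [round(100 * (b - p) / b, 2) for b, p in zip(base, DYNAMIC_MW[technique])]


def scaling_vs_100mhz(technique):
    """Dynamic power at each frequency divided by its value at 100 MHz."""
    p = DYNAMIC_MW[technique]
    return [round(x / p[0], 2) for x in p]


for name in DYNAMIC_MW:
    if name != "no gating":
        print(f"{name:16s}", saving_percent(name))

print(scaling_vs_100mhz("no gating"))   # [1.0, 2.0, 4.0, 5.0, 9.62]
```

With these figures the ungated dynamic power scales linearly up to 500 MHz, and the 1 GHz value is about 9.62 times the 100 MHz value rather than 10 times. The synthesis based technique saves roughly 7.5% of dynamic power at 100 MHz and has the lowest dynamic power at every listed frequency.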

TABLE VI
ALU WITH FLIPFLOP BASED CLOCK GATING

Frequency (Hz)   Static power dissipation (mW)   Dynamic power dissipation (mW)   Total power dissipation (mW)
100M             1293.70                         42.92                            1336.63
200M             1294.65                         85.85                            1380.50
400M             1296.57                         171.70                           1468.27
500M             1297.53                         214.62                           1512.15
1G               1301.86                         407.57                           1709.43

TABLE VII
ALU WITH SYNTHESIS BASED CLOCK GATING (USING NEGATIVE LATCH)

Frequency (Hz)   Static power dissipation (mW)   Dynamic power dissipation (mW)   Total power dissipation (mW)
100M             1293.70                         42.85                            1336.55
200M             1294.65                         85.70                            1380.35
400M             1296.56                         171.40                           1467.96
500M             1297.52                         214.24                           1511.76
1G               1301.84                         406.89                           1708.73

From the above tables, we can see lower power dissipation in the ALU when a clock gating technique is used. Among all the clock gating techniques, we observed the lowest power with the synthesis based clock gating technique (using negative latch). Compared to previous work [5], our design uses Booth's algorithm for multiplication and division and a low power PIPO. Using Booth's algorithm for multiplication and division lowers power because it reduces the number of gates or partial products compared to a binary multiplier and binary divider. The PIPO is designed using a low power D flip-flop, and data is sent through it to combinational modules instead of making all modules sequential. Sequential modules create more flip-flops, which increases power consumption. Our design adds five more operations than the previous work, yet it still consumes less power owing to the PIPO and Booth's algorithm.

Fig. 11. Bar graph showing configurable ALU power vs frequency

The bar graph also shows the power dissipation at various frequencies for each clock gating technique; at each frequency, power dissipation is lowest when the synthesis based clock gating technique is used.

VI. CONCLUSION

From the experimental results, we have observed that power consumption at the RTL level increases in proportion to clock frequency and switching activity. Using Booth's algorithm optimized the circuitry required for multiplication and division. The low power PIPO and clock gating techniques significantly reduced signal activity. By incorporating all these techniques into our design, we obtained lower power than previous work [5] even after adding five more operations. From our study we conclude that, among all clock gating techniques, the synthesis based clock gating technique with PIPO and Booth's algorithm yields the lowest power.

REFERENCES

[1] B. Geetha, B. Padmavathi, and V. Perumal, “Design methodologies and circuit optimization techniques for low power CMOS VLSI design,” in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), IEEE, 2017, pp. 1759–1763.
[2] B. Padmavathi, B. Geetha, and K. Bhuvaneshwari, “Low power design techniques and implementation strategies adopted in VLSI circuits,” in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), IEEE, 2017, pp. 1764–1767.
[3] U. Kaur and R. Mehra, “Low power CMOS counter using clock gated flip-flop,” Int. J. Eng. Adv. Tech, vol. 2, pp. 796–8, 2013.
[4] M. P. Dev, D. Baghel, B. Pandey, M. Pattanaik, and A. Shukla, “Clock gated low power sequential circuit design,” in 2013 IEEE Conference on Information & Communication Technologies, IEEE, 2013, pp. 440–444.
[5] G. Shrivastava and S. Singh, “Power optimization of sequential circuit based ALU using gated clock & pulse enable logic,” in 2014 International Conference on Computational Intelligence and Communication Networks, IEEE, 2014, pp. 1006–1010.
[6] R. N. A. Shiny, B. Fahimunnisha, S. Akilandeswari, and S. J. Venula, “Integration of clock gating and power gating in digital circuits,” in 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS), IEEE, 2019, pp. 704–707.
[7] N. Khanna and D. Mishra, “Clock gated 16-bits ALU design & implementation on FPGA,” in 2018 4th International Conference for Convergence in Technology (I2CT), IEEE, 2018, pp. 1–5.
