Power optimization in configurable ALU using blend of techniques
Abstract—Reduction in power dissipation is an essential design issue in VLSI circuits. One of the important blocks in any processor is the Arithmetic Logic Unit (ALU), which performs arithmetic and logical operations. As the operations become more complex, power dissipation increases. The clock network is a major source of power dissipation, so a significant amount of power can be saved by gating the clock whenever it is not required. From the literature, we have noticed that several techniques are used to reduce power within the ALU, but their savings are moderate and there is still scope to reduce power using a blend of techniques. A low power ALU is therefore designed using clock gating techniques together with PIPO and the Booth's algorithm concept. By giving a specific opcode, we enable the specific operation while the other operations stay inactive, which reduces power dissipation in the ALU. The low power ALU has two 8-bit data inputs with cin, bin, enable and 2-bit shift data, and a 4:16 decoder that selects one of the 16 operations from a 4-bit opcode input with a start enable function. At each iteration the proposed design is implemented with one of these clock gating techniques, i.e., latch free clock gating, latch based clock gating, flip-flop based clock gating, and synthesis based clock gating, with parallel in parallel out (PIPO) shift registers. All these techniques are evaluated with the operation selection feature and PIPO shift registers at operating frequencies of 100 MHz, 200 MHz, 400 MHz, 500 MHz and 1 GHz on a Virtex-6 FPGA board (40 nm technology, 1 V) using the Xilinx ISE 14.4 tool. This paper mainly focuses on analyzing the dynamic power dissipation at various frequencies in the ALU with and without clock gating techniques combined with PIPO and Booth's algorithm.

Index Terms—Dynamic power, Register Transfer Level, clock gating, PIPO.

I. INTRODUCTION
Reduction in power dissipation is an essential design issue in VLSI circuits. A few decades back, designers mostly focused on optimizing area, delay and testability. As technology scales down, we see more leakage and power dissipation in the chip. In order to reduce power dissipation and leakage power while scaling, we need to adopt optimization techniques like clock gating, voltage scaling, etc.
Nowadays designers focus on four dimensions to build any application: area, delay, testability and power. For any application, consumers expect light weight, quick response and a device that does not get hot. For example, consumers expect a mobile phone to be light weight, support multiple operations and respond quickly. For multiple operations, we need to integrate multiple ICs into one chip. This causes more power consumption, makes the phone hot and also increases area. To reduce area, we scale down the technology, but power dissipation then increases and the chip may get hot. To maintain a cool system, quick response and small area, we need to go for low power techniques.
Fig. 1. Four dimensions to Optimize VLSI chip
Power is the combination of static power dissipation and dynamic power dissipation. Static power dissipation occurs when the circuit is in the off condition [1]. As technology scales down, static power dissipation becomes more and more important in terms of leakage power. Leakage power is due to drain induced barrier lowering (DIBL), channel punch through, the hot electron effect, reverse bias source/drain junction leakage, etc. Dynamic power dissipation occurs when the output capacitance at a particular node charges and discharges, and it also depends on the operating frequency. Dynamic power is dissipated only when signal transitions happen at that node; otherwise there is no dynamic power dissipation [2]. Switching activity, which is related to the clock signal, is mainly responsible for dynamic power consumption.
In our proposed work, we mainly focus on dynamic power dissipation, which is reduced by lowering the signal activity in the proposed design. The clock network is a major source of power dissipation, so a significant amount of power can be saved by gating the clock whenever it is not required. By using operation selection, we activate only the specific operation while the other operations stay inactive. The inactive operations then see no signal transitions, so dynamic power dissipation is lower when the operation selection instruction is used.
TABLE I
SUMMARY OF VARIOUS CLOCK GATING TECHNIQUES
Authorized licensed use limited to: Andhra University College of Engineering. Downloaded on February 01,2023 at 13:51:06 UTC from IEEE Xplore. Restrictions apply.
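The dependence of dynamic power on switching activity and clock frequency can be illustrated with the standard CMOS switching power model, P = alpha * C * Vdd^2 * f. The following is only a minimal sketch under stated assumptions: the function name, the capacitance and the two activity factors are hypothetical values chosen for illustration and are not measurements from this work. Only the 1 V supply and the frequency points are taken from the text.

```python
def dynamic_power_mw(alpha, c_load_farads, vdd_volts, freq_hz):
    """Standard CMOS switching power, P = alpha * C * Vdd^2 * f, in milliwatts."""
    return alpha * c_load_farads * vdd_volts ** 2 * freq_hz * 1e3


# Hypothetical values, chosen only for illustration (not taken from the paper).
C_EFF = 1e-9       # effective switched capacitance in farads (assumption)
VDD = 1.0          # supply voltage in volts (the text reports 1 V on Virtex-6)
ALPHA_ALL = 0.40   # activity factor when every functional unit toggles (assumption)
ALPHA_SEL = 0.05   # activity factor when only the selected unit toggles (assumption)

for freq in (100e6, 200e6, 400e6, 500e6, 1e9):
    p_all = dynamic_power_mw(ALPHA_ALL, C_EFF, VDD, freq)
    p_sel = dynamic_power_mw(ALPHA_SEL, C_EFF, VDD, freq)
    print(f"{freq / 1e6:6.0f} MHz  all units: {p_all:7.2f} mW  selected only: {p_sel:6.2f} mW")
```

In this model the power grows linearly with frequency, and lowering the activity factor (fewer toggling nodes) lowers the power by the same ratio at every frequency, which is the effect that operation selection and clock gating aim for.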
flip flop based designs. Following the master-slave concept, a flip-flop can be split into two latches. In this technique, a D flip-flop is used together with an AND gate.
Fig. 5. Flip flop based clock gating technique
Fig. 6. Synthesis based clock gating technique using negative latch
In synthesis based clock gating using a negative latch, when the enable signal is constant the X signal is one. The controlled clock then stays high, so the negative latch is not triggered and holds its previous value at the output. When the enable signal changes, X goes to zero and the controlled clock operates the negative latch, which provides a new value at the output. The gated clock signal is generated by ANDing the clock signal with the output signal of the negative latch.

III. PROPOSED ARCHITECTURE
The proposed work presents an 8-bit Arithmetic and Logic Unit that performs various arithmetic and logical operations. Here, the arithmetic operations are addition, subtraction, multiplication and division. The multiplication and division operations are designed using Booth's algorithm, and the logical operations are AND, NAND, OR, NOR, XOR, XNOR, arithmetic left and right shift, logical left and right shift, 1's complement and 2's complement. The shift operations are designed to shift by up to three bits.
Here, we designed the PIPO using a low power D flip-flop, and this flip-flop is built from a low power D latch using the master-slave concept. The data is sent to the operations through the PIPO. The ALU designs are simulated using the Xilinx 14.4 simulator and implemented on Virtex-6, and power analysis is performed using XPower Analyzer. The detailed implementation results are reported in the results section.
Table II lists the opcodes that enable each instruction so that the respective operation is performed in the design.
TABLE II
OPCODES TO ACTIVATE VARIOUS INSTRUCTIONS
Opcode  Operation                         Active Functional Unit
0000    Addition                          Arithmetic
0001    Subtraction                       Arithmetic
0010    Multiplication                    Arithmetic
0011    Division                          Arithmetic
0100    AND operation                     Logical
0101    NAND operation                    Logical
0110    OR operation                      Logical
0111    NOR operation                     Logical
1000    XOR operation                     Logical
1001    XNOR operation                    Logical
1010    Arithmetic Left shift operation   Logical
1011    Arithmetic Right shift operation  Logical
1100    Logical Left shift operation      Logical
1101    Logical Right shift operation     Logical
1110    1's complement                    Logical
1111    2's complement                    Logical
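The operation selection of Table II can be sketched as a behavioral software model. This is not the authors' RTL; it is a minimal illustration under several assumptions: the names `alu`, `booth_multiply` and `to_signed8` are hypothetical, results are truncated to 8 bits except for the 16-bit product, multiplication uses textbook radix-2 Booth recoding on signed operands, and division is shown as plain unsigned integer division because the text does not detail how Booth's algorithm is applied to it.

```python
MASK8 = 0xFF


def to_signed8(x):
    """Interpret an 8-bit pattern as a two's complement integer."""
    x &= MASK8
    return x - 256 if x & 0x80 else x


def booth_multiply(multiplicand, multiplier, bits=8):
    """Radix-2 Booth multiplication of two signed integers that fit in `bits` bits."""
    sign_a = 1 << bits                    # sign bit of the (bits + 1)-wide accumulator
    mask_a = (sign_a << 1) - 1
    m = multiplicand & mask_a             # sign-extended multiplicand M
    acc = 0                               # accumulator A
    q = multiplier & ((1 << bits) - 1)    # multiplier register Q
    q_prev = 0                            # extra bit Q(-1)
    for _ in range(bits):
        q0 = q & 1
        if q0 == 1 and q_prev == 0:       # start of a run of ones: A = A - M
            acc = (acc - m) & mask_a
        elif q0 == 0 and q_prev == 1:     # end of a run of ones: A = A + M
            acc = (acc + m) & mask_a
        # Arithmetic right shift of the combined A:Q:Q(-1) register.
        q_prev = q0
        q = (q >> 1) | ((acc & 1) << (bits - 1))
        acc = (acc >> 1) | (acc & sign_a)
    product = (acc << bits) | q
    if product & (1 << (2 * bits)):       # sign bit set: convert to a negative integer
        product -= 1 << (2 * bits + 1)
    return product


def alu(opcode, a, b, cin=0, bin_=0, shift=0):
    """Evaluate only the operation selected by the 4-bit opcode (Table II)."""
    a, b, shift = a & MASK8, b & MASK8, shift & 0x3   # 2-bit shift data: 0 to 3
    operations = {
        0b0000: lambda: (a + b + cin) & MASK8,
        0b0001: lambda: (a - b - bin_) & MASK8,
        0b0010: lambda: booth_multiply(to_signed8(a), to_signed8(b)) & 0xFFFF,
        0b0011: lambda: a // b,           # placeholder division (assumption); b == 0 raises
        0b0100: lambda: a & b,
        0b0101: lambda: ~(a & b) & MASK8,
        0b0110: lambda: a | b,
        0b0111: lambda: ~(a | b) & MASK8,
        0b1000: lambda: a ^ b,
        0b1001: lambda: ~(a ^ b) & MASK8,
        0b1010: lambda: (a << shift) & MASK8,
        0b1011: lambda: (to_signed8(a) >> shift) & MASK8,
        0b1100: lambda: (a << shift) & MASK8,
        0b1101: lambda: a >> shift,
        0b1110: lambda: ~a & MASK8,
        0b1111: lambda: -a & MASK8,
    }
    return operations[opcode & 0xF]()     # the other fifteen entries are never evaluated


print(alu(0b0000, 200, 100))              # 44 (300 truncated to 8 bits)
print(hex(alu(0b0010, 0xFD, 0x04)))       # 0xfff4 (-3 * 4 = -12 as a 16-bit pattern)
```

The dictionary of deferred functions mirrors the idea of the 4:16 decoder: only the entry selected by the opcode is evaluated, while the remaining operations stay idle.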
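The gating principle used in this work, where the gated clock is the AND of the clock with the output of a negative latch, can be checked with a small behavioral simulation on sampled waveforms. This is only a sketch under assumptions: the latch data input is driven directly by an enable signal (the X signal logic of the synthesis based scheme is not reproduced), the function names are hypothetical, and the enable pattern is invented for illustration.

```python
def gated_clock_trace(clk, enable):
    """Return the gated clock for sampled `clk` and `enable` waveforms.

    The negative latch is transparent while clk == 0 and holds its value
    while clk == 1; the gated clock is clk AND the latch output.
    """
    latch_q = 0
    gated = []
    for c, en in zip(clk, enable):
        if c == 0:                 # latch is transparent during the low phase
            latch_q = en
        gated.append(c & latch_q)  # AND gate on the clock path
    return gated


def rising_edges(wave):
    """Count the 0 -> 1 transitions in a sampled waveform."""
    return sum(1 for prev, cur in zip(wave, wave[1:]) if prev == 0 and cur == 1)


# Eight clock cycles, two samples per cycle (low phase, then high phase).
clk = [0, 1] * 8
# Hypothetical enable pattern; it drops during a high phase at sample 11.
enable = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0]

gated = gated_clock_trace(clk, enable)
print(gated)
print("clock edges seen without gating:", rising_edges(clk))
print("clock edges seen with gating:   ", rising_edges(gated))
```

With this pattern the register behind the gate sees three clock edges instead of eight. Because the latch holds its value while the clock is high, the change of enable at sample 11 does not cut the ongoing clock pulse short.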
IV. FLOW CHART FOR DESIGN FLOW
TABLE III
ALU WITHOUT CLOCK GATING
TABLE V
ALU WITH LATCH BASED CLOCK GATING
TABLE VI
ALU WITH FLIPFLOP BASED CLOCK GATING
Frequency  Static power       Dynamic power      Total power
(Hz)       dissipation (mW)   dissipation (mW)   dissipation (mW)
100M       1293.70            42.92              1336.63
200M       1294.65            85.85              1380.50
400M       1296.57            171.70             1468.27
500M       1297.53            214.62             1512.15
1G         1301.86            407.57             1709.43

TABLE VII
ALU WITH SYNTHESIS BASED CLOCK GATING (USING NEGATIVE LATCH)
Frequency  Static power       Dynamic power      Total power
(Hz)       dissipation (mW)   dissipation (mW)   dissipation (mW)
100M       1293.70            42.85              1336.55
200M       1294.65            85.70              1380.35
400M       1296.56            171.40             1467.96
500M       1297.52            214.24             1511.76
1G         1301.84            406.89             1708.73

From the above tables, we can see less power dissipation in the ALU when a clock gating technique is used. Among all the clock gating techniques, we observed the lowest power with the synthesis based clock gating technique (using negative latch). Compared to previous work [5], our design uses Booth's algorithm for multiplication and division and a low power PIPO. Booth's algorithm lowers the power of multiplication and division because it reduces the number of gates or partial products compared to a binary multiplier and binary divider. The PIPO is designed using a low power D flip-flop, and the data is sent through it to combinational modules instead of making all modules sequential. Sequential modules create more flip-flops, which increases power consumption. In our design, we added five more operations than the previous work and still obtained less power by implementing the PIPO and Booth's algorithm in the design.
From the bar graph also, we can see the power dissipation at various frequencies for each clock gating technique, and at each frequency the power dissipation is lowest when the synthesis based clock gating technique is used.

VI. CONCLUSION
From the experimental results, we have observed that power consumption at the RTL level increases in proportion to clock frequency and switching activity. Using Booth's algorithm optimized the circuitry required for multiplication and division. The low power PIPO and the clock gating techniques reduced a significant amount of signal activity. By incorporating all these techniques into our design, we obtained less power compared to previous work [5] even after adding five more operations. From our study we conclude that, among all the clock gating techniques, the synthesis based clock gating technique with PIPO and Booth's algorithm yields the lowest power.

REFERENCES
[1] B. Geetha, B. Padmavathi, and V. Perumal, “Design methodologies and circuit optimization techniques for low power CMOS VLSI design,” in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), IEEE, 2017, pp. 1759–1763.
[2] B. Padmavathi, B. Geetha, and K. Bhuvaneshwari, “Low power design techniques and implementation strategies adopted in VLSI circuits,” in 2017 IEEE International Conference on Power, Control, Signals and Instrumentation Engineering (ICPCSI), IEEE, 2017, pp. 1764–1767.
[3] U. Kaur and R. Mehra, “Low power CMOS counter using clock gated flip-flop,” Int. J. Eng. Adv. Tech, vol. 2, pp. 796–8, 2013.
[4] M. P. Dev, D. Baghel, B. Pandey, M. Pattanaik, and A. Shukla, “Clock gated low power sequential circuit design,” in 2013 IEEE Conference on Information & Communication Technologies, IEEE, 2013, pp. 440–444.
[5] G. Shrivastava and S. Singh, “Power optimization of sequential circuit based ALU using gated clock & pulse enable logic,” in 2014 International Conference on Computational Intelligence and Communication Networks, IEEE, 2014, pp. 1006–1010.
[6] R. N. A. Shiny, B. Fahimunnisha, S. Akilandeswari, and S. J. Venula, “Integration of clock gating and power gating in digital circuits,” in 2019 5th International Conference on Advanced Computing & Communication Systems (ICACCS), IEEE, 2019, pp. 704–707.
[7] N. Khanna and D. Mishra, “Clock gated 16-bits ALU design & implementation on FPGA,” in 2018 4th International Conference for Convergence in Technology (I2CT), IEEE, 2018, pp. 1–5.