Adders
Fall 2008 EE 5323 - VLSI Design I - © Kia Bazargan 2
Introduction
• Arithmetic unit
Bit sliced data path – adder, multiplier, shifter,
comparator, etc.
• Memory
RAM, ROM, buffers, shift registers
• Control
Finite state machine (PLA, random logic)
Counters
• Interconnect
Switches, arbiters, bus
Motivation
• Arithmetic units are, among other blocks, at the core of every datapath and
addressing unit.
• Data path is at the core of
microprocessors (CPU)
signal processors (DSP)
data processing application specific IC’s (ASIC) and
programmable IC’s (FPGA)
• Standard arithmetic units available from libraries
• Design of arithmetic units necessary for
non-standard operations
high performance components
library development
Why Adders?
• Addition: a fundamental operation
Basic block of most arithmetic operations
Address calculation
• Faster, faster and faster
• How?
Architectural level optimization
Gate-level optimization
Speed/area trade-off
• Single-bit Addition
• Carry-Ripple Adder
• Carry-Skip Adder
• Carry-Lookahead Adder
• Carry-Select Adder
• Carry-Increment Adder
• Tree Adder etc
Adding Two One-bit Operands
• One-bit Half Adder:
    Sum  = A ⊕ B
    Cout = A·B

    A B | Sum Cout
    0 0 |  0   0
    0 1 |  1   0
    1 0 |  1   0
    1 1 |  0   1

• One-bit Full Adder:
    Sum  = A ⊕ B ⊕ Cin
    Cout = A·B + B·Cin + A·Cin

    Cin A B | Sum Cout
     0  0 0 |  0   0
     0  0 1 |  1   0
     0  1 0 |  1   0
     0  1 1 |  0   1
     1  0 0 |  1   0
     1  0 1 |  0   1
     1  1 0 |  0   1
     1  1 1 |  1   1
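The two cells above can be checked with a few lines of Python. This is a minimal sketch; the function names `half_adder` and `full_adder` are illustrative and do not come from the slides.

```python
from itertools import product

def half_adder(a, b):
    """Return (sum, carry_out) for two one-bit operands."""
    return a ^ b, a & b

def full_adder(a, b, cin):
    """Return (sum, carry_out) for two one-bit operands plus a carry-in."""
    s = a ^ b ^ cin
    cout = (a & b) | (b & cin) | (a & cin)
    return s, cout

# Check both cells against ordinary integer addition.
for a, b in product((0, 1), repeat=2):
    s, c = half_adder(a, b)
    assert 2 * c + s == a + b
for cin, a, b in product((0, 1), repeat=3):
    s, c = full_adder(a, b, cin)
    assert 2 * c + s == a + b + cin
```

The carry output has twice the weight of the sum output, which is why the check compares `2 * c + s` with the arithmetic sum of the inputs.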
[Figure: n-bit ripple-carry adder. A chain of full adders (FA) passes the carry from C0 up to Cn and produces the sum bits S0 ... Sn-1.]
4-bit Ripple Carry Addition: Example
A = 0011, B = 0101, C0 = 0

[Figure: four full adders in a chain with inputs A3 B3 = 0 0, A2 B2 = 0 1, A1 B1 = 1 0, A0 B0 = 1 1; carries C4 ... C1; sums S3 ... S0]

       C4 S3 C3 S2 C2 S1 C1 S0
T=0     0  0  0  0  0  0  0  0    S=0000
T=1     0  0  0  1  0  1  1  0    S=0110
T=2     0  0  0  1  1  0  1  0    S=0100
T=3     0  0  1  0  1  0  1  0    S=0000
T=4     0  1  1  0  1  0  1  0    S=1000
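The time steps in this example can be reproduced with a small simulation. The sketch below assumes that every full adder takes exactly one time unit and that all carries and sums start at 0; the name `ripple_trace` is hypothetical.

```python
def full_adder(a, b, cin):
    return a ^ b ^ cin, (a & b) | (b & cin) | (a & cin)

def ripple_trace(a_bits, b_bits, c0=0):
    """Simulate a ripple-carry adder where each FA takes one time step.

    a_bits and b_bits are LSB-first lists. Returns a list of sum strings
    (MSB first), one per time step, starting from the all-zero state.
    """
    n = len(a_bits)
    carries = [c0] + [0] * n        # carries[i] is the carry into bit i
    sums = [0] * n
    trace = ["".join(str(s) for s in reversed(sums))]
    for _ in range(n):
        new_carries = carries[:]
        for i in range(n):
            # Every FA sees the carry its neighbour produced one step earlier.
            sums[i], new_carries[i + 1] = full_adder(a_bits[i], b_bits[i], carries[i])
        carries = new_carries
        trace.append("".join(str(s) for s in reversed(sums)))
    return trace

# A = 0011, B = 0101, written LSB first.
trace = ripple_trace([1, 1, 0, 0], [1, 0, 1, 0])
print(trace)
```

The sum only settles to 1000 at T=4 because the carry generated in bit 0 has to travel through every stage, which is the worst-case delay of an n-bit ripple-carry adder.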
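[Figure: one-bit adder cell with inputs $a_i$, $b_i$, $c_{in}$ and outputs $s_i$, $c_{out}$]

$$g_i = a_i b_i, \qquad p_i = a_i \oplus b_i, \qquad s_i = p_i \oplus c_i$$

$$
\begin{aligned}
c_{i+1} &= \bar{a}_i b_i c_i + a_i \bar{b}_i c_i + a_i b_i = g_i + p_i c_i \\
c_{i+2} &= g_{i+1} + p_{i+1} c_{i+1} = g_{i+1} + p_{i+1}(g_i + p_i c_i) \\
        &= g_{i+1} + p_{i+1} g_i + p_{i+1} p_i c_i \\
c_{i+3} &= g_{i+2} + p_{i+2} c_{i+2} = g_{i+2} + p_{i+2}(g_{i+1} + p_{i+1} g_i + p_{i+1} p_i c_i) \\
        &= g_{i+2} + p_{i+2} g_{i+1} + p_{i+2} p_{i+1} g_i + p_{i+2} p_{i+1} p_i c_i \\
c_{i+4} &= g_{i+3} + p_{i+3} c_{i+3} = g_{i+3} + p_{i+3}(g_{i+2} + p_{i+2} g_{i+1} + p_{i+2} p_{i+1} g_i + p_{i+2} p_{i+1} p_i c_i) \\
        &= \underbrace{g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i}_{G_j} + \underbrace{p_{i+3} p_{i+2} p_{i+1} p_i}_{P_j}\, c_i
\end{aligned}
$$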
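As a quick check of the expanded carry expressions, the sketch below computes all four carries of a 4-bit group directly from the generate and propagate signals and compares the result with integer addition. The function names are illustrative, and bit 0 of each list plays the role of index i in the derivation.

```python
def lookahead_carries(a_bits, b_bits, c_in):
    """Carries c1..c4 of a 4-bit group from the expanded g/p expressions.

    a_bits and b_bits are LSB-first lists of four bits.
    """
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate
    c1 = g[0] | (p[0] & c_in)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c_in)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c_in)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c_in))
    return p, [c1, c2, c3, c4]

def cla4(a, b, c_in=0):
    """Add two 4-bit integers; returns (4-bit sum, carry out)."""
    a_bits = [(a >> i) & 1 for i in range(4)]
    b_bits = [(b >> i) & 1 for i in range(4)]
    p, c = lookahead_carries(a_bits, b_bits, c_in)
    carries_in = [c_in] + c[:3]                   # carry into each bit position
    s = sum((p[i] ^ carries_in[i]) << i for i in range(4))
    return s, c[3]

# Exhaustive check against integer addition.
for a in range(16):
    for b in range(16):
        for c_in in (0, 1):
            s, c_out = cla4(a, b, c_in)
            assert s + 16 * c_out == a + b + c_in
```

No carry depends on another carry here: each one is a two-level function of the g, p and input-carry signals, which is the point of the lookahead expansion.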
Oklobdzija 2004 Computer Arithmetic 13
Carry-Lookahead Adder
$$G_j = g_{i+3} + p_{i+3} g_{i+2} + p_{i+3} p_{i+2} g_{i+1} + p_{i+3} p_{i+2} p_{i+1} g_i$$

$$P_j = p_{i+3} p_{i+2} p_{i+1} p_i$$

[Figure: logic that computes p and g from the inputs $a_i b_i$ ... $a_{i+3} b_{i+3}$ and combines them into $G_j$ and $P_j$]
• Implementing C4 as one-stage CMOS logic gives a large delay
[Figure: 12-bit adder. Each bit position has a p,g cell; three chained carry generators, starting from C0 = 0, produce C4, C8 and C12; sum outputs S11 ... S0]
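The block structure in this figure can be sketched in Python: each 4-bit block computes its group signals G_j and P_j, and the block carries follow from C_next = G_j + P_j·C. This is an illustrative model, not the circuit from the slides. The names `block_gp` and `add12` are hypothetical, and the sum bits inside each block are formed with plain integer addition for brevity.

```python
def block_gp(a, b):
    """Group generate G_j and propagate P_j of a 4-bit block (a, b are 4-bit ints)."""
    g = [(a >> i) & (b >> i) & 1 for i in range(4)]
    p = [((a >> i) ^ (b >> i)) & 1 for i in range(4)]
    G = g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1]) | (p[3] & p[2] & p[1] & g[0])
    P = p[3] & p[2] & p[1] & p[0]
    return G, P

def add12(a, b, c0=0):
    """12-bit add; block carries C4, C8, C12 come from C_next = G_j + P_j * C."""
    mask = 0xF
    carry, total, block_carries = c0, 0, []
    for j in range(3):
        aj, bj = (a >> (4 * j)) & mask, (b >> (4 * j)) & mask
        G, P = block_gp(aj, bj)
        # Sum bits of this block (plain addition stands in for the per-bit logic).
        total |= ((aj + bj + carry) & mask) << (4 * j)
        carry = G | (P & carry)
        block_carries.append(carry)
    return total, block_carries

total, (c4, c8, c12) = add12(0xFFF, 0x001)
print(hex(total), c4, c8, c12)
```

With A = 0xFFF and B = 0x001 the first block generates a carry and the other two blocks only propagate it, so all three block carries are 1.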
Carry-Skip Adder
Serial adder
• May be used in signal-processing arithmetic
where fast computation is important but latency
is unimportant.
• Data format (LSB first):
0 1 1 0
LSB
Serial adder structure
The LSB control signal clears the carry register at the start of each word.
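A behavioural sketch of the serial adder is given below: one full adder is reused every clock cycle and a single stored carry bit links consecutive cycles. The function name `serial_add` and the example data are hypothetical.

```python
def serial_add(a_bits, b_bits, lsb_flags):
    """Bit-serial adder: one full adder plus a one-bit carry register.

    All three inputs are streams with one entry per clock cycle. A 1 in
    lsb_flags marks the first (least significant) bit of a new word and
    clears the stored carry before that bit is added.
    """
    carry = 0
    out = []
    for a, b, lsb in zip(a_bits, b_bits, lsb_flags):
        if lsb:
            carry = 0                     # start of a new word
        out.append(a ^ b ^ carry)
        carry = (a & b) | (b & carry) | (a & carry)
    return out

# Two 4-bit words back to back, LSB first: 15 + 1, then 5 + 1.
a_stream = [1, 1, 1, 1,  1, 0, 1, 0]
b_stream = [1, 0, 0, 0,  1, 0, 0, 0]
lsb      = [1, 0, 0, 0,  1, 0, 0, 0]
result = serial_add(a_stream, b_stream, lsb)
print(result)
```

The first word overflows and leaves a carry of 1 behind. Without the LSB clear, that carry would corrupt the first bit of the second word.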
Lecture 20: Multiplier Design
Review: Basic Building Blocks
• Datapath
Execution units
o Adder, multiplier, divider, shifter, etc.
Register file and pipeline registers
Multiplexers, decoders
• Control
Finite state machines (PLA, ROM, random logic)
• Interconnect
Switches, arbiters, buses
• Memory
Caches (SRAMs), TLBs, DRAMs, buffers
The Binary Multiplication
            1 0 1 0 1 0     Multiplicand
        x       1 0 1 1     Multiplier
      -----------------
            1 0 1 0 1 0
          1 0 1 0 1 0
        0 0 0 0 0 0         Partial products
      1 0 1 0 1 0
      -----------------
      1 1 1 0 0 1 1 1 0     Result
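The same example can be reproduced in Python by forming one shifted partial product per multiplier bit and summing the rows. The helper name `partial_products` is illustrative.

```python
def partial_products(multiplicand, multiplier):
    """One shifted partial product per multiplier bit, LSB first."""
    rows = []
    i = 0
    while multiplier >> i:
        bit = (multiplier >> i) & 1
        rows.append((multiplicand if bit else 0) << i)
        i += 1
    return rows

rows = partial_products(0b101010, 0b1011)
product = sum(rows)
for row in rows:
    print(format(row, "09b"))
print(format(product, "09b"))
```

Each row is either the multiplicand or zero, shifted left by the position of the multiplier bit that selects it.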
Multiply Operation
• Multiplication is just a lot of additions
• The partial product array can be formed in parallel
[Figure: an N-bit multiplicand and an N-bit multiplier form an N-row partial product array; the product is 2N bits wide]
Multiplication Approaches
• Right shift and add
Partial product array rows are accumulated from top to bottom on
an N-bit adder
o After each addition, right shift (by one bit) the accumulated partial product to
align it with the next row to add
Time for N bits: T_serial_mult = O(N · T_adder) = O(N^2) for a ripple-carry adder (RCA)
• Making it faster
Use a faster adder
Use higher-radix (e.g., base 4) multiplication: O(N/2 · T_adder)
o Use multiplier recoding to simplify multiple formation (Booth)
Form the partial product array in parallel and add it in parallel
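The right-shift-and-add scheme can be sketched as follows, assuming unsigned N-bit operands. The names `shift_add_multiply`, `acc` and `low` are hypothetical: `acc` models the adder output and `low` collects the product bits that have already been shifted out.

```python
def shift_add_multiply(x, y, n):
    """Multiply two unsigned n-bit integers with n add-and-right-shift steps."""
    acc = 0          # upper part of the running product (n bits plus a carry)
    low = 0          # product bits that have already been shifted out
    for i in range(n):
        if (y >> i) & 1:
            acc += x                      # n-bit add; may carry into bit n
        # Right shift the pair (acc, low) by one bit.
        low = (low >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    return (acc << n) | low

print(shift_add_multiply(0b101010, 0b1011, 6))
```

The loop performs one N-bit addition per multiplier bit, which is where the O(N · T_adder) running time comes from.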
[Figure: 4-bit array multiplier. Three rows of adders (HA FA FA HA; FA FA FA HA; FA FA FA HA) combine the X3 ... X0 partial products selected by Y1, Y2 and Y3 into the product bits Z1 ... Z7]
Booth multiplier
• Encoding scheme to reduce number of stages in
multiplication.
• Performs two bits of multiplication at once, so it
requires half the stages.
• Each stage is slightly more complex than in a simple
multiplier, but an adder/subtracter is almost as
small and fast as an adder.
Booth encoding
• Two's-complement form of multiplier:
$$y = -2^n y_n + 2^{n-1} y_{n-1} + 2^{n-2} y_{n-2} + \dots$$
(the first bit, $y_n$, is the sign bit)
(example: y = 18 = 010010, y = -18 = 101110)
• Rewrite using $2^a = 2^{a+1} - 2^a$:
$$y = 2^n (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots$$
• Consider the first two terms: by looking at three bits
of y, we can decide whether to add 0, ±x, or ±2x to the
partial product.
Booth actions
$$y = 2^n (y_{n-1} - y_n) + 2^{n-1} (y_{n-2} - y_{n-1}) + 2^{n-2} (y_{n-3} - y_{n-2}) + \dots$$
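Grouping the first two terms of this expression gives $2^{n-1}(-2 y_n + y_{n-1} + y_{n-2})$, so each pair of terms selects a digit in {-2, -1, 0, 1, 2}. The sketch below applies that grouping to every bit pair. It numbers bits from 0 at the LSB, which differs from the slide's indexing, and the function names are hypothetical.

```python
def booth_digits(y, width):
    """Radix-4 Booth digits of a two's-complement integer, least significant first.

    width must be even and large enough to hold y. Each digit is in
    {-2, -1, 0, 1, 2} and is read from three adjacent bits of y.
    """
    if width % 2:
        raise ValueError("width must be even")
    bits = [(y >> i) & 1 for i in range(width)]   # Python's >> sign-extends
    digits = []
    prev = 0                                       # implicit bit to the right of the LSB
    for k in range(0, width, 2):
        digits.append(-2 * bits[k + 1] + bits[k] + prev)
        prev = bits[k + 1]
    return digits

def booth_multiply(x, y, width):
    """Multiply x by y using one add/subtract of 0, x or 2x per Booth digit."""
    product = 0
    for k, d in enumerate(booth_digits(y, width)):
        product += (d * x) << (2 * k)
    return product

print(booth_digits(18, 6), booth_digits(-18, 6))
```

A 6-bit multiplier needs only three add/subtract steps instead of six, matching the "half the stages" claim.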
Wallace-Tree Multiplier
[Figure: Wallace-tree multiplier. First stage: HA HA; second stage: FA FA FA FA; final adder producing z7 ... z0]
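The idea of reducing the partial product bits in stages before a single final addition can be sketched as below. This is a Wallace-style reduction under simplifying assumptions, not the exact wiring of the figure: every column is compressed greedily with full adders and half adders until at most two bits per column remain, and the final adder is modelled with ordinary integer addition. All names are hypothetical.

```python
def reduce_columns(columns):
    """One reduction stage: a full adder per 3 bits, a half adder per leftover 2.

    columns[w] is the list of bits of weight 2**w.
    """
    out = [[] for _ in range(len(columns) + 1)]
    for w, col in enumerate(columns):
        i = 0
        while len(col) - i >= 3:                     # full adder, a (3,2) counter
            a, b, c = col[i:i + 3]
            out[w].append(a ^ b ^ c)
            out[w + 1].append((a & b) | (b & c) | (a & c))
            i += 3
        if len(col) - i == 2:                        # half adder
            a, b = col[i:i + 2]
            out[w].append(a ^ b)
            out[w + 1].append(a & b)
        elif len(col) - i == 1:                      # single bit passes through
            out[w].append(col[i])
    return out

def wallace_multiply(x, y, n):
    """Unsigned n-bit multiply: AND array, reduction stages, then a final add."""
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append((x >> i) & (y >> j) & 1)
    stages = 0
    while max(len(col) for col in columns) > 2:
        columns = reduce_columns(columns)
        stages += 1
    # Final carry-propagate adder: ordinary addition of the remaining bits.
    product = sum(bit << w for w, col in enumerate(columns) for bit in col)
    return product, stages

print(wallace_multiply(13, 11, 4))
```

Each stage preserves the weighted sum of all bits, so the final addition yields the product no matter how the adders are grouped within a stage.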
[Figure: multiplier organization. Partial product array (mux + interconnect), then reduction tree (log N), then fast carry propagate adder, CPA (log N), producing P (product)]
(4,2) Counter
• Built out of two (3,2) counters (just FA’s!)
all of the inputs (4 external plus one internal) have the
same weight (i.e., are in the same bit position)
the internal carry output is fed to the next higher weight
position (marked in the figure)
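A (4,2) counter built from two full adders can be modelled as follows. The signal names `t_in` and `t_out` for the internal carry are hypothetical; the check confirms that the five equal-weight inputs are represented exactly by one sum bit and two carry bits of twice the weight.

```python
from itertools import product

def full_adder(a, b, c):
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)

def counter_4_2(x1, x2, x3, x4, t_in):
    """(4,2) counter from two chained full adders.

    Returns (sum, carry, t_out). sum has the weight of the inputs; carry
    and t_out both have twice that weight. t_out does not depend on t_in.
    """
    s1, t_out = full_adder(x1, x2, x3)     # first (3,2) counter
    s, carry = full_adder(s1, x4, t_in)    # second (3,2) counter
    return s, carry, t_out

# Exhaustive check over all 32 input combinations.
for x1, x2, x3, x4, t_in in product((0, 1), repeat=5):
    s, carry, t_out = counter_4_2(x1, x2, x3, x4, t_in)
    assert x1 + x2 + x3 + x4 + t_in == s + 2 * (carry + t_out)
```

Because `t_out` depends only on the first three inputs, the internal carry does not ripple along a row of (4,2) counters.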
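[Figure: (4,2) counter built from two (3,2) counters]
[Figure: multiplier built from (4,2) counters. The partial product array is reduced by two groups of nine (4,2) counters, then by thirteen (4,2) counters, followed by a 13-bit CPA]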
Multipliers: Summary