EC8095 - VLSI Design: Unit-IV
UNIT-IV
Presentation Outline
• Data-path Circuits
– Adders
FDP on VLSI Design, SSNCE, Jan 4-8, 2016
– Multipliers
• Array Multiplier, Booth Multiplier
– Accumulators
• Barrel Shifters
A Generic Digital Processor
[Figure: block diagram of a generic digital processor with Memory, Datapath, Control, and Input-Output blocks]
Building Blocks for Digital Architectures
Arithmetic unit
- Bit-sliced datapath (adder, multiplier, shifter, comparator, etc.)
Memory
- RAM, ROM, Buffers, Shift registers
Control
- Finite state machine (PLA, random logic)
- Counters
Interconnect
- Switches
- Arbiters
- Bus
Bit-Sliced Design
• Datapaths are often arranged in a bit-sliced organization.
• Typical microprocessor datapaths are 32 or 64 bits wide.
• Those in DSL modems, magnetic disk drives, and compact disc players are of arbitrary width, typically 5 to 24 bits.
• The datapath consists of identical bit slices (for example, 32 slices in a 32-bit microprocessor), each operating on a single bit; hence the term bit slice.
[Figure: bit-sliced datapath. Slices Bit 0 to Bit 3 are tiled as identical processing elements; each slice contains a Register, Adder, Shifter, and Multiplexer between Data-In and Data-Out.]
Adder - Introduction
• Addition is the most commonly used arithmetic operation.
• Optimization is done at the
– Logic level
– Circuit level
[Figure: full adder block with inputs A, B, Cin and outputs Sum, Cout]
The Binary Adder
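[Figure: full adder block with inputs A, B, Ci and outputs Sum (S) and Co]
$S = A \oplus B \oplus C_i$
[Figure: 4-bit ripple-carry adder built from four FA cells; $C_{o,k}$ of each cell drives $C_{i,k+1}$ of the next, from $C_{i,0}$ to $C_{o,3}$, producing S0 to S3]
$t_{adder} \approx (N - 1)\,t_{carry} + t_{sum}$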
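A minimal Python sketch of the ripple-carry chain, assuming the standard full-adder carry $C_o = AB + BC_i + AC_i$ (not shown on the slide) and little-endian bit lists. The helper names `full_adder`, `ripple_carry_add`, `to_bits`, and `from_bits` are illustrative only.

```python
def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry_out)."""
    s = a ^ b ^ ci
    co = (a & b) | (b & ci) | (a & ci)  # assumed standard majority carry
    return s, co


def ripple_carry_add(a_bits, b_bits, ci=0):
    """Add two little-endian bit lists; the carry ripples cell by cell."""
    sum_bits = []
    carry = ci
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)  # Co of this cell is Ci of the next
        sum_bits.append(s)
    return sum_bits, carry


def to_bits(value, n):
    """Little-endian list of the n low-order bits of value."""
    return [(value >> k) & 1 for k in range(n)]


def from_bits(bits):
    """Inverse of to_bits."""
    return sum(bit << k for k, bit in enumerate(bits))
```

The loop visits the cells in carry order, which mirrors why the worst-case delay grows linearly with N.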
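Full Adder: Circuit Design Considerations
• Inverting property
– Inverting all inputs to a full adder results in inverted values for all outputs.
– Expressed as
$\overline{S}(A, B, C_i) = S(\overline{A}, \overline{B}, \overline{C_i})$
$\overline{C_o}(A, B, C_i) = C_o(\overline{A}, \overline{B}, \overline{C_i})$
[Figure: a full adder with inverted inputs and inverted outputs is equivalent to a full adder with true inputs and outputs]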
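The inverting property can be checked exhaustively over the eight input combinations. This is a small sketch, assuming the standard full-adder equations; `full_adder` and `inverting_property_holds` are illustrative names.

```python
from itertools import product


def full_adder(a, b, ci):
    """One-bit full adder: returns (sum, carry_out)."""
    return a ^ b ^ ci, (a & b) | (b & ci) | (a & ci)


def inverting_property_holds():
    """True if inverting all inputs inverts both outputs, for every input."""
    for a, b, ci in product((0, 1), repeat=3):
        s, co = full_adder(a, b, ci)
        s_inv, co_inv = full_adder(1 - a, 1 - b, 1 - ci)
        if (s_inv, co_inv) != (1 - s, 1 - co):
            return False
    return True
```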
Complementary Static CMOS Full Adder
[Figure: transistor-level schematic of the complementary static CMOS full adder with inputs A, B, Ci, internal node X, and outputs S and Co; 28 transistors]
Complementary Static CMOS Full Adder
• The complementary static CMOS adder design requires 28 transistors. It consumes a large area and the circuit is slow:
– Tall PMOS transistor stacks are present in both the carry- and sum-generation circuits.
– The intrinsic load capacitance of the Co signal is large and consists of two diffusion and six gate capacitances, plus the wiring capacitance.
[Figure: 4-bit adder with inputs A0 to A3 and B0 to B3 and sum outputs S0 to S3]
The Mirror Adder
[Figure: mirror adder schematic with Kill, Generate, "0"-Propagate, and "1"-Propagate networks; inputs A, B, Ci; outputs S and Co; 24 transistors]
• The NMOS and PMOS chains are completely symmetrical. This guarantees
identical rising and falling transitions if the NMOS and PMOS devices are
properly sized. A maximum of two series transistors can be observed in the
carry-generation circuitry.
• When laying out the cell, the most critical issue is the minimization of the
capacitance at node Co. The reduction of the diffusion capacitances is
particularly important.
• The capacitance at node Co is composed of four diffusion capacitances, two internal gate capacitances, and six gate capacitances in the connecting adder cell.
• The transistors connected to Ci are placed closest to the output.
• Only the transistors in the carry stage have to be sized for speed. All transistors in the sum stage can be minimum size.
Transmission Gate Full Adder
[Figure: transmission-gate full adder with a Setup stage that produces P, a Sum Generation stage, and a Carry Generation stage]
• The propagate signal, which is the XOR of inputs A and B, is used to select the true or complementary value of the input carry as the new sum output.
• Based on the propagate signal, the output carry is set either to the input carry or to one of the inputs A or B.
• One interesting feature of this adder is that it has similar delays for both the sum and carry outputs.
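The selection behaviour described in these bullets can be sketched at the logic level in Python. This models only the function, not the transmission-gate circuit; `tg_full_adder` is an illustrative name, and choosing A (rather than B) when P = 0 is an assumption, since both are equal in that case.

```python
def tg_full_adder(a, b, ci):
    """Logic-level model of the transmission-gate full adder."""
    p = a ^ b                   # setup: propagate signal
    s = (1 - ci) if p else ci   # P selects the complemented or true carry
    co = ci if p else a         # P = 0 means A == B, so either input works
    return s, co
```

Both outputs are a single selection controlled by P, which is consistent with the similar sum and carry delays noted above.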
Manchester Carry Chain
[Figure: Manchester carry chain circuit]
• A Manchester carry chain adder uses a cascade of pass transistors to implement the carry chain, as shown in the figure above.
• During the precharge phase (φ = 0), all intermediate nodes of the pass-transistor carry chain are precharged to VDD.
• During evaluation, the Ak node is discharged when there is an incoming carry and the propagate signal Pk is high, or when the generate signal for stage k (Gk) is high.
• The worst-case delay of the carry chain is modeled by the linearized RC network in the following figure (next slide).
• Increasing the transistor width reduces the time constant, but it also loads the gates in the previous stage.
• Therefore the transistor size is limited by the input loading capacitance.
• The distributed RC nature of the carry chain results in a propagation delay that is quadratic in the number of bits N.
• To avoid this, it is necessary to insert signal-buffering inverters.
• Adding the inverters makes the overall propagation delay a linear function of N, as is the case with ripple-carry adders.
Manchester Carry Chain
$t_p = 0.69 \sum_{i=1}^{N} C_i \left( \sum_{j=1}^{i} R_j \right) = 0.69\,\frac{N(N+1)}{2}\,RC$
when all $C_i = C$ and $R_j = R$
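The delay expression can be evaluated numerically to see the quadratic growth. This is a sketch with illustrative function names (`chain_delay`, `uniform_chain_delay`) and unitless R and C values chosen by the caller.

```python
def chain_delay(resistances, capacitances):
    """0.69 * sum_i C_i * (R_1 + ... + R_i) for an RC ladder."""
    total = 0.0
    upstream_r = 0.0
    for r, c in zip(resistances, capacitances):
        upstream_r += r          # resistance between the source and node i
        total += c * upstream_r
    return 0.69 * total


def uniform_chain_delay(n, r, c):
    """Closed form for equal R and C in every stage."""
    return 0.69 * n * (n + 1) / 2 * r * c
```

Doubling N roughly quadruples the result, which is the motivation for the buffering inverters mentioned above.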
ADDER: Logic Design Considerations
Carry-Bypass Adder
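[Figure: 4-bit carry-bypass adder. Top: four FA cells with setup signals (P0, G0) to (P3, G3) rippling the carry from Ci,0 through Co,0, Co,1, Co,2 to Co,3. Bottom: the same chain with a bypass multiplexer, controlled by BP = P0 P1 P2 P3, that selects between the rippled Co,3 and Ci,0.]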
• The figure shows the possible carry-propagation paths when the full-adder circuit is implemented in Manchester carry style. This kind of arrangement speeds up addition.
• In both cases, whether the carry bypasses the block or is generated or killed inside it, the delay is smaller than in the normal ripple configuration.
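A logic-level sketch of one bypass stage in Python, assuming little-endian bit lists and the usual definitions $P_k = A_k \oplus B_k$ and $G_k = A_k B_k$. The function name `bypass_block` is illustrative.

```python
def bypass_block(a_bits, b_bits, ci):
    """One carry-bypass stage. Returns (sum_bits, carry_out, bp)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate signals
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate signals
    sum_bits = []
    carry = ci
    for pk, gk in zip(p, g):
        sum_bits.append(pk ^ carry)
        carry = gk | (pk & carry)                 # normal ripple path
    bp = int(all(p))                              # BP = P0 P1 P2 P3
    co = ci if bp else carry                      # bypass multiplexer
    return sum_bits, co, bp
```

When BP = 1 the rippled carry equals Ci anyway, so the multiplexer changes only the delay, not the result.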
Carry-Bypass Adder
$t_p = t_{setup} + M\,t_{carry} + \left(\frac{N}{M} - 1\right) t_{bypass} + (M - 1)\,t_{carry} + t_{sum}$
• $t_{setup}$: the fixed overhead time to create the generate and propagate signals.
• $t_{carry}$: the propagation delay through a single bit. The worst-case carry-propagation delay through a single stage of M bits is approximately M times larger.
• $t_{bypass}$: the propagation delay through the bypass multiplexer of a single stage.
• $t_{sum}$: the time to generate the sum of the final stage.
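The delay expression can be wrapped in a small helper to try different stage sizes. This sketch assumes M divides N evenly; the function name `bypass_adder_delay` and the unit delays used in any call are illustrative, not measured values.

```python
def bypass_adder_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Worst-case delay of an N-bit carry-bypass adder with M-bit stages."""
    if n % m != 0:
        raise ValueError("this sketch assumes M divides N")
    stages = n // m
    return (t_setup
            + m * t_carry                # ripple through the first stage
            + (stages - 1) * t_bypass    # bypass the remaining stages
            + (m - 1) * t_carry          # ripple inside the last stage
            + t_sum)
```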
Carry Ripple versus Carry Bypass
[Figure: propagation delay tp versus number of bits N for the ripple adder and the bypass adder; the curves cross at N of about 4 to 8]
Carry-Select Adder
[Figure: carry-select adder block with Setup, P and G signals, Carry Vector, and Sum Generation stages]
Carry-Select Adder
• Assume that an N-bit adder contains P stages, and the first stage adds M bits. With each later stage one bit wider than the one before it:
$N = M + (M+1) + (M+2) + \cdots + (M+P-1) = MP + \frac{P(P-1)}{2} = \frac{P^2}{2} + P\left(M - \frac{1}{2}\right)$   (1)
• If M << N (e.g., M = 2 and N = 64), the first term dominates, and the above equation can be simplified to
$N \approx \frac{P^2}{2}$ or $P \approx \sqrt{2N}$
Square Root Carry Select
Linear carry select: $t_{add} = t_{setup} + M\,t_{carry} + \left(\frac{N}{M}\right) t_{mux} + t_{sum}$
Square-root carry select: $t_{add} = t_{setup} + M\,t_{carry} + \sqrt{2N}\,t_{mux} + t_{sum}$
• The delay is proportional to the square root of N for large adders (N >> M).
• For large values of N, $t_{add}$ grows so slowly that it appears almost constant.
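A short Python sketch that evaluates the two delay expressions and lists the stage widths of the square-root scheme. The function names and any unit delays passed in are illustrative assumptions.

```python
import math


def linear_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """Linear carry-select delay: N/M multiplexer stages."""
    return t_setup + m * t_carry + (n / m) * t_mux + t_sum


def sqrt_select_delay(n, m, t_setup, t_carry, t_mux, t_sum):
    """Square-root carry-select delay: about sqrt(2N) multiplexer stages."""
    return t_setup + m * t_carry + math.sqrt(2 * n) * t_mux + t_sum


def sqrt_select_stage_sizes(m, p):
    """Stage widths M, M+1, ..., M+P-1; their sum is the adder width N."""
    return [m + k for k in range(p)]
```

For example, `sqrt_select_stage_sizes(2, 4)` gives stages of 2, 3, 4, and 5 bits, which add up to MP + P(P-1)/2 = 14 bits.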
Adder Delays - Comparison
[Figure: propagation delay tp (0 to 50) versus number of bits N (0 to 60) for the ripple adder and the linear select adder]
CARRY LOOK AHEAD ADDER
Look Ahead - Basic Idea
[Figure: N-bit lookahead adder; bit position k takes inputs Ak, Bk and carry Ci,k with propagate signal Pk, for k = 0 to N-1]
• Carry look ahead logic uses the concepts of generating and propagating
carries.
• A carry-lookahead adder improves speed by reducing the amount of time
required to determine carry bits.
• The carry-lookahead adder calculates one or more carry bits before the sum, which reduces the wait time to calculate the higher-order bits of the result. The Kogge-Stone adder and the Brent-Kung adder are examples of this type of adder.
• Carry lookahead depends on two things:
- Calculating, for each digit position, whether that position is going to
propagate a carry if one comes in from the right.
- Combining these calculated values to be able to deduce quickly whether,
for each group of digits, that group is going to propagate a carry that comes
in from the right.
Look Ahead - Basic Idea
• Suppose that groups of 4 digits are chosen. Then the sequence of events goes something like this:
- All 1-bit adders calculate their results. Simultaneously, the lookahead units perform their calculations.
- Suppose that a carry arises in a particular group. Within at most 5 gate delays, that carry will emerge at the left-hand end of the group and start propagating through the group to its left.
- If that carry is going to propagate all the way through the next group, the lookahead unit will already have deduced this. Accordingly, before the carry emerges from the next group, the lookahead unit is immediately (within 1 gate delay) able to tell the next group to the left that it is going to receive a carry and, at the same time, to tell the next lookahead unit to the left that a carry is on its way.
Carry-Look-Ahead Adders
• Objective - generate all incoming carries in parallel
[Figure: 4-bit lookahead carry circuit computing Co,3 from G0 to G3, P0 to P3, and Ci,0]
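A Python sketch of the lookahead idea, assuming the usual definitions $G_k = A_k B_k$, $P_k = A_k \oplus B_k$ and the expanded carry $C_{o,k} = G_k + P_k G_{k-1} + \cdots + P_k \cdots P_0 C_{i,0}$. Each carry is computed directly from the G, P, and Ci,0 signals rather than from the previous carry. The names `carry_out` and `cla_add` are illustrative.

```python
def carry_out(g, p, c0, k):
    """Carry out of bit k from the expanded lookahead expression."""
    carry = 0
    prod = 1                      # running product P_k ... P_(j+1)
    for j in range(k, -1, -1):
        carry |= g[j] & prod      # term G_j * P_(j+1) * ... * P_k
        prod &= p[j]
    return carry | (prod & c0)    # term P_0 * ... * P_k * C_i,0


def cla_add(a_bits, b_bits, c0=0):
    """Little-endian carry-lookahead addition: returns (sum_bits, carry_out)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]
    p = [a ^ b for a, b in zip(a_bits, b_bits)]
    n = len(p)
    carries = [c0] + [carry_out(g, p, c0, k) for k in range(n)]
    sum_bits = [p[k] ^ carries[k] for k in range(n)]
    return sum_bits, carries[n]
```

In hardware every `carry_out` expression is evaluated in parallel; the Python loop only spells out the terms.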
KOGGE-STONE ADDER
Brent-Kung Adder
• The number of logic levels grows as $\log_2 N$
• Gate fan-in is reduced
• Fan-out can be large, but is handled by careful buffering
• Regular, compact layout; forward tree and reverse tree fit
together perfectly
• Once carry bits are available, sum bits are easily derived in
constant time
• Lookahead adders are 100% larger than ripple carry adders,
but yield dramatic speed advantages for large adders
• Logarithmic behavior makes it preferable over bypass or select
adders
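The logarithmic depth can be illustrated with a Kogge-Stone style parallel-prefix computation in Python. This is a behavioural sketch with carry-in fixed at 0, not a gate-level design, and it models the Kogge-Stone arrangement rather than the Brent-Kung tree; `kogge_stone_add` is an illustrative name.

```python
def kogge_stone_add(a, b, n):
    """N-bit parallel-prefix addition. Returns (sum, carry_out, levels)."""
    g = [(a >> k) & (b >> k) & 1 for k in range(n)]     # bit generate
    p = [((a >> k) ^ (b >> k)) & 1 for k in range(n)]   # bit propagate
    bit_p = list(p)                                     # kept for the sum bits
    dist = 1
    levels = 0
    while dist < n:
        new_g, new_p = list(g), list(p)
        for k in range(dist, n):
            # Combine the group ending at k with the group ending at k - dist.
            new_g[k] = g[k] | (p[k] & g[k - dist])
            new_p[k] = p[k] & p[k - dist]
        g, p = new_g, new_p
        dist *= 2
        levels += 1
    carries = [0] + g[:-1]           # g[k] is now the carry out of bit k
    total = sum((bit_p[k] ^ carries[k]) << k for k in range(n))
    return total, g[n - 1], levels
```

For n = 8 the prefix loop runs three times, matching $\log_2 8 = 3$ levels; the sum bits then follow in one more XOR step.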
The Binary Multiplication
$Z = X \times Y = \sum_{k=0}^{M+N-1} Z_k 2^k = \left( \sum_{i=0}^{M-1} X_i 2^i \right) \left( \sum_{j=0}^{N-1} Y_j 2^j \right) = \sum_{i=0}^{M-1} \sum_{j=0}^{N-1} X_i Y_j 2^{i+j}$
with
$X = \sum_{i=0}^{M-1} X_i 2^i$ and $Y = \sum_{j=0}^{N-1} Y_j 2^j$
      1 0 1 0 1 0    Multiplicand
    x     1 0 1 1    Multiplier
      1 0 1 0 1 0
    1 0 1 0 1 0
  0 0 0 0 0 0        Partial products
1 0 1 0 1 0
1 1 1 0 0 1 1 1 0    Result
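The worked example can be reproduced with a shift-and-add sketch in Python. Each partial product is the multiplicand shifted by the position of a multiplier bit, or zero when that bit is 0. The function names are illustrative.

```python
def partial_products(x, y, n_bits):
    """One partial product per multiplier bit, already shifted into place."""
    return [(x << j) if (y >> j) & 1 else 0 for j in range(n_bits)]


def shift_and_add_multiply(x, y, n_bits):
    """Sum the partial products of x and the n_bits-wide multiplier y."""
    return sum(partial_products(x, y, n_bits))
```

With the slide's operands, `shift_and_add_multiply(0b101010, 0b1011, 4)` returns 462, which is `0b111001110`.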
Partial Product Generation
• Logical AND of multiplicand X and multiplier bit Yi
[Figure: 4x4 array multiplier. Each row ANDs X3 to X0 with one multiplier bit (Y0 to Y3) and adds the result to the running sum through a row of HA and FA cells, producing Z0 to Z7.]
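[Figure: array multiplier built from HA and FA cells, with Critical Path 1 and Critical Path 2 marked]
[Figure: carry-save multiplier built from HA and FA cells. An extra set of adders, the Vector Merging Adder, forms the last row; it is usually a fast carry-lookahead adder.]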
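A word-level Python sketch of the carry-save idea: each row compresses the running sum vector, the running carry vector, and one partial product into a new sum and carry pair without propagating carries, and a single final addition plays the role of the vector-merging adder. The name `carry_save_multiply` is illustrative, and Python's `+` stands in for the fast final adder.

```python
def carry_save_multiply(x, y, n_bits):
    """Multiply x by the n_bits-wide y using carry-save accumulation."""
    sum_vec, carry_vec = 0, 0
    for j in range(n_bits):
        pp = (x << j) if (y >> j) & 1 else 0   # partial product for bit j
        # Bitwise full adders: no carry moves sideways within this row.
        new_sum = sum_vec ^ carry_vec ^ pp
        new_carry = ((sum_vec & carry_vec) | (sum_vec & pp) | (carry_vec & pp)) << 1
        sum_vec, carry_vec = new_sum, new_carry
    return sum_vec + carry_vec                 # vector-merging adder
```

The carries are resolved only once, in the last step, which is why that adder is the one worth making fast.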
Wallace-Tree Multiplier
[Figure: Wallace-tree reduction of the partial-product dot diagram over bit positions 6 to 0, panels (a) to (d), using FA and HA cells]
[Figure: Wallace-tree multiplier for 4-bit operands. The partial products x0y0 to x3y3 feed a first stage of HAs, a second stage of FAs and an HA, and a final adder that produces z7 to z0.]
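The column-reduction idea behind the tree can be sketched in Python. This simplified version uses only full adders (3:2 compression) and passes leftover bits through unchanged, so it does not reproduce the exact HA and FA placement of the figure. The names `full_adder` and `wallace_multiply` are illustrative.

```python
def full_adder(a, b, c):
    """Returns (sum, carry) of three bits."""
    return a ^ b ^ c, (a & b) | (b & c) | (a & c)


def wallace_multiply(x, y, n):
    """Multiply two n-bit numbers by column reduction. Returns (product, stages)."""
    # Dot diagram: columns[k] holds the partial-product bits of weight 2**k.
    columns = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            columns[i + j].append(((x >> i) & 1) & ((y >> j) & 1))

    stages = 0
    while any(len(col) > 2 for col in columns):
        reduced = [[] for _ in range(len(columns) + 1)]
        for k, col in enumerate(columns):
            bits = list(col)
            while len(bits) >= 3:
                s, c = full_adder(bits.pop(), bits.pop(), bits.pop())
                reduced[k].append(s)       # sum stays in the same column
                reduced[k + 1].append(c)   # carry moves one column left
            reduced[k].extend(bits)        # leftover bits pass through
        columns = reduced
        stages += 1

    # At most two bits remain per column: add the two rows in the final adder.
    row0 = sum(col[0] << k for k, col in enumerate(columns) if len(col) > 0)
    row1 = sum(col[1] << k for k, col in enumerate(columns) if len(col) > 1)
    return row0 + row1, stages
```

All columns are reduced in parallel in each stage, so the number of stages grows much more slowly than the number of partial products.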
[Figure: one-bit shifter, bit-slice i, connecting inputs Ai and Ai-1 to outputs Bi and Bi-1]
• Too slow for large shift values
The Barrel Shifter
Logarithmic Shifter
• The total shift is decomposed into powers of two.
• A maximum shift width of M requires $\log_2 M$ stages.
[Figure: logarithmic shifter with stages controlled by Sh1, Sh2, and Sh4 and their complements, from input A3 to output B3]
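A behavioural Python sketch of the logarithmic shifter, assuming a right shift and a power-of-two width (the slide does not state the direction). Stage k either passes the word through or shifts it by $2^k$, controlled by bit k of the shift amount, which corresponds to the Sh1, Sh2, Sh4 controls. The name `logarithmic_shift_right` is illustrative.

```python
def logarithmic_shift_right(value, shift, width):
    """Shift a width-bit word right in power-of-two stages.

    Returns (result, number_of_stages).
    """
    result = value & ((1 << width) - 1)
    stage = 0
    while (1 << stage) < width:
        if (shift >> stage) & 1:      # stage control: Sh1, Sh2, Sh4, ...
            result >>= (1 << stage)   # shift by 2**stage in this stage
        stage += 1
    return result, stage
```

For an 8-bit word the loop runs three times, so any shift from 0 to 7 passes through exactly three stages.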
Thank You