Unit 4 - VLSI and Chip Design
UNIT – IV
INTERCONNECT, MEMORY ARCHITECTURE AND ARITHMETIC CIRCUITS
Datapath circuits pass data from one segment of a system to another for processing or storage.
The datapath is the core of a processor, where all computations are performed.
It is usually described in the context of a general digital processor, as shown in the figure.
Data is applied at one port and the output is obtained at a second port.
A datapath block provides arithmetic operations, logical operations, shift operations and
temporary storage of operands.
Datapaths are arranged in a bit-sliced organization.
Instead of operating on single-bit digital signals, the data in a processor are arranged in a
word-based fashion.
The bit slices are either identical or share a similar structure for all bits.
UNIT-IV –EC3552 VLSI AND CHIP DESIGN
The datapath consists of a number of bit slices (equal to the word length), each
operating on a single bit; hence the term bit-sliced.
Draw the structure of ripple carry adder and explain its operation. (Nov 2017)
Explain the operation of a basic 4 bit adder. (Nov 2016)
Realize a 1-bit adder using static CMOS logic. Optimize the Boolean expressions of
sum and carryout and realize a 1-bit adder using static CMOS logic. Also realize a 1-
bit adder using transmission gate. Compare all the three cases from hardware
perspective. (Nov 2019)
Ci = carry bit from the previous column.
An n-bit ripple carry adder needs n full adders; the carry-out bit Ci+1 of each stage feeds the carry-in of the next stage.
In a ripple carry adder, the carry bit is calculated along with the sum bit. Each stage must wait for
the carry from the previous stage.
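The rippling of the carry can be illustrated with a short Python sketch. The function names full_adder and ripple_carry_add are illustrative and not part of these notes, and the bits are assumed to be listed LSB first.

```python
def full_adder(a, b, ci):
    """One bit slice: returns (sum, carry_out) for bits a, b and carry-in ci."""
    s = a ^ b ^ ci
    co = (a & b) | (ci & (a ^ b))
    return s, co


def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists given LSB first; returns (sum_bits, carry_out)."""
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        # Stage i cannot finish until the carry from stage i-1 is known.
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry


# 4-bit example: 0110 (6) + 0111 (7), bits listed LSB first
print(ripple_carry_add([0, 1, 1, 0], [1, 1, 1, 0]))  # ([1, 0, 1, 1], 0), i.e. 1101 = 13
```

The loop makes the serial dependency explicit: the worst-case delay grows linearly with the number of bits.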
[Figure: transistor-level schematic of a full adder with inputs A, B and Ci, internal node X, and outputs S and Co]
Explain the operation and design of Carry lookahead adder (CLA). (May 2017, Nov
2016)[Apr/May 2022] [Nov/Dec 2022]
How is the drawback of the ripple carry adder overcome by the carry lookahead adder?
Discuss. (Nov 2017)
Explain the concept of carry lookahead adder and discuss its types. (April 2018)
Derive the necessary expressions of a 4 bit carry look ahead adder and realize the carry
out expressions using dynamic CMOS logic. (April 2019-13M)
These calculated values are then combined to determine quickly whether each group
of digits is going to propagate a carry.
Theory of operation:
Carry lookahead logic uses the concept of generating and propagating carry.
The addition of two 1-digit inputs A and B is said to generate if the addition will always carry,
regardless of whether there is an input carry.
Generate:
In binary addition, A + B generates if and only if both A and B are 1.
If we write G(A,B) to represent the binary predicate that is true if and only if A + B
generates, we have:
G(A,B) = A . B
Propagate:
The addition of two 1-digit inputs A and B is said to propagate if the addition will carry
whenever there is an input carry.
In binary addition, A + B propagates if and only if at least one of A or B is 1.
If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, we have:
P(A,B) = A + B
These adders are used to overcome the latency which is introduced by the rippling effect of
carry bits.
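Write the carry look-ahead expressions in terms of the generate $g_i$ and propagate $p_i$ signals. The
general form of the carry signal $c_{i+1}$ thus becomes
$c_{i+1} = a_i \cdot b_i + c_i \cdot (a_i \oplus b_i) = g_i + c_i \cdot p_i$
If $a_i \cdot b_i = 1$, then $c_{i+1} = 1$, so the generate term is written as $g_i = a_i \cdot b_i$
The propagate term is written as $p_i = a_i \oplus b_i$ (the OR form $a_i + b_i$ gives the same carries; the XOR form can also be reused for the sum)
The sum and carry expressions are written as,
$S_i = a_i \oplus b_i \oplus c_i = p_i \oplus c_i$
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 c_2 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 c_3 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$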
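As a minimal sketch, the expanded carry expressions can be evaluated directly in Python. The function names are hypothetical, the bits are assumed to be listed LSB first, and the propagate term is taken as pi = ai XOR bi (an assumption; the OR form yields the same carries).

```python
def cla_carries(a_bits, b_bits, c0=0):
    """Return [c1, c2, c3, c4] for two 4-bit operands given LSB first."""
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate terms
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate terms (XOR form assumed)
    # Every carry depends only on g, p and c0, so all four can be formed in parallel.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    return [c1, c2, c3, c4]


def cla_add(a_bits, b_bits, c0=0):
    """4-bit carry lookahead addition; returns (sum_bits LSB first, carry_out)."""
    carries = [c0] + cla_carries(a_bits, b_bits, c0)
    sum_bits = [a ^ b ^ c for a, b, c in zip(a_bits, b_bits, carries)]
    return sum_bits, carries[4]


print(cla_add([0, 1, 1, 0], [1, 1, 1, 0]))  # 6 + 7 -> ([1, 0, 1, 1], 0)
```

Unlike the ripple structure, no carry expression here waits on another carry, which is where the speed advantage comes from, at the cost of wider gates.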
Figure:Symbol and truth table of generate & propagate
******************************************************************************
The Manchester carry chain is a variation of the carry-lookahead adder that uses shared logic
to lower the transistor count.
A Manchester carry chain generates the intermediate carries by tapping off nodes in the gate
that calculates the most significant carry value.
Dynamic logic can support shared logic, as can transmission-gate logic.
One of the major drawbacks of the Manchester carry chain is that the propagation delay increases as the chain gets longer.
For this reason, a Manchester-carry-chain section generally does not exceed 4 bits.
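In this adder, the basic equation is $c_{i+1} = g_i + c_i \cdot p_i$
where $p_i = a_i \oplus b_i$ and $g_i = a_i \cdot b_i$
Carry kill bit $k_i = \overline{a_i + b_i} = \overline{a_i} \cdot \overline{b_i}$
If $k_i = 1$, then $p_i = 0$ and $g_i = 0$. Hence, $k_i$ is known as the carry kill bit.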
Table
******************************************************************************
4.4.1. HIGH SPEED ADDERS:
Design a carry bypass adder and discuss its features. (May 2016)
Explain the carry-propagate adder and show how the generation and propagation
signals are framed. [May 2021]
It is a high-speed adder. It consists of an adder, an AND gate and an OR gate.
When all propagate signals are high, an incoming carry Ci,0 = 1 propagates through the complete adder chain and produces an outgoing
carry Co,3 = 1.
In other words, if (P0P1P2P3 = 1) then Co,3 = Ci,0, else either a DELETE or a GENERATE
occurred.
This information can be used to speed up the operation of the adder, as shown in fig (b) below.
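The bypass condition can be sketched in Python as follows. The function name carry_bypass_block is hypothetical, bits are assumed LSB first, and the carry-out selection is written as a multiplexer, which is one common realization and an assumption here rather than the exact gate structure of the figure.

```python
def carry_bypass_block(a_bits, b_bits, c_in):
    """4-bit bypass stage, bits LSB first. Returns (sum_bits, c_out, bypassed)."""
    p = [a ^ b for a, b in zip(a_bits, b_bits)]  # propagate signals P0..P3
    g = [a & b for a, b in zip(a_bits, b_bits)]  # generate signals G0..G3
    carry = c_in
    sum_bits = []
    for p_i, g_i in zip(p, g):
        sum_bits.append(p_i ^ carry)
        carry = g_i | (p_i & carry)              # ripple path inside the stage
    bypassed = all(p)                            # BP = P0.P1.P2.P3
    c_out = c_in if bypassed else carry          # skip the ripple path when BP = 1
    return sum_bits, c_out, bypassed


# 0101 + 1010 with carry-in 1: every bit propagates, so the carry is bypassed
print(carry_bypass_block([1, 0, 1, 0], [0, 1, 0, 1], 1))  # ([0, 0, 0, 0], 1, True)
```

In software both paths give the same carry; in hardware the bypass path is the faster one, which is the whole point of the structure.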
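[Figure: 16-bit carry bypass adder built from four stages (bits 0–3, 4–7, 8–11 and 12–15) of M bits each, every stage with its own setup block; the delays tsetup and tbypass are marked]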
Design a carry select adder and discuss its features. (May 2016)
Two 4-bit ripple carry adders are multiplexed together, where the resulting carry and sum
bits are selected by the carry-in.
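A minimal Python sketch of this selection scheme is given below. The helper names ripple_add and carry_select_block are illustrative only, and the bits are assumed LSB first.

```python
def ripple_add(a_bits, b_bits, c_in):
    """Plain ripple carry addition of two bit lists (LSB first)."""
    carry = c_in
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        sum_bits.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return sum_bits, carry


def carry_select_block(a_bits, b_bits, c_in):
    """Compute both possible results up front, then let the carry-in choose."""
    result_if_0 = ripple_add(a_bits, b_bits, 0)  # adder assuming carry-in = 0
    result_if_1 = ripple_add(a_bits, b_bits, 1)  # adder assuming carry-in = 1
    return result_if_1 if c_in else result_if_0  # multiplexer on sum and carry


print(carry_select_block([1, 1, 0, 0], [1, 0, 1, 0], 1))  # 3 + 5 + 1 -> ([1, 0, 0, 1], 0)
```

Both adders work while the real carry-in is still being computed, so only the multiplexer delay is added once it arrives; the price is roughly double the adder hardware per block.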
4.6. MULTIPLIERS:
Explain the design and operation of 4 x 4 multiplier circuit. (Apr. 2016, 2017, Nov 2016, 2018)
Design a multiplier for 5 bit by 3 bit. Explain its operation and summarize the numbers of
adders. Discuss it over Wallace multiplier. (Nov 2017, April 2018)
Design a 4 bit unsigned array multiplier and analyze its hardware complexity. (April 2019-
13M) (Nov 2019)
Describe the hardware architecture of a 4-bit signed array multiplier. [Nov/Dec 2022]
A study of computer arithmetic processes will reveal that the most common requirements
are for addition and subtraction.
There is also a significant need for a multiplication capability.
Basic operations in multiplication are given below.
0x0=0, 0x1=0, 1x0=0, 1x1=1
          1 0 1 0 1 0     Multiplicand
    x         1 0 1 1     Multiplier
          1 0 1 0 1 0
        1 0 1 0 1 0
      0 0 0 0 0 0         Partial products
+   1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0     Result
If two different 4-bit numbers (x0, x1, x2, x3 and y0, y1, y2, y3) are multiplied then
Multiplication by shifting:
Let x = (0010)2 = (2)10.
To multiply it by 2, shift x left by one position: x = (0100)2 = (4)10.
To divide it by 2, shift x right by one position: x = (0001)2 = (1)10.
So, a shift register can be used for multiplication or division by 2.
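Shifting and the partial-product idea combine into the classic shift-and-add scheme. The sketch below is illustrative only (the function name shift_add_multiply is not from these notes) and handles unsigned operands.

```python
x = 0b0010
print(x << 1, x >> 1)  # 4 1 : one left shift doubles, one right shift halves


def shift_add_multiply(multiplicand, multiplier):
    """Unsigned multiplication by adding shifted copies of the multiplicand."""
    product = 0
    shift = 0
    while multiplier:
        if multiplier & 1:
            # A 1 in the multiplier contributes the multiplicand shifted into place.
            product += multiplicand << shift
        multiplier >>= 1
        shift += 1
    return product


# Same operands as the partial-product example: 101010 x 1011
print(bin(shift_add_multiply(0b101010, 0b1011)))  # 0b111001110
```

Each loop iteration corresponds to one row of partial products in the paper-and-pencil layout.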
Figure: 4 x 4 array multiplier using full adders (FA), half adders (HA) and AND gates. Each adder row combines X3 X2 X1 X0 with one multiplier bit (Y1, Y2, Y3), and the product bits are Z7 to Z0.
(iv) Booth (encoding) multiplier:
Booth's algorithm multiplies two binary numbers in two's complement notation and lends
itself to an efficient hardware implementation.
Booth multiplication allows for smaller, faster multiplication
circuits by recoding the numbers that are multiplied.
Booth multipliers are widely used in ASIC-oriented products due to their higher
computing speed and smaller area.
In the binary number system, the digits, called bits, are drawn from the set {0,1}.
The result of multiplying any binary number by a single bit is either 0 or the original number.
This makes the formation of partial products simple and efficient.
Adding all these partial products, however, is the time-consuming task in any binary multiplier.
The entire process consists of three steps: partial product generation, partial product
reduction and addition of the partial products, as shown in the figure.
In Booth multiplication, partial product generation is based on a recoding scheme,
e.g. radix-2 encoding.
The bits of the multiplier are grouped and, for each group, the corresponding operation on the
multiplicand is performed in order to generate the partial product.
In radix-2 Booth multiplication, partial product generation is based on the encoding
given in the Table.
RADIX-2 PROCEDURE:
1) Append a 0 to the right of the LSB of the multiplier and form pairs of 2 bits from right to left,
as shown in the figure.
With suitable example and with detailed steps explain Radix-4 modified booth encoding for
an 8-bit signed multiplier. (Nov 2019)
Each group of binary digits is recoded according to the Modified Booth Encoding Table into
one of the values from the set {-2, -1, 0, 1, 2}.
(b) If there are two wires of the same weight left, input them into a half adder.
(c) If there is just one wire left, connect it to the next layer.
The Wallace tree multiplier has a tree-style structure. It reduces the number of
components and reduces the area.
The architecture of a 4 x4 Wallace tree multiplier is shown in figure.
Apply radix-2 booth encoding to realize a 4-bit signed multiplier for (-10)*(-11).
(April 2019-15M) [Apr/May 2022][Nov/Dec 2022]
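Solution:
The operands -10 and -11 lie outside the 4-bit two's complement range (-8 to +7), so 5-bit operands are used here.
M = -10 = 10110, -M = +10 = 01010, Q = -11 = 10101
            A      Q      Q-1
Initial:    00000  10101  0
Step-I:     01010  10101  0   : last 2 bits are 10; A = A - M
            00101  01010  1   : shift right
Step-II:    11011  01010  1   : last 2 bits are 01; A = A + M
            11101  10101  0   : shift right
Step-III:   00111  10101  0   : last 2 bits are 10; A = A - M
            00011  11010  1   : shift right
Step-IV:    11001  11010  1   : last 2 bits are 01; A = A + M
            11100  11101  0   : shift right
Step-V:     00110  11101  0   : last 2 bits are 10; A = A - M
            00011  01110  1   : shift right
Result: A Q = 00011 01110 = (110)10 = (-10) x (-11)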
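A minimal Python sketch of the radix-2 Booth procedure follows. The register names A, Q, Q-1 and M mirror the worked example; the function name booth_multiply and the word-length parameter n are assumptions made for illustration. Because -10 and -11 do not fit in a 4-bit two's complement word, the usage line assumes n = 5.

```python
def booth_multiply(m, q, n):
    """Radix-2 Booth multiplication of two n-bit two's complement integers.

    Assumes m > -2**(n-1), so that -m is also representable in n bits.
    """
    mask = (1 << n) - 1
    M = m & mask          # multiplicand as an n-bit pattern
    A = 0                 # accumulator
    Q = q & mask          # multiplier
    q_1 = 0               # the extra bit Q-1
    for _ in range(n):
        pair = (Q & 1, q_1)
        if pair == (1, 0):
            A = (A - M) & mask      # last 2 bits are 10: A = A - M
        elif pair == (0, 1):
            A = (A + M) & mask      # last 2 bits are 01: A = A + M
        # Arithmetic shift right of the combined register A : Q : Q-1
        q_1 = Q & 1
        Q = ((Q >> 1) | ((A & 1) << (n - 1))) & mask
        sign = A >> (n - 1)
        A = ((A >> 1) | (sign << (n - 1))) & mask
    result = (A << n) | Q           # 2n-bit product pattern
    if result >> (2 * n - 1):       # interpret as a signed value
        result -= 1 << (2 * n)
    return result


print(booth_multiply(-10, -11, 5))  # 110
```

The number of add/subtract operations depends on the bit pattern of the multiplier: runs of equal bits cost only shifts.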
****************************************************************************
4.7. DIVIDERS
There are two types of dividers, Serial divider and Parallel divider. Serial divider is slow
and parallel divider is fast in performance.
Generally, division is done by repeated subtraction. If 10/3 is to be performed, then
10 - 3 = 7 (the divisor is 3, the dividend is 10)
7 - 3 = 4
4 - 3 = 1
After 3 subtractions the remainder is 1, which is less than the divisor, so the
subtraction stops. The quotient is 3, the number of subtractions performed.
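The repeated-subtraction idea can be sketched in a few lines of Python. The function name divide_by_subtraction is illustrative, and unsigned operands with a non-zero divisor are assumed.

```python
def divide_by_subtraction(dividend, divisor):
    """Unsigned division by repeated subtraction; returns (quotient, remainder)."""
    if divisor <= 0:
        raise ValueError("divisor must be a positive integer")
    quotient = 0
    remainder = dividend
    while remainder >= divisor:   # stop once the remainder is less than the divisor
        remainder -= divisor
        quotient += 1             # count the subtractions, as a hardware counter would
    return quotient, remainder


print(divide_by_subtraction(10, 3))  # (3, 1)
```

The number of iterations equals the quotient, which is why this serial approach is slow for large dividends.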
Let us see an example of binary division using the 1's complement method:
1010 (10d) / 0011 (3d)
Step1: Find the 1's complement of the divisor.
Step2: Add it to the dividend.
Step3: If the carry is 1, add it to the result (end-around carry) to get the difference.
Step4: Repeat the same procedure until the carry is 0.
Step5: Then the process is stopped.
1 0 1 0 (10)
Y0 Y1 Y2 Y3 are complemented and given to 4 bit adder block (figure shown below)
X0 X1 X2 X3 are given to MUXs and MUX output is given to D flipflop. Select signal of
MUX is high. It is connected to clear input of counter.
Carry output of adder is connected with clock enable pin of counter. The same is given to
OR gate. The output of this OR gate is given to clock enable signal of flipflops.
The other input of OR gate is tied with select signal of MUX.
If X > Y, C0 of adder is high.
After first subtraction, the counter output is incremented by 1.
For each subtraction, the counter output is incremented.
If C0 of adder is low, then clock of counter and FF is disabled. Counting is stopped.
Q3 Q2 Q1 Q0 is the counter output (Quotient)
R3 R2 R1 R0 is the flipflop output (remainder)
******************************************************************************
4.8. SHIFT REGISTERS:
Design 4 input and 4 output barrel shifter using NMOS logic. (NOV 2018, Nov 2019).
List the several commonly used shifters. Design the shifter that can perform all the
commonly used shifters. [May 2021, NOV 2021]
Elaborate in detail the design of a 4-bit barrel shifter. [Nov/Dec 2022]
An n-bit rotation is specified by using the control word R0-n and L/R bit defines a left or right
shifting.
For example, y3 y2 y1 y0 = a3 a2 a1 a0
If it is rotated 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3
If it is rotated 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1
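The rotation example can be reproduced with a small Python sketch. The function name rotate and its left flag (standing in for the L/R bit) are hypothetical; the word is assumed to be listed MSB first.

```python
def rotate(word, n, left=True):
    """Rotate a list of bits (MSB first) by n positions to the left or right."""
    n %= len(word)
    if not left:
        n = (len(word) - n) % len(word)  # a right rotation is a complementary left rotation
    return word[n:] + word[:n]


word = ["a3", "a2", "a1", "a0"]
print(rotate(word, 1, left=True))   # ['a2', 'a1', 'a0', 'a3']
print(rotate(word, 1, left=False))  # ['a0', 'a3', 'a2', 'a1']
```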
Barrel Shifter:
A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in
one clock cycle.
It can be implemented as a sequence of multiplexers(MUX), and in such an implementation
the output of one MUX is connected to the input of the next MUX in a way that depends on
the shift distance.
For example, take a four-bit barrel shifter, with inputs A, B, C and D. The shifter can cycle the
order of the bits ABCD as DABC, CDAB, or BCDA; in this case, no bits are lost.
That is, it can shift all of the outputs up to three positions to the right (thus make any cyclic
combination of A, B, C and D).
The barrel shifter has a variety of applications, including being a useful component in
microprocessors (alongside the ALU).
A floating-point add or subtract operation requires shifting the smaller number to the
right, increasing its exponent, until it matches the exponent of the larger number.
This is done by using the barrel shifter to shift the smaller number to the right by the
difference, in one cycle.
If a simple shifter were used, shifting by n bit positions would require n clock cycles.
The disadvantages of the FET array barrel shifter are the threshold voltage drop and the
switching time, which is limited by parasitics.
The circuit shown in the figure is an 8 x 4-bit barrel shifter.
Logarithmic Shifter:
A shifter with a maximum shift width of M consists of log2(M) stages, where the ith stage
either shifts over 2^i bits or passes the data unchanged.
A shifter with a maximum shift value of seven bits is shown in the figure; to shift over five bits, the first stage is
set to shift mode, the second to pass mode and the last again to shift.
The speed of the logarithmic shifter depends on the shift width in a logarithmic way, since an M-bit
shifter requires log2(M) stages.
The series connection of pass transistors slows the shifter down for larger shift values.
The advantage of the logarithmic shifter is that it is more effective for larger shift values in terms of both area
and speed.
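The stage-by-stage behaviour can be sketched in Python as follows. The function name log_shift_right and the default width of 8 bits are assumptions for illustration; each loop iteration stands for one shift-or-pass stage.

```python
def log_shift_right(value, shift, width=8):
    """Right shift built from log2(width) stages; stage i shifts by 2**i or passes."""
    stages = (width - 1).bit_length()  # 3 stages for width 8 (maximum shift of 7)
    value &= (1 << width) - 1
    for i in range(stages):
        if (shift >> i) & 1:           # bit i of the shift amount controls stage i
            value >>= 1 << i           # shift mode: move by 2**i positions
        # otherwise the stage is in pass mode and the data is unchanged
    return value


# Shift by five = 101 in binary: stage 0 shifts, stage 1 passes, stage 2 shifts
print(bin(log_shift_right(0b10110000, 5)))  # 0b101
```

The binary encoding of the shift amount drives the stages directly, so no separate decoder is needed.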
******************************************************************************
4.9. SPEED AND AREA TRADE OFF:
Discuss the details about speed and area trade off. (May 2017)
Discuss trade-off between speed Vs area. [Nov/Dec 2022]
Adder:
The tradeoff in terms of power and performance is shown below.
The performance is represented in terms of the delay(speed).
The area estimates for each delay are based on the fact that area is
related to power consumption.
For a particular delay, the area of a carry lookahead adder is larger than the area of a
ripple carry adder.
This is because the computations performed in a carry lookahead adder are parallel,
which requires a larger number of gates and also results in a larger area.
CLA: Carry lookahead adder; RC, R: Ripple carry adder
Figure: Area vs delay for 8-bit adder
Figure: Area vs delay for 16-bit adder
Figure: Area vs delay for 32-bit adder
Figure: Area vs delay for 64-bit adder
Figure: Delay vs area for all adders
Figure: Area vs delay for all multipliers
The figures above show that the delay of the ripple carry adder increases much faster than
that of the carry lookahead adder as the number of bits is increased.
In the carry lookahead adder, the cost is paid in area because the computations are performed in
parallel, and therefore more power is consumed for a specific delay.
If the words are simply stacked in a single column of approximately square storage cells, the design is extremely slow,
because the vertical wires that connect the storage cells to the I/O become excessively long.
So, memory arrays are organized in such a way that the vertical and horizontal dimensions
are about the same.
Multiple words are stored in a single row and are selected simultaneously.
The column decoder is used to route the correct word to the I/O terminals.
The row address is used to select one row of the memory and the column address is used to
select a particular word from the selected row.
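The row/column selection can be illustrated with a small Python sketch. The names split_address and read_word are hypothetical, and the sketch assumes that the column address occupies the low-order address bits, which these notes do not specify.

```python
def split_address(address, column_bits):
    """Split a word address into (row, column); column bits assumed low-order."""
    row = address >> column_bits
    column = address & ((1 << column_bits) - 1)
    return row, column


def read_word(array, address, column_bits):
    """Model of a read: the word line selects a row, the column decoder one word."""
    row, column = split_address(address, column_bits)
    selected_row = array[row]     # all words of this row are selected simultaneously
    return selected_row[column]   # the column decoder routes one word to the I/O


# 4 rows x 4 words per row (2 column address bits); contents are made-up labels
memory = [[f"w{r}{c}" for c in range(4)] for r in range(4)]
print(split_address(0b1101, 2))       # (3, 1)
print(read_word(memory, 0b1101, 2))   # w31
```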
Word line: The horizontal select line which is used to select the single row of cell is
known as word line.
Bit line: The wire which connects the cell in a single column to the input/output circuit is
known as bit line.
Sense amplifier: It amplifies the internal swing to full rail-to-rail
amplitude.
Block address: the memory is divided into various small blocks.
The address which is used to select one of the small blocks to be read or written is known
as block address.
Advantages:
1. Access time is fast
2. Power saving is good, because blocks not activated are in power saving mode.
Programming ROM
The transistor in the intersection of row and column is OFF when the associated word line is
LOW. In this condition, we get logic 1 output.
In a floating-gate transistor, an extra polysilicon strip, known as the floating gate, is inserted
between the gate and the channel.
The floating gate doubles the gate oxide thickness; hence the device transconductance is reduced
and the threshold voltage is increased.
The threshold voltage is programmable.
If a high voltage (>10 V) is applied between the source and the gate-drain terminals,
a high electric field is generated and avalanche injection occurs.
After acquiring sufficient energy, electrons become hot and traverse the first oxide insulator.
They get trapped on the floating gate.
The floating-gate transistor is known as floating-gate avalanche-injection MOS or FAMOS.
Disadvantage: A high programming voltage is needed.
EEPROM – E2PROM:
Electrically Erasable Programmable ROM. Here the floating-gate tunneling oxide (FLOTOX) transistor is
used.
It is similar to the floating-gate device, except that a portion of the floating gate is separated from the
channel by an oxide of about 10 nm thickness or less.
If 10 V is applied, electrons travel to and from the floating gate through Fowler-Nordheim
tunneling.
Erasing is done by reversing the voltage that was applied for writing.
Figure: (a) Erase (b) Write (c) Read operation of NOR flash memory
4.10.3.3 RAM – Random Access Memory
Explain about static and dynamic RAM.
Construct 6T based SRAM cell. Explain its read and write operations. (NOV 2018)
[Nov/Dec 2022]
4.10.3.3.1 Static RAM:
The SRAM cell needs 6 transistors per bit.
The M5 and M6 transistors are shared between read and write operations.
The bit line (BL) and inverse bit line signals are used to improve the noise margin during read
and write operations.
Read operation:
Let us assume that logic 1 is stored at Q and that BL and inverse BL are precharged to 2.5 V before
the read operation starts.
The read cycle is started by asserting the word line, which enables the M5 and M6 transistors.
After a small initial word-line delay, the values stored at Q and inverse Q are transferred
to the bit lines: BL is left at 2.5 V and inverse BL is discharged through M1 and M5.
In the compare mode, the stored data are compared with the data applied on the bit lines. The match line is
connected to all CAM blocks in a row and is initially precharged to VDD.
If any stored bit differs from the applied data, the match line is discharged: even a single mismatched bit
in a row pulls the match line low, and it stays high only when every bit in the row matches.
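The match-line behaviour can be sketched in Python as follows. The function name match_lines and the example contents are hypothetical; 1 stands for a match line that is still precharged and 0 for a discharged one.

```python
def match_lines(stored_rows, search_word):
    """Return one match-line value per row: 1 if every bit matches, else 0."""
    lines = []
    for row in stored_rows:
        line = 1                      # match line precharged to VDD
        for stored_bit, search_bit in zip(row, search_word):
            if stored_bit != search_bit:
                line = 0              # any single mismatch discharges the line
        lines.append(line)
    return lines


cam = [
    [1, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 0, 1, 0],
]
print(match_lines(cam, [1, 0, 1, 1]))  # [1, 0, 0]
```

In hardware all rows are compared at the same time; the loop over rows here only models that parallel comparison sequentially.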
*****************************************************************************
Column Decoder
It should match the bit line pitch of the memory array.
In column decoder, decoder outputs are connected to nMOS pass transistors.
By using this circuit, we can selectively drive one out of m pass transistors.
Only one nMOS pass transistor is ON at a time.
Amplification:
In memory structures such as the 1T DRAM, amplification is required for proper
functionality.
Delay Reduction:
The amplifier compensates for the limited fan-out driving capability of the memory cell by
detecting and amplifying small transitions on the bit line to large signal output
swings.
Power reduction:
Reducing the signal swing on the bit lines can eliminate a large part of the power
dissipation related to charging and discharging the bit lines.
(iii) Drivers/ Buffers
The length of the word and bit lines increases with increasing memory size.
A large portion of the read and write access time can be attributed to the
wire delays.
A major part of the memory-periphery area is allocated to the drivers (address
buffers and I/O drivers).
******************************************************************************
4.12: Low Power Memory design:
Discuss about Low power memory design. [Apr/May 2022]
Elucidate in detail low power SRAM circuit. (April 2019-13M) (Nov 2019)
Figure: (a) Insertion of low threshold device (b) Reducing supply Voltage
******************************************************************************
Programmable devices (Programmable ASIC):
Programmable devices can be divided into three areas
1. Programmable logic structure
2. Programmable interconnect
3. Reprogrammable Gate array
A programmable logic device (PLD) is an electronic component used to build
reconfigurable digital circuits.
Unlike a logic gate, which has a fixed function, a PLD has an undefined function at the
time of manufacture.
1. Programmable Logic Structure:
Describe in detail the chip with programmable logic structures. (Nov 2009)
(a) Programmable Logic Array:
A programmable logic array (PLA) is a type of fixed-architecture logic device with a
programmable AND array followed by a programmable OR array.
The logic array is the structural unit that can be programmed to perform various functions.
A PLA can be implemented as an AND-OR plane device.
Structure of AND-OR PLA is shown below.
Figure: PLA with three inputs, four product terms and two outputs
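A PLA of this size can be modelled as two programmable tables. The Python sketch below is illustrative only: the function name pla_eval, the table format and the particular product terms are made up for the example and are not taken from the figure.

```python
def pla_eval(inputs, and_plane, or_plane):
    """Evaluate an AND-OR PLA.

    and_plane: one dict per product term, mapping input index -> required value
               (1 = true literal, 0 = complemented literal; absent = not connected).
    or_plane:  one list per output, naming the product terms that are ORed.
    """
    products = [
        int(all(inputs[i] == want for i, want in term.items()))
        for term in and_plane
    ]
    return [int(any(products[t] for t in terms)) for terms in or_plane]


# Hypothetical programming: three inputs (A, B, C), four product terms, two outputs
AND_PLANE = [
    {0: 1, 1: 1},        # P0 = A.B
    {0: 0, 2: 1},        # P1 = A'.C
    {1: 1, 2: 0},        # P2 = B.C'
    {0: 1, 1: 1, 2: 1},  # P3 = A.B.C
]
OR_PLANE = [
    [0, 1],              # F0 = P0 + P1
    [1, 2, 3],           # F1 = P1 + P2 + P3
]
print(pla_eval([1, 1, 0], AND_PLANE, OR_PLANE))  # [1, 1]
```

Note that product term P1 is shared by both outputs, which is the feature that distinguishes a PLA from a device with a fixed OR array.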
(b) PAL (Programmable Array Logic) Architecture:
The PAL is a programmable logic device with a fixed OR array and a programmable
AND array.
Discuss the different types of programming technology used in FPGA design. (NOV 2016)
Draw and explain the operation of metal-metal antifuse and EPROM transistor. (June 2012)
ANTIFUSE:
In an FPGA, the device is programmed by changing the characteristics of a switching element,
that is, the routing itself is programmed.
Programmable routing can be explained using the products of companies such as Actel and
QuickLogic.
In Actel devices, interconnection is done by the PLICE, or antifuse.
PLICE means Programmable Low Impedance Circuit Element.
An antifuse normally has a high resistance (>100 MΩ), which is changed into a low resistance (200-500 Ω) by
applying a programming voltage.
It consists of ONO (Oxide-Nitride-Oxide) layer which is sandwiched between
polysilicon layer and n+ diffusion.
Antifuses separate interconnect wires on the FPGA chip and the programmer blows an
antifuse to make a permanent connection.
Once an antifuse is programmed, the process can’t be reversed. This is an OTP
Technology.
Figure: Metal-metal anti-fuse
Advantages of metal-metal antifuse:
Advantages of metal-metal antifuse over poly diffusion antifuse are:
1. The connections are direct to metal wiring layers.
2. It is easier to use larger programming currents to reduce the antifuse resistance.
UV-Erasable programming:
Find the reason for referring EPROM technology as floating gate avalanche MOS. (Dec. 2013)
EPROM programming:
In this type, a floating-gate transistor is used.
It can be reprogrammed after erasing with UV light.
A high electric field causes electrons flowing towards the drain to move across the insulating
gate oxide, where they are trapped on the bottom, floating gate.
These energetic electrons are hot, and this effect is known as hot-electron injection or
avalanche injection.
EPROM technology is sometimes called floating-gate avalanche MOS (FAMOS).
EEPROM programming:
Electrically erasable programming is a widely used CMOS technology.
A very thin oxide between the floating gate and the drain allows electrons to tunnel to or
from the floating gate (the gate is charged or discharged),
thus enabling the writing and erasing operations.
Advantages:
The advantages of EEPROM technology are:
faster than using a UV lamp
chips do not have to be removed from the system
if the system contains circuits to generate both program and erase voltages, it may
use ISP
SRAM Programming
SRAM programming is shown in figure.
The SRAM configuration cell is constructed from two cross-coupled inverters and uses a
standard CMOS process.
The configuration cell drives the gates of other transistors on the chip, turning pass
transistors or transmission gates on to make a connection or off to break a connection.
The cell is programmed using the WRITE and DATA lines.
The basic cell structure of an FPGA is more complicated than the basic cell structure of a standard
gate array.
The programmable logic blocks of FPGA are called Configurable Logic Block (CLB).
The FPGA architecture consists of three types of configurable elements-
(i) IOBs –Input/output blocks
(ii) CLBs- Configurable logic blocks
(iii) Resources for interconnection
The IOBs provide a programmable interface between the internal array of logic blocks
(CLBs) and the device's external package pins.
CLBs perform user-specified logic functions.
The interconnect resources carry signals among the blocks.
A configuration program is stored in internal static memory cells.
The configuration program determines the logic functions and the interconnections.
The configuration data is loaded into the device during power-up or reprogramming.
FPGA devices are customized by loading configuration data into internal memory cells.
1.Logic blocks
Based on memories (Flip-flop & LUT – Lookup Table) Xilinx
Based on multiplexers (Multiplexers)-Actel
Based on PAL/PLA - Altera
Transistor Pairs
2. Interconnection Resources
Symmetrical FPGA-s
Row-based FPGA-s
Sea-of-gates type of FPGA-s
Hierarchical FPGA-s (CPLD)
3. Input-output cells (I/O Cell)
Possibilities for programming :
a. Input
b. Output
c. Bidirectional
RE-PROGRAMMABLE DEVICE ARCHITECTURE:
Interconnection resources:
When the output enable, OE is ‘1’ the output section is enabled and drives the I/O pad.
When OE is ‘0’ the output buffer is placed in a high-impedance state.
Give short notes on FPGA interconnect routing procedures. (May 2016, May 2021)
Describe FPGA interconnect routing resources with neat diagram. (April 2019-13M)
Give a note on standard cell design and FPGA interconnecting resources. (Nov 2019)
[Apr/May 2022]
Connections between the logic blocks within a group can be made using wire segments at
the lowest level of the routing hierarchy.
Connections between the logic blocks in distant groups require the traversal of one or
more levels of routing segments.
As shown in Figure, only one level of routing directly connects to the logic blocks.
Programmable connections are represented with the crosses and circles.
(e) Actel Routing Architecture:
Actel's design has more wire segments in horizontal direction than in vertical direction.
The input pins connect to all tracks of the channel that is on the same side as the pin.
The output pins extend across two channels above the logic block and two channels
below it.
Output pin can be connected to all 4 channels that it crosses.
The switch blocks are distributed throughout the horizontal channels.
All vertical tracks can make a connection with every incident horizontal track.
This gives the flexibility that a horizontal track can switch into a vertical track, thus
allowing horizontal and vertical routing of the same wire.
The drawback is that more switches are required, which adds more capacitive load.
INTERCONNECT:
➢ An electronic circuit designer has multiple choices in realizing the interconnections between
the various devices that make up the circuit.
➢ State-of-the-art processes offer multiple layers of aluminium or copper, and at least
one layer of polysilicon. Even the heavily doped n+ and p+ diffusion layers, typically used for
the realization of source and drain regions, can be employed for wiring purposes. These wires
appear in the schematic diagrams of electronic circuits as simple lines with no apparent impact
on the circuit performance.
The wiring of integrated circuits forms a complex geometry that introduces the following
parasitics:
1. Capacitive Parasitics
2. Resistive Parasitics and
3. Inductive Parasitics
The capacitive, resistive and inductive parasitics have multiple effects on the circuit's behaviour.
It is important that the designer has a clear insight in the parasitic wiring effects, their relative
importance, and their models. This is best illustrated with the simple example as shown above.
Each wire in a bus network connects a transmitter (or transmitters) to a set of receivers and is
implemented as a link of wire segments of various lengths and geometries. Assume that all
segments are implemented on a single interconnect layer, isolated from the silicon substrate
and from each other by a layer of dielectric material. Be aware that the reality may be far more
complex.
Analyzing the behavior of this schematic, which only models a small part of the circuit, is slow
and cumbersome. Fortunately, substantial simplifications can often be made, some of which are
enumerated below,
➢ Inductive effects can be ignored if the resistance of the wire is substantial- this is for instance
the case for long Aluminum wires with a small cross-section- or if the rise and fall times of the
applied signals are slow.
➢ When the wires are short, the cross-section of the wire is large, or the interconnect material
used has a low resistivity, a capacitance- only model can be used (figure FOR WIRE
PARASITICS WITH CAPACITANCE ONLY shown below)
➢ When the separation between neighbouring wires is large, or when the wires only run together for a
short distance, inter-wire capacitance can be ignored, and all the parasitic capacitance can be
modeled as capacitance to ground. Obviously, the latter problems are the easiest to model,
analyze, and optimize.
Figure: Wire models for parasitics. Panels: wire parasitics (with the exception of inter-wire resistance and mutual inductance); wire parasitics with capacitance only.
The interconnect parameters whose values can be estimated, together with simple models to evaluate
their impact and a set of rules of thumb to decide when and where a particular model or
effect should be considered, are:
1. Capacitance Parameter
2. Resistance Parameter
3. Inductance Parameter
The capacitance of such a wire is a function of its shape, its environment, its distance to the
substrate, and the distance to surrounding wires. An accurate modeling of the wire
capacitance(s) in a state-of-the-art integrated circuit is a non-trivial task and is even today the
subject of advanced research.
In capacitance parameter there are two types of capacitance occurring i.e.
1. Parallel plate Capacitance and
2. Fringe Capacitance
Parallel plate Capacitance:
Consider first a simple rectangular wire placed above the semiconductor substrate, as shown in
figure below. If the width of the wire is substantially larger than the thickness of the insulating
material, it may be assumed that the electrical-field lines are orthogonal to the capacitor plates,
and that its capacitance can be modeled by the parallel-plate capacitance model. Under those
circumstances, the total capacitance of the wire can be approximated as
Cint = (εdi / tdi) W L
where W and L are respectively the width and length of the wire, and tdi and εdi represent the
thickness of the dielectric layer and its permittivity. SiO2 is the dielectric material of choice in
integrated circuits, although some materials with lower permittivity, and hence lower
capacitance, are coming into use.
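The parallel-plate approximation is easy to try numerically. The following is a minimal sketch, assuming SiO2 with a relative permittivity of 3.9; the function name and the sample dimensions are illustrative and not taken from any process data.

```python
EPS0 = 8.854e-12  # permittivity of free space in F/m

def parallel_plate_capacitance(width_m, length_m, t_di_m, eps_r=3.9):
    """Parallel-plate estimate C = eps_di * W * L / t_di, in farads.

    eps_r = 3.9 is the usual value quoted for SiO2 (assumed here).
    Fringing fields are ignored, so this is only reasonable when W >> t_di.
    """
    eps_di = eps_r * EPS0
    return eps_di * width_m * length_m / t_di_m

# Hypothetical wire: 1 um wide, 1 mm long, 1 um of oxide below it.
c = parallel_plate_capacitance(1e-6, 1e-3, 1e-6)
print(f"{c * 1e15:.1f} fF")  # prints 34.5 fF
```

Doubling the oxide thickness halves the estimate, which is the reason lower-permittivity or thicker dielectrics reduce wire capacitance.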
Fringe Capacitance:
The fringing-field capacitance model decomposes the capacitance into two contributions: a
parallel-plate capacitance, and a fringing capacitance modeled by a cylindrical wire with a
diameter equal to the thickness H of the wire.
Therefore, the parallel-plate capacitance and the fringing capacitance together constitute the
overall capacitance, cwire = cpp + cfringe,
with w = W - H/2 a good approximation for the width of the parallel-plate capacitor.
The assumption that a wire is completely isolated from its surrounding structures and is only
capacitively coupled to ground becomes untenable in multi-layer interconnect structures. This is
illustrated in the figure, where the capacitance components of a wire embedded in an
interconnect hierarchy are identified. Each wire is coupled not only to the grounded substrate,
but also to the neighbouring wires on the same layer and on adjacent layers. The main
difference is that not all of its capacitive components terminate at the grounded substrate; a
large number of them connect to other wires, which have dynamically varying voltage levels.
These floating capacitors cause cross talk and, as discussed later, can increase the delay.
Inter-wire capacitances become a dominant factor in multi-layer interconnect structures. This
effect is more important for wires in the higher interconnect layers, as these wires are farther
away from the substrate. The increasing contribution of the inter-wire capacitance to the total
capacitance with decreasing feature sizes is illustrated by the figure, which plots the capacitive
components of a set of parallel wires routed above a ground plane. It is assumed that the
dielectric and wire thickness are held constant while all other dimensions are scaled. When W
becomes smaller than 1.75 H, the inter-wire capacitance starts to dominate.
Wiring Capacitances for 0.25 µm CMOS Technology:
The table rows represent the top plate of the capacitor, the columns the bottom plate. The area
capacitances are expressed in aF/µm², while the fringe capacitances (given in the shaded rows)
are in aF/µm.
Inter-Wire Capacitance per unit wire length for different interconnect layers of 0.25
µm CMOS Technology Process:
The capacitances are expressed in aF/µm, and are for minimally-spaced wires.
The resistance of a wire is proportional to its length L and inversely proportional to its
cross-section A. The resistance of a rectangular conductor, as shown in the figure below, can be
expressed as
R = ρL / A = ρL / (HW)
Where:
ρ = resistivity
A = HW = area of cross section of the rectangular wire
Since H is a constant for a given technology, the expression can be rewritten as R = R□ (L / W),
with
R□ = ρ / H
the sheet resistance of the material, having units of Ω/sq. If L = W, i.e. for a square of
resistive material, R = R□. This expresses that the resistance of a square conductor is
independent of its absolute size.
To obtain the resistance of a wire, simply multiply the sheet resistance by its ratio (L/W).
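The "count the squares" rule can be sketched in a few lines. This is only an illustration: the function names and the numbers (a 0.075 Ω/sq layer, a 100 µm by 0.5 µm wire) are hypothetical, not values from the tables in these notes.

```python
def sheet_resistance(resistivity_ohm_m, thickness_m):
    """Sheet resistance R_sq = rho / H, in ohm per square."""
    return resistivity_ohm_m / thickness_m

def wire_resistance(r_sheet, length, width):
    """Wire resistance R = R_sq * (L / W).

    length and width may be in any unit as long as both use the same one,
    because only their ratio (the number of squares) matters.
    """
    return r_sheet * (length / width)

# Hypothetical example: 100 um long, 0.5 um wide wire on a 0.075 ohm/sq layer.
squares = 100 / 0.5                       # 200 squares
print(wire_resistance(0.075, 100, 0.5))   # about 15 ohm
```

Scaling both L and W by the same factor leaves the result unchanged, which is exactly the statement that a square has the same resistance regardless of its size.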
Aluminum is the interconnect material most often used in integrated circuits because of its low
cost and its compatibility with the standard integrated- circuit fabrication process.
Unfortunately, it has a large resistivity compared to materials such as Copper. With
ever-increasing performance targets, this is rapidly becoming a liability, and top-of-the-line
processes are now increasingly using Copper as the conductor of choice.
Typical values of the Sheet Resistance of various Interconnect Materials using 0.25 µm
CMOS Technology:
From the table, we conclude that Aluminum is the preferred material for the wiring of long
interconnections. Polysilicon should only be used for local interconnect. Although the sheet
resistance of the diffusion layer (n+, p+) is comparable to that of polysilicon, the use of
diffusion wires should be avoided due to its large capacitance and the associated RC delay.
The inductance L of a section of a circuit expresses that a changing current passing through an
inductor generates a voltage drop ΔV = L di/dt.
The consequences of on-chip inductance include ringing and overshoot effects, reflections of
signals due to impedance mismatch, inductive coupling between lines, and switching noise due
to L di/dt voltage drops.
It is possible to compute the inductance of a wire directly from its geometry and its environment.
A simpler approach relies on the fact that the capacitance c and the inductance l (per unit
length) of a wire are related by the following expression,
cl = εµ
with ε and µ respectively the permittivity and permeability of the surrounding dielectric.
Other interesting relations, obtained from Maxwell's laws, can be pointed out. The constant
product of permeability and permittivity also defines the speed at which an electromagnetic
wave can propagate through the medium,
v = 1/√(lc) = 1/√(εµ) = c0/√(εr µr)
with c0 the speed of light in vacuum.
Dielectric constants and wave-propagation speeds for various materials used in electronic
circuits (the relative permeability µr of most dielectrics is approximately equal to 1):
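The link between dielectric constant and propagation speed can be checked numerically. The sketch below assumes v = c0/√(εr µr) with µr = 1 by default; the function names are illustrative, and εr = 3.9 is the value commonly quoted for SiO2.

```python
import math

C0 = 299_792_458.0  # speed of light in vacuum, m/s

def propagation_speed(eps_r, mu_r=1.0):
    """Wave speed in a uniform dielectric: v = c0 / sqrt(eps_r * mu_r)."""
    return C0 / math.sqrt(eps_r * mu_r)

def time_of_flight(length_m, eps_r, mu_r=1.0):
    """Time for a wave to travel length_m through the dielectric."""
    return length_m / propagation_speed(eps_r, mu_r)

v_sio2 = propagation_speed(3.9)
print(f"{v_sio2 * 1e-7:.1f} cm/ns")                 # roughly 15 cm/ns
print(f"{time_of_flight(0.01, 3.9) * 1e12:.0f} ps")  # 1 cm of wire
```

A signal therefore needs on the order of tens of picoseconds per centimetre in oxide, which sets a lower bound on the delay of any wire regardless of its resistance.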
INTERCONNECT MODELING:
Lumped Model:
The circuit parasitics of a wire are distributed along its length and are not lumped into a
single position. Yet, when only a single parasitic component is dominant, when the interaction
between the components is small, or when looking at only one aspect of the circuit behaviour, it
is often useful to lump the different fractions into a single circuit element. The advantage of this
approach is that the effects of the parasitic then can be described by an ordinary differential
equation.
As long as the resistive component of the wire is small and the switching frequencies are in the
low to medium range, it is meaningful to consider only the capacitive component of the wire,
and to lump the distributed capacitance into a single capacitor as shown in figure. It is observed
that in this model the wire still represents an equipotential region, and that the wire itself does
not introduce any delay. The only impact on performance is introduced by the loading effect of
the capacitor on the driving gate. This capacitive lumped model is simple, yet effective, and is
the model of choice for the analysis of most interconnect wires in digital integrated circuits.
DISTRIBUTED VERSUS LUMPED CAPACITANCE MODEL OF WIRE. CLUMPED = L×CWIRE, WITH L THE
LENGTH OF THE WIRE AND CWIRE THE CAPACITANCE PER UNIT LENGTH. THE DRIVER IS
MODELED AS A VOLTAGE SOURCE AND A SOURCE RESISTANCE RDRIVER
The operation of this simple RC network is described by the following ordinary differential
equation,
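For reference, the differential equation of the lumped capacitor driven through the source resistance can be written out as below. This is the standard first-order RC result, using the symbols of the figure caption (R_driver, C_lumped); the step amplitude V is an assumed symbol.

```latex
% Node equation of the driver resistance charging the lumped wire capacitance
C_{lumped}\,\frac{dV_{out}}{dt} + \frac{V_{out} - V_{in}}{R_{driver}} = 0

% Response to a step of amplitude V applied at t = 0
V_{out}(t) = \left(1 - e^{-t/\tau}\right) V, \qquad \tau = R_{driver}\, C_{lumped}

% Time to reach 50% and 10%-to-90% rise time
t_{50\%} = \ln(2)\,\tau \approx 0.69\,\tau, \qquad
t_{10\%\to 90\%} = \ln(9)\,\tau \approx 2.2\,\tau
```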
On-chip metal wires of over a few mm length have a significant resistance. The equipotential
assumption, presented in the lumped-capacitor model, is no longer adequate, and a resistive-
capacitive model has to be adopted.
A first approach lumps the total wire resistance of each wire segment into one single R and
similarly combines the global capacitance into a single capacitor C. This simple model, called
the lumped RC model, is pessimistic and inaccurate for long interconnect wires, which are more
adequately represented by a distributed rc-model. Yet, before analyzing the distributed model, it
is worthwhile to spend some time on the analysis and the modeling of lumped RC networks for
the following reasons:
1. The distributed rc-model is complex and no closed form solutions exist. The behaviour of the
distributed rc- line can be adequately modeled by a simple RC network.
2. A common practice in the study of the transient behavior of complex transistor-wire networks
is to reduce the circuit to an RC network. Having a means to analyze such a network effectively
and to predict its first-order response would be a great asset in the designer's tool box.
An interesting result of this particular circuit topology is that there exists a unique resistive path
between the source node s and any node i of the network. The total resistance along this path is
called the path resistance Rii. For example, the path resistance between the source node s and
node 4 equals
R44 = R1 + R3 + R4
The definition of the path resistance can be extended to address the shared path resistance Rik,
which represents the resistance shared among the paths from the root node s to nodes k and i:
Rik = Σ Rj, with Rj ∈ [path(s → i) ∩ path(s → k)]
Here, Ri4 = R1 + R3 while Ri2 = R1.
Assume now that each of the N nodes of the network is initially discharged to GND, and that a
step input is applied at node s at time t = 0. The Elmore delay at node i is then given by the
following expression:
τDi = Σ (k = 1 to N) Ck Rik
Therefore, the Elmore delay is equivalent to the first-order time constant of the network (or the
first moment of the impulse response). The designer should be aware that this time- constant
represents a simple approximation of the actual delay between source node and node i. Yet in
most cases this approximation has proven to be quite reasonable and acceptable. It offers the
designer a powerful mechanism for providing a quick estimate of the delay of a complex
network.
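The Elmore sum (each node capacitance multiplied by the resistance it shares with the path to the observed node) can be sketched as follows. The tree layout is an assumption chosen to agree with the statements Ri4 = R1 + R3 and Ri2 = R1: node 1 hangs off the source through R1, nodes 2 and 3 hang off node 1, and nodes 4 and i hang off node 3. The component values are hypothetical (kΩ and fF, so the products come out in ps).

```python
def path_to_source(node, parent):
    """Branches on the path from node up to the source.

    Each branch is identified by the node at its far (child) end.
    """
    path = []
    while node in parent:          # the source has no parent entry
        path.append(node)
        node = parent[node][0]
    return path

def shared_resistance(i, k, parent):
    """R_ik: resistance common to the paths source->i and source->k."""
    branches_i = set(path_to_source(i, parent))
    return sum(parent[n][1] for n in path_to_source(k, parent) if n in branches_i)

def elmore_delay(i, parent, cap):
    """tau_Di = sum over all nodes k of C_k * R_ik."""
    return sum(c * shared_resistance(i, k, parent) for k, c in cap.items())

# Assumed tree: parent[node] = (parent node, branch resistance in kOhm)
parent = {"1": ("s", 1), "2": ("1", 2), "3": ("1", 3), "4": ("3", 4), "i": ("3", 5)}
cap = {"1": 1, "2": 2, "3": 3, "4": 4, "i": 5}   # node capacitances in fF

print(shared_resistance("i", "4", parent))  # R1 + R3 = 4
print(elmore_delay("i", parent, cap))       # 1 + 2 + 12 + 16 + 45 = 76 (ps)
```

Note that R2 and R4 never appear in the delay at node i: side branches contribute only through their capacitance, weighted by the part of the path they share with node i.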
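The RC delay of a tree-structured network follows by applying this expression to every node of
the tree. For node i of the network above, with Ri4 = R1 + R3 and Ri2 = R1, it is given as
τDi = R1C1 + R1C2 + (R1 + R3)C3 + (R1 + R3)C4 + (R1 + R3 + Ri)Ci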
RC CHAIN MODEL
The contribution of node 1 consists of C1R1, with R1 the total resistance between the node and
the source, while the contribution of node 2 equals C2(R1 + R2). The equivalent time
constant at node 2 equals C1R1 + C2(R1 + R2). The time constant τi of node i can be derived in a
similar way.
Thus, the Elmore delay formula has proven to be extremely useful. Besides making it possible
to analyze wires, the formula can also be used to approximate the propagation delay of complex
transistor networks. The evaluation of the propagation delay is then reduced to the analysis of
the resulting RC network. More precise minimum and maximum bounds on the voltage
waveforms in an RC tree have further been established.
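The RC chain result generalizes to N nodes and, applied to a wire cut into equal pieces, shows why a distributed wire is faster than its lumped RC model suggests. The short derivation below assumes a wire of length L with resistance r and capacitance c per unit length, split into N segments of resistance rL/N and capacitance cL/N, with R = rL and C = cL the totals.

```latex
% Elmore delay at the last node of an N-node RC chain
\tau_{DN} = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j = \sum_{i=1}^{N} C_i R_{ii}

% Uniform wire split into N equal segments
\tau_{DN} = \left(\frac{L}{N}\right)^{2} \left( rc + 2rc + \dots + N\,rc \right)
          = rcL^{2}\,\frac{N+1}{2N}
          = RC\,\frac{N+1}{2N}

% Limit of a truly distributed line
\lim_{N \to \infty} \tau_{DN} = \frac{RC}{2} = \frac{rcL^{2}}{2}
```

For N = 1 the expression returns RC, the lumped value, so the lumped model overestimates the time constant of a long wire by a factor of two.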
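The voltage at node i of this network, in which the wire is divided into segments of length ΔL,
can be determined by solving the following set of partial differential equations:
c ΔL ∂Vi/∂t = [(Vi+1 - Vi) + (Vi-1 - Vi)] / (r ΔL)
The behaviour of the distributed rc line is obtained by letting ΔL approach 0, which yields the
diffusion equation:
rc ∂V/∂t = ∂²V/∂x²
where V is the voltage at a particular point in the wire, and x is the distance between this point
and the signal source. No closed-form solution exists for this equation, but approximate
expressions can be derived.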
The graph below shows the response of a wire to a step input, plotting the waveforms at
different points in the wire as a function of time. It is observable how the step waveform
“diffuses” from the start to the end of the wire, and the waveform rapidly degrades, resulting in
a considerable delay for long wires. Driving these rc lines and minimizing the delay and signal
degradation is one of the trickiest problems in modern digital integrated circuit design.
rc delays should only be considered when the rise (fall) time at the line input is smaller
than RC, the rise (fall) time of the line, i.e. when trise < RC, with R and C the total
resistance and capacitance of the wire. When this condition is not met, the change in signal is
slower than the propagation delay of the wire, and a lumped capacitive model suffices.
SIMULATION π AND T MODELS FOR DISTRIBUTED RC LINE
The transmission line has the prime property that a signal propagates over the interconnection
medium as a wave. This is in contrast to the distributed rc model, where the signal diffuses from
the source to the destination governed by the diffusion equation, i.e. rc ∂V/∂t = ∂²V/∂x².
In the wave mode, a signal propagates by alternately transferring energy from the electric to
the magnetic fields, or equivalently from the capacitive to the inductive modes.
A LOSSY TRANSMISSION LINE
Consider the point x along the transmission line of figure as shown above at time t. The
following set of equations holds:
Assuming that the leakage conductance g equals 0, which is true for most insulating materials,
and eliminating the current i yields the wave propagation equation,
where r, c, and l are the resistance, capacitance, and inductance per unit length respectively.
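The two sets of equations referred to above can be written out as follows. This is the standard lossy transmission-line (telegrapher's) formulation, with v and i the voltage and current at position x and time t, and g the leakage conductance per unit length.

```latex
% Voltage and current along the line at point x, time t
\frac{\partial v}{\partial x} = -\,r\,i - l\,\frac{\partial i}{\partial t},
\qquad
\frac{\partial i}{\partial x} = -\,g\,v - c\,\frac{\partial v}{\partial t}

% With g = 0, eliminating i gives the wave propagation equation
\frac{\partial^{2} v}{\partial x^{2}}
  = rc\,\frac{\partial v}{\partial t} + lc\,\frac{\partial^{2} v}{\partial t^{2}}
```

Setting l = 0 recovers the diffusion equation of the distributed rc line, while setting r = 0 leaves the ideal wave equation with propagation speed 1/√(lc).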
So far we have concentrated on the growing impact of interconnect parasitics on all design
metrics of digital integrated circuits. As mentioned, interconnect introduces three types of
parasitic effects (capacitive, resistive, and inductive), all of which influence the signal
integrity and degrade the performance of the circuit. Having covered the modeling aspects of the
wire, we now analyze how interconnect affects the circuit operation, and we present a collection
of design techniques to cope with these effects, considering each parasitic in turn. This is
referred to as coping with interconnect.
CAPACITIVE PARASITICS:
➢ Capacitance reliability and Cross talk:
An unwanted coupling from a neighbouring signal wire to a network node introduces an
interference that is generally called cross talk. The resulting disturbance acts as a noise source
and can lead to hard-to-trace intermittent errors, since the injected noise depends upon the
transient value of the other signals routed in the neighbourhood. In integrated circuits, this inter
signal coupling can be both capacitive and inductive.
Capacitive cross talk is the dominant effect at current switching speeds, although inductive
coupling forms a major concern in the design of the input-output circuitry of mixed-signal
circuits. The potential impact of capacitive crosstalk is influenced by the impedance of the line
under examination. If the line is floating, the disturbance caused by the coupling persists and
may be worsened by subsequent switching on adjacent wires. If the wire is driven, on the other
hand, the signal returns to its original level.
o Floating Lines:
Consider the circuit shown above, where line X is coupled to wire Y by a parasitic
capacitance CXY. Line Y sees a total capacitance to ground equal to CY. Assume that the
voltage at node X experiences a step change equal to ΔVX. This step appears on node Y
attenuated by the capacitive voltage divider:
ΔVY = ΔVX · CXY / (CY + CXY)
Circuits that are particularly susceptible to capacitive cross talk are networks with low-swing
precharged nodes located adjacent to full-swing wires (with ΔVX = VDD).
Examples of these are dynamic memories, low-swing on-chip busses and some dynamic logic
families. To address the cross talk issue, level-restoring devices or keepers are a must in
dynamic logic.
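The size of the disturbance on a floating line follows directly from the capacitive divider formed by CXY and CY. A minimal sketch, with a hypothetical function name and illustrative values (a 2.5 V aggressor step, 0.5 fF coupling, 6 fF to ground):

```python
def floating_victim_step(delta_vx, c_xy, c_y):
    """Voltage step coupled onto a floating line Y when line X steps by delta_vx.

    Capacitive divider: delta_vy = delta_vx * C_XY / (C_Y + C_XY).
    c_xy and c_y only need to share the same unit.
    """
    return delta_vx * c_xy / (c_y + c_xy)

# Illustrative numbers only: 2.5 V step, C_XY = 0.5 fF, C_Y = 6 fF.
dv = floating_victim_step(2.5, 0.5, 6.0)
print(f"{dv:.2f} V")  # prints 0.19 V
```

A disturbance of this size is harmless on a full-swing static node, but can be a large fraction of the signal on a low-swing precharged node, which is why such nodes are singled out above.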
o Driven Lines:
As seen from the figure, if the line Y is driven with a resistance RY, a step on line X results in
a transient on line Y. The transient decays with a time constant τXY = RY (CXY + CY ). The
actual impact on the victim line is a strong function of the rise- fall time of the interfering
signal.
If the rise time is comparable to or larger than the time constant, the peak value of the
disturbance is diminished. This can be observed in the response figure.
Obviously, keeping the driving impedance of a wire, and hence τXY, low goes a long way towards
reducing the impact of capacitive cross talk. The keeper transistor added to a dynamic gate or
precharged wire is an excellent example of how impedance reduction helps to control noise.
Therefore, the impact of cross talk on the signal integrity of driven nodes is rather limited. The
resulting glitches may cause malfunctioning of connecting sequential elements, and should
therefore be carefully monitored. The most important effect is an increase in delay.
Design rules to deal with capacitive cross talk:
1. If possible, avoid floating nodes. Nodes sensitive to cross talk problems, such as precharged
busses, should be equipped with keeper devices to reduce the impedance.
2. Sensitive nodes should be well separated from full-swing signals.
3. Make the rise (fall) time as large as possible, subject to timing constraints.
4. Use differential signalling in sensitive low-swing wiring networks. This turns the cross talk
signal into a common-mode noise source that does not impact the operation of the circuit.
5. To keep the cross talk minimal, do not allow the capacitance between the two signal wires to
grow too large.
6. If necessary, provide a shielding wire (GND or VDD) between the two signals, as shown below.
This effectively turns the inter-wire capacitance into a capacitance to ground and eliminates
interference. An adverse effect of shielding is the increased capacitive load.
7. The inter-wire capacitance between signals on different layers can be further reduced by the
addition of extra routing layers.
➢ Impact of Cross talk on Propagation Delay (With respect to CMOS):
The circuit schematic illustrates how capacitive cross talk may result in a data-dependent
variation of the propagation delay. Assume that the inputs to the three parallel wires X, Y, and Z
experience simultaneous transitions. Wire Y (called the victim wire) switches in a direction that
is opposite to the transitions of its neighbouring signals X and Z. The coupling capacitances
experience a voltage swing that is double the signal swing, and hence represent an effective
capacitive load that is twice as large as Cc. This is the well-known Miller effect.
Since the coupling capacitance represents a large fraction of the overall capacitance in the deep-
submicron dense wire structures, this increase in capacitance is substantial, and has a major
impact on the propagation delay of the circuit. Observe that this is a worst-case scenario. If all
inputs experience a simultaneous transition in the same direction, the voltage over the coupling
capacitances remains constant, resulting in a zero contribution to the effective load capacitance.
The total load capacitance CL of gate Y hence depends upon the data activities on the
neighbouring signals and varies between the following bounds:
CGND ≤ CL ≤ CGND + 4Cc
with CGND the capacitance of node Y to ground, including the diffusion and fan-out capacitances.
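The data dependence of the load can be made concrete with a small sketch. It assumes one coupling capacitance Cc to each of the two neighbours and a Miller factor of 0 (neighbour switches in the same direction), 1 (neighbour quiet) or 2 (neighbour switches in the opposite direction); the factor 1 for a quiet neighbour is a standard assumption added here, and the names and numbers are illustrative.

```python
# Effective multiplier on the coupling capacitance for each neighbour activity.
MILLER_FACTOR = {"same": 0, "quiet": 1, "opposite": 2}

def effective_load(c_gnd, c_c, left, right):
    """Effective load on victim wire Y: C_GND plus the Miller-scaled coupling
    capacitance to its left and right neighbours."""
    return c_gnd + c_c * (MILLER_FACTOR[left] + MILLER_FACTOR[right])

c_gnd, c_c = 10, 5   # hypothetical values in fF
print(effective_load(c_gnd, c_c, "same", "same"))          # 10 -> lower bound C_GND
print(effective_load(c_gnd, c_c, "quiet", "quiet"))        # 20
print(effective_load(c_gnd, c_c, "opposite", "opposite"))  # 30 -> upper bound C_GND + 4*Cc
```

With these numbers the load, and therefore the delay of the driver of Y, varies by a factor of three depending only on what the neighbouring wires do.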
Design Techniques for Circuit Fabrics with Predictable delay:
With cross talk making wire delay more and more unpredictable, a designer can choose
between a number of different methodology options to address the issue, some of which are:
1. Evaluate and improve: After detailed extraction and simulation, the bottlenecks in
delay are identified, and the circuit is appropriately modified.
2. Constructive layout generation: Wire routing programs take into account the effects of
the adjacent wires, ensuring that the performance requirements are met.
3. Predictable structures: By using predefined, known, or conservative wiring structures,
the designer is assured that the circuit will meet the specifications and that cross talk will
not be a show stopper.
Typical examples of large on-chip loads are busses, clock networks, and control wires. The
latter include, for instance, reset and set signals. These signals control the operation of a large
number of gates, so fan-out is normally high. Other examples of large fan-outs are encountered
in memories where a large number of storage cells is connected to a small set of control and
data wires.
The capacitance of these nodes is easily in the multi-picofarad range. The worst case occurs
when signals go off-chip. In this case, the load consists of the package wiring, the printed circuit
board wiring, and the input capacitance of the connected ICs or components.
Typical off-chip loads range from 20 to 50 pF, which is several thousand times larger than a
standard on-chip load. Driving those nodes with sufficient speed becomes one of the most
crucial design problems.
The main secrets to the efficient driving of large capacitive loads are:
1. Adequate transistor sizing is instrumental when dealing with large loads.
2. Partitioning drivers into chains of gradually-increasing buffers helps to deal with large
fan-out factors.
RESISTIVE PARASITICS:
EVOLUTION OF POWER SUPPLY CURRENT AND SUPPLY VOLTAGE
OHMIC VOLTAGE DROP ON THE SUPPLY REDUCES NOISE MARGIN
Consider a 2 cm long VDD or GND wire with a current of 1 mA per µm width. This current is
about the maximum that can be sustained by an aluminum wire due to electromigration.
Assuming a sheet resistance of 0.05 Ω/sq, the resistance of this wire (per µm width) equals
1 kΩ, so a current of 1 mA/µm results in a voltage drop of 1 V. The altered value of the
voltage supply reduces noise margins and changes the logic levels as a function of the distance
from the supply terminals. This is demonstrated by the circuit shown above, where an inverter
placed far from the power and ground pins connects to a device closer to the supply.
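The numbers in this example can be reproduced with the sheet-resistance rule. A minimal sketch, assuming the current scales with wire width (a fixed mA per µm limit); the function name is illustrative.

```python
def supply_ir_drop(length_um, width_um, r_sheet, current_ma_per_um):
    """IR drop along a supply wire, in volts.

    R = R_sheet * (L / W); the current carried is the per-um limit times W.
    """
    resistance = r_sheet * (length_um / width_um)        # ohm
    current = current_ma_per_um * 1e-3 * width_um        # ampere
    return resistance * current

# 2 cm = 20000 um, 0.05 ohm/sq, 1 mA per um of width (values from the text).
print(supply_ir_drop(20000, 1, 0.05, 1.0))   # about 1 V
print(supply_ir_drop(20000, 4, 0.05, 1.0))   # still about 1 V
```

Widening the wire does not help as long as it is loaded to the same current density: the resistance drops by the same factor by which the current rises. The drop is reduced only by shortening the wire or by lowering the current density.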
The difference in logic levels caused by the IR voltage drop over the supply rails might partially
turn on transistor M1. This can result in an accidental discharging of the precharged,
dynamic node X, or cause static power consumption if the connecting gate is static. In short, the
current pulses from the on-chip logic, memories and I/O pins cause voltage drops over the
power-distribution network and are the major source of on-chip power supply noise. Beyond
causing a reliability risk, IR drops on the supply network also impact the performance of the
system. A small drop in the supply voltage may cause a significant increase in delay.
The most obvious remedy is to reduce the maximum distance between the supply pins and the
circuit supply connections, which is most easily accomplished through a structured layout of the
power distribution network. A number of on-chip power distribution networks with peripheral
bonding are possible.
➢ Electromigration:
The current density (current per unit area) in a metal wire is limited due to an effect called
electromigration. A direct current in a metal wire running over a substantial time period,
causes a transport of the metal ions. Eventually, this causes the wire to break or to short
circuit to another wire. This type of failure will only occur after the device has been in use for
some time.
Line Open Failure Open Failure in Contact Plug
ELECTROMIGRATION RELATED FAILURE MODES
The rate of the electromigration depends upon the temperature, the crystal structure, and the
average current density. The latter is the only factor that can be effectively controlled by the
circuit designer. Keeping the current below 0.5 to 1 mA/ µm normally prevents migration. This
parameter can be used to determine the minimal wire width of the power and ground network.
Signal wires normally carry an ac- current and are less susceptible to migration. The
bidirectional flow of the electrons tends to anneal any damage done to the crystal structure.
Most companies impose a number of strict wire-sizing guidelines on their designers, based on
measurements and past experience.
Electromigration effects are proportional to the average current flow through the wire, while IR
voltage drops are a function of the peak current.
From a design point of view, a number of precautions can be taken at the technology level
to reduce the migration risk, i.e.
1. To add alloying elements (such as Cu or Tu) to the aluminum to prevent the movement of
the Al ions.
2. To control the granularity of the ions.
3. The introduction of new interconnect materials is a big help as well. For instance, the use of
Copper interconnect increases the expected lifetime of a wire by a factor of 100 over Al.
Providing accurate synchronization and correct operation becomes a major challenge under
these circumstances. The different design techniques to cope with the delay imposed
by the resistance of the wire are therefore:
o Better Interconnect Materials:
Here the designer should be aware that these new materials only provide a temporary respite of
one or two generations, and do not solve the fundamental problem of the delay of long wires.
Innovative design techniques are often the only way of coping with the latter.
Sometimes, it is hard to avoid the use of long polysilicon wires. Good examples of such a
circumstance are the address lines in memories, which must connect to a large number of
transistor gates. Keeping the wires in polysilicon increases the memory density substantially by
avoiding the overhead of the extra metal contacts. The polysilicon-only option unfortunately
leads to an excessive propagation delay. One possible solution is to drive the word line from
both ends, as shown in the figure. This effectively reduces the worst-case delay by a factor of
four.
Another option is to provide an extra metal wire, called a bypass, which runs parallel to the
polysilicon one, and connects to it every k cells as shown in the figure. The delay is now
dominated by the much shorter polysilicon segments between the contacts. Providing contacts
only every k cells helps to preserve the implementation density.
o Better Interconnect Strategies:
The length of the wire being a prime factor in both the delay and the energy consumption of an
interconnect wire, any approach that helps to reduce the wire length is bound to have an
essential impact.
There are two wiring strategies, i.e. Manhattan-style routing and diagonal-style routing.
➢ In Manhattan-style routing, interconnections are first routed along one of the preferred
directions, followed by a connection in the other direction as shown.
➢ In diagonal-style routing, wires may also run along 45° lines, which reduces the wire length
by up to 29% in the best case compared with Manhattan routing. The main issues of
diagonal routing are its complexity, impact on tools and masking concerns.
Earlier, Manhattan routing was preferred because of these issues, in spite of the advantages of
diagonal routing. Diagonal routing is now gaining favour because of its shorter wire length, and
because its issues of complexity, impact on tools and masking concerns can be handled by
modern CAD (Computer Aided Design) tools such as those from Cadence. The impact on wiring
is quite tangible: a reduction of 20% in wire length, resulting in higher performance, lower
power dissipation and smaller chip area.
o Introducing Repeaters/ Buffer Insertion for very long wires:
The most popular design approach to reducing the propagation delay of long wires is to
introduce intermediate buffers, also called repeaters, in the interconnect line as shown below.
Making an interconnect line m times shorter reduces its propagation delay quadratically, and is
sufficient to offset the extra delay of the repeaters when the wire is sufficiently long. Assuming
that the repeaters have a fixed delay tpbuf , we can derive the delay of the partitioned wire.
The optimal number of buffers that minimizes the overall delay can be found by setting
∂tp/∂m = 0, and is obtained when the delay of the individual wire segments is made equal to
that of a repeater.
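The trade-off can be sketched numerically. The model below is an assumption, not a formula from these notes: each of the m wire segments is taken to have a distributed rc delay of k·r·c·(L/m)², with k = 0.38 a commonly used coefficient for the 50% delay of a distributed rc line, and each repeater adds a fixed t_pbuf. The wire parameters are hypothetical.

```python
import math

def buffered_wire_delay(m, r, c, length, t_pbuf, k=0.38):
    """Delay of a wire split into m segments, each followed by a repeater:
    t_p = m * (k*r*c*(L/m)^2 + t_pbuf) = k*r*c*L^2/m + m*t_pbuf."""
    return k * r * c * length**2 / m + m * t_pbuf

def optimal_repeater_count(r, c, length, t_pbuf, k=0.38):
    """Solve d(t_p)/dm = 0: m_opt = L * sqrt(k*r*c / t_pbuf) (not rounded)."""
    return length * math.sqrt(k * r * c / t_pbuf)

# Hypothetical 10 cm wire: r = 0.075 ohm/um, c = 110 aF/um, t_pbuf = 0.1 ns.
r, c, length, t_pbuf = 0.075, 110e-18, 1e5, 0.1e-9
m_opt = optimal_repeater_count(r, c, length, t_pbuf)
segment_delay = 0.38 * r * c * (length / m_opt) ** 2

print(f"unbuffered: {buffered_wire_delay(1, r, c, length, 0.0) * 1e9:.1f} ns")
print(f"m_opt = {m_opt:.1f}, segment delay = {segment_delay * 1e9:.2f} ns")
print(f"buffered:   {buffered_wire_delay(m_opt, r, c, length, t_pbuf) * 1e9:.2f} ns")
```

At the optimum the delay of each wire segment equals t_pbuf, which is the condition stated above, and the total delay grows only linearly with the wire length instead of quadratically. In practice m is rounded to an integer and the repeaters are sized as well, which this sketch ignores.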
Long wires hence often exhibit a delay that is longer than the clock period of the design. For
instance, a 10 cm long Al wire comes with a minimum delay of 4.7 nsec, even after optimal
buffer insertion and sizing, while the 0.25 µm CMOS process featured in this text can sustain
clock speeds in excess of 1 GHz (that is, clock periods below 1 nsec). The wire delay by itself
hence becomes the limiting factor on the performance achievable by the integrated circuit.
The only way to address this bottleneck is to tackle it at the system architecture level.
The wire is partitioned into k segments by inserting registers or latches. While this does not
reduce the delay through the wire (it takes k clock cycles for a signal to proceed through the
wire), it helps to increase its throughput, as the wire is handling k signals
simultaneously at any point in time. The delay of the individual wire segments can further be
optimized by repeater insertion, and should be below a single clock period.
This is only one example of the many techniques that the chip architect has at her disposal to
deal with the wire delay problem. The most important message is that the wires have to
be considered early on in the design process, and can no longer be treated as an afterthought as
was most often the case in the past.
INDUCTIVE PARASITICS:
Interconnect wires also exhibit an inductive parasitic. An important source of parasitic
inductance is introduced by the bonding wires and chip packages. Even for intermediate- speed
CMOS designs, the current through the input- output connections can experience fast transitions
that cause voltage drops as well as ringing and overshooting, phenomena not found in RC
circuits. At higher switching speeds, wave propagation and transmission line effects can come
into the picture.
➢ Inductance and Reliability- L di/dt Voltage Drop:
During each switching action, a transient current is sourced from (or sunk into) the supply rails
to charge (or discharge) the circuit capacitances as shown. Both VDD and VSS connections are
routed to the external supplies through bonding wires and package pins and possess a
non-negligible series inductance. Hence, a change in the transient current creates a voltage
difference
between the external and internal (V’DD, GND’) supply voltages. This situation is especially
severe at the output pads, where the driving of the large external capacitances generates large
current surges. The deviations on the internal supply voltages affect the logic levels and result in
reduced noise margins.
In an actual circuit, a single supply pin serves a large number of gates or output drivers. A
simultaneous switching of those drivers causes even worse current transients and voltage drops.
As a result, the internal supply voltages deviate in a substantial way from the external ones. For
instance, the simultaneous switching of the 16 output drivers of an output bus would cause a
voltage drop of at least 1.1 V if the supply connections of the buffers were connected to the
same pin on the package. Improvements in packaging technologies are leading to ever-
increasing numbers of pins per package. Packages with up to 1000 pins are currently available.
Simultaneous switching of a substantial number of those pins results in huge spikes on the
supply rails that are bound to disturb the operation of the internal circuits as well as other
external components connected to the same supplies.
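The effect of simultaneous switching can be estimated from v = L di/dt. The sketch below assumes that each driver's supply current ramps linearly by delta_i over delta_t and that the currents of all drivers sharing a pin simply add; the function name and the numbers are hypothetical and are not the data behind the 1.1 V figure quoted above.

```python
def supply_bounce(inductance_h, delta_i_a, delta_t_s, n_drivers=1):
    """Voltage across the supply inductance when n_drivers switch together.

    v = L * n * (delta_i / delta_t), assuming a linear current ramp per driver.
    """
    return inductance_h * n_drivers * delta_i_a / delta_t_s

# Hypothetical: 2.5 nH bonding wire plus pin, 25 mA current change in 1 ns per driver.
print(supply_bounce(2.5e-9, 25e-3, 1e-9))       # one driver: about 0.06 V
print(supply_bounce(2.5e-9, 25e-3, 1e-9, 16))   # 16 drivers on one pin: about 1 V
```

Because the drop scales with the number of drivers per pin, spreading the drivers over several supply pins, or slowing their edges, reduces the bounce proportionally.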
Design techniques to address the L di/dt problem:
1. Separate pins for I/O pads and chip core. Since the I/O drivers require the largest switching
currents, they also cause the largest current changes. Therefore, it is wise to isolate the core of
the chip, where most of the logic action occurs, from the drivers by providing different power
and ground pins.
2. Multiple power and ground pins. In order to reduce the di/dt per supply pin, we can restrict
the number of I/O drivers connected to a single supply pin.
3. Careful selection of the positions of the power and ground pins on the package. The
inductance of pins located at the corners of the package is substantially higher, as shown below.
THE INDUCTANCE OF A BONDING WIRE/PIN COMBINATION DEPENDS UPON THE PIN POSITIONS
DECOUPLING CAPACITORS ISOLATE THE BOARD INDUCTANCE FROM THE BONDING WIRE AND
PIN INDUCTANCE
➢ Inductance and Performance- Transmission Line Effects:
When an interconnection wire becomes sufficiently long or when the circuits become
sufficiently fast, the inductance of the wire starts to dominate the delay behaviour, and
transmission line effects must be considered. More precisely, this is the case when the rise and
fall times of the signal become comparable to the time of flight of the signal waveform across
the line, as determined by the speed of light. As advancing technology increases line lengths and
switching speeds, this situation is gradually becoming common in the fastest CMOS circuits, and
transmission-line effects are bound to become a concern of the CMOS designer as well.
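The comparison between rise time and time of flight can be sketched as follows. This is a minimal illustration, not a design rule taken from these notes: the wire length, the relative permittivity and the threshold factor `ratio` are all assumed values, and the function names are hypothetical.

```python
import math

C_LIGHT = 3.0e8  # speed of light in vacuum, m/s

def time_of_flight(length_m, eps_r):
    """Time for a wave to cross a wire embedded in a dielectric of relative permittivity eps_r."""
    return length_m * math.sqrt(eps_r) / C_LIGHT

def needs_transmission_line_model(rise_time_s, length_m, eps_r, ratio=2.5):
    """True when the rise time is comparable to the time of flight.

    `ratio` is an assumed rule-of-thumb factor, not a value given in the text.
    """
    return rise_time_s < ratio * time_of_flight(length_m, eps_r)

# Assumed example: a 15 cm board trace with eps_r = 4.0
print(time_of_flight(0.15, 4.0))                        # about 1 ns
print(needs_transmission_line_model(1e-9, 0.15, 4.0))   # fast edge
print(needs_transmission_line_model(50e-9, 0.15, 4.0))  # slow edge
```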
Some of the techniques to minimize the impact of the transmission line behaviour are:
o Termination:
To avoid the negative effects of transmission-line behaviour such as ringing or slow
propagation delays, the line should be terminated, either at the source (series termination), or at
the destination (parallel termination) with a resistance matched to its characteristic impedance
Z0.
The two scenarios, series and parallel termination, are depicted in the figure. Series
termination requires that the impedance of the signal source be matched to the connecting wire.
This approach is appropriate for many CMOS designs, where the destination load is purely
capacitive. The impedance of the driver inverter can be matched to the line by careful transistor
sizing.
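Why matching to Z0 removes ringing can be seen from the reflection coefficient (R - Z0) / (R + Z0) at a termination R. The sketch below uses an assumed Z0 of 50 ohms and an assumed 2.5 V supply, and it idealizes the capacitive destination as an open circuit, which only holds once the load capacitance has charged.

```python
import math

def reflection_coefficient(r_term, z0):
    """Fraction of an incident voltage wave reflected at a termination r_term."""
    if math.isinf(r_term):  # open circuit (idealized capacitive gate input)
        return 1.0
    return (r_term - z0) / (r_term + z0)

def series_terminated_levels(vdd, r_source, z0):
    """Voltage launched into the line and voltage seen at an open far end."""
    launched = vdd * z0 / (r_source + z0)  # divider between driver and line
    at_load = launched * (1 + reflection_coefficient(math.inf, z0))
    return launched, at_load

Z0 = 50.0  # assumed characteristic impedance, ohms

print(reflection_coefficient(Z0, Z0))          # matched: nothing is reflected
print(series_terminated_levels(2.5, Z0, Z0))   # half swing launched, full swing at the load
```

With a matched source, the wave reflected from the far end is absorbed when it returns to the driver, so the line settles after one round trip.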
o Shielding
If we want to control the behaviour of a wire behaving as a transmission line, we should
carefully plan and manage how the return current flows. A good example of a well-defined
transmission line is the coaxial cable, where the signal wire is surrounded by a cylindrical
ground plane. To accomplish similar effects on a board or on a chip, designers often surround
the signal wire with ground (supply) planes and shielding wires. Adding
shielding makes the behaviour and the delay of an interconnection a lot more predictable. Yet
even with these precautions, powerful extraction and simulation tools will be needed in the
future for the high-performance circuit designer.
UNIT – IV
INTERCONNECT, MEMORY ARCHITECTURE AND ARITHMETIC CIRCUITS
TWO MARK QUESTIONS AND ANSWERS
1. What is meant by data path circuits? (APR 2016)
• Data path circuits are meant for passing the data from one segment to other segment for
processing or storing.
• The data path is the core of processors, where all computations are performed.
3. Draw the circuit for 4 bit ripple carry adder. (NOV 2018)
4. Write the equation for total delay in 4 bit ripple carry adder.
The total delay is given by the following equation:
$t_{4b} = t_d(c_{in} \to S_3) + 2\,t_d(c_{in} \to c_{out}) + t_d(a_0, b_0 \to c_1)$
5. Write the equation for worst case delay in 4 bit ripple carry adder.
If the adder is extended to n bits, the worst case delay is
$t_{n\text{-bit}} = t_d(c_{in} \to S_{n-1}) + (n-2)\,t_d(c_{in} \to c_{out}) + t_d(a_0, b_0 \to c_1)$
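As a quick numerical check of the n-bit expression, the sketch below evaluates it for a few word lengths. The three stage delays are assumed values in nanoseconds, not figures from these notes, and `ripple_carry_delay` is a hypothetical helper.

```python
def ripple_carry_delay(n, t_cin_to_sum, t_cin_to_cout, t_ab_to_cout):
    """Worst-case delay: first stage a0,b0 -> c1, (n-2) carry stages, final cin -> S(n-1)."""
    if n < 2:
        raise ValueError("the expression assumes at least two bit positions")
    return t_cin_to_sum + (n - 2) * t_cin_to_cout + t_ab_to_cout

# Assumed per-stage delays in ns
T_SUM, T_CARRY, T_FIRST = 0.5, 0.25, 0.75

for n in (4, 8, 16, 32):
    print(n, ripple_carry_delay(n, T_SUM, T_CARRY, T_FIRST))
```

For n = 4 the middle term reduces to two carry stages, which agrees with the 4-bit equation; the delay grows linearly with the word length.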
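9. Write the equation for propagate term in CLA.
• In the case of binary addition, A + B propagates a carry if and only if at least one of A or B is 1.
If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, we have P(A,B) = A OR B.
• Write the propagate term as $p_i = a_i \oplus b_i$. The XOR form differs from the OR form only
when $a_i = b_i = 1$, where the generate term already produces the carry.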
10. What are the two factors that Carry lookahead adder depends on?
• Carry lookahead depends on two things:
o Calculating, for each digit position, whether that position is going to propagate a
carry if one comes in from the right.
o Combining these calculated values to be able to deduce quickly whether, for each
group of digits, that group is going to propagate a carry that comes in from the
right.
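The two steps above can be sketched in a few lines: first compute a propagate and generate bit per position, then combine them so that every carry depends only on those bits and the input carry, never on another carry. This is an illustrative model, not a gate-level design; it assumes the XOR definition of propagate, and the function names are hypothetical.

```python
def bit(x, i):
    return (x >> i) & 1

def lookahead_add(a, b, n, c0=0):
    """Add two n-bit numbers using carries computed directly from p, g and c0."""
    p = [bit(a, i) ^ bit(b, i) for i in range(n)]  # position propagates a carry
    g = [bit(a, i) & bit(b, i) for i in range(n)]  # position generates a carry

    def carry_into(i):
        # c0 reaches position i only if the whole group below it propagates
        carry = c0 and all(p[:i])
        for j in range(i):
            # a carry generated at j reaches i if positions j+1 .. i-1 propagate
            carry = carry or (g[j] and all(p[j + 1:i]))
        return int(bool(carry))

    carries = [carry_into(i) for i in range(n + 1)]
    total = sum((p[i] ^ carries[i]) << i for i in range(n))
    return total + (carries[n] << n), carries

print(lookahead_add(0b0110, 0b0011, 4))
```

The expression `all(p[lo:hi])` plays the role of the group propagate signal: it tells whether a carry entering the group from the right comes out at the left.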
14. Write the basic equation for Manchester Carry Chain Adder?
Define kill term, propagate and generate term in a carry look ahead adder. (April
2019)
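• In this adder, the basic equation is $c_{i+1} = g_i + c_i \cdot p_i$,
where $p_i = a_i \oplus b_i$ and $g_i = a_i \cdot b_i$.
• Carry kill bit: $k_i = \overline{a_i + b_i} = \overline{a_i} \cdot \overline{b_i}$
• If $k_i = 1$, then $p_i = 0$ and $g_i = 0$, so no carry leaves the stage. Hence $k_i$ is known as the carry kill bit.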
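A behavioural sketch of the carry chain is given below. It models each stage as doing exactly one of three things (generate, kill, or pass the incoming carry), which is the logical view of the chain and not a transistor-level description. Bit lists are least significant bit first, and the function names are hypothetical.

```python
def gpk(a, b):
    """Generate, propagate and kill bits for one bit position."""
    g = a & b
    p = a ^ b
    k = (1 - a) & (1 - b)  # NOR of a and b
    return g, p, k

def carry_chain(a_bits, b_bits, c0=0):
    """Carries c0..cn along the chain; a_bits and b_bits are LSB first."""
    carries = [c0]
    c = c0
    for a, b in zip(a_bits, b_bits):
        g, p, k = gpk(a, b)
        if g:
            c = 1  # stage generates a carry
        elif k:
            c = 0  # stage kills any incoming carry
        # otherwise p == 1 and the incoming carry passes through unchanged
        carries.append(c)
    return carries

# 0110 + 0011, written LSB first
print(carry_chain([0, 1, 1, 0], [1, 1, 0, 0]))
```

For every input pair exactly one of the three signals is 1, which is why the chain never drives a node to two values at once.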
15. Draw the switch level circuit for Manchester carry chain adder.
The switch level circuit is given as
22. What are multipliers?
• A multiplier is a circuit that computes the product of two binary numbers.
• The basic single-bit operations in multiplication are given below.
0 x 0 = 0, 0 x 1 = 0, 1 x 0 = 0, 1 x 1 = 1
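The four single-bit products above are exactly the AND function, so a multi-bit product can be built from AND gates followed by shifted additions. The sketch below is a plain shift-and-add model for unsigned n-bit operands, offered as an illustration and not as any specific multiplier architecture; `multiply_bits` is a hypothetical name.

```python
def multiply_bits(a, b, n):
    """Unsigned n-bit multiplication from AND partial products and shifted adds."""
    product = 0
    for i in range(n):                  # one partial product per multiplier bit
        b_i = (b >> i) & 1
        partial = 0
        for j in range(n):
            a_j = (a >> j) & 1
            partial |= (a_j & b_i) << j  # 1-bit product is a_j AND b_i
        product += partial << i          # weight the partial product by 2^i
    return product

print(multiply_bits(0b1011, 0b0110, 4))
```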
23. Draw the truth table of multiplier.
The truth table of multiplier is
29. Compare serial divider and parallel divider.
• A serial divider is slow, whereas a parallel (array) divider is fast.
• The speed of the array divider comes at the cost of an increased hardware requirement.
30. What is shift register?
• An n-bit rotation is specified by using the control word R0-n, and the L/R bit selects a left or
right shift.
• For example, let y3 y2 y1 y0 = a3 a2 a1 a0.
If it is rotated by 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3
If it is rotated by 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1
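The rotation example can be reproduced with a small sketch that treats the word as a list of bit labels, most significant bit first. The function name `rotate` and its `left` flag (standing in for the L/R bit) are illustrative choices.

```python
def rotate(bits, k, left=True):
    """Rotate a list of bits (MSB first) by k positions."""
    n = len(bits)
    k %= n
    if left:
        return bits[k:] + bits[:k]
    return bits[n - k:] + bits[:n - k]

word = ["a3", "a2", "a1", "a0"]
print(rotate(word, 1, left=True))   # ['a2', 'a1', 'a0', 'a3']
print(rotate(word, 1, left=False))  # ['a0', 'a3', 'a2', 'a1']
```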
31. What is meant by Barrel shifter?
• A barrel shifter is a digital circuit that can shift a data word by a specified number of bits
in one clock cycle.
• It can be implemented as a sequence of multiplexers (MUX). The output of one MUX is
connected to the input of the next MUX in a way that depends on the shift distance.
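One way to picture the multiplexer cascade is a logarithmic arrangement in which stage s either passes the word unchanged or shifts it by 2^s, selected by bit s of the shift amount. The sketch below models a logical left shift this way; the arrangement and the names `mux2` and `barrel_shift_left` are chosen for illustration and are not taken from these notes.

```python
def mux2(sel, in0, in1):
    """2:1 multiplexer."""
    return in1 if sel else in0

def barrel_shift_left(word, shamt, width=4):
    """Logical left shift of a width-bit word through cascaded mux stages."""
    bits = [(word >> i) & 1 for i in range(width)]  # LSB first
    stage = 0
    while (1 << stage) < width:
        sel = (shamt >> stage) & 1
        step = 1 << stage
        # candidate input of each mux: the bit 'step' positions lower, or 0 fill
        shifted = [bits[i - step] if i >= step else 0 for i in range(width)]
        bits = [mux2(sel, bits[i], shifted[i]) for i in range(width)]
        stage += 1
    return sum(b << i for i, b in enumerate(bits))

print(bin(barrel_shift_left(0b0011, 2)))
```

Because every stage is purely combinational, any shift distance is produced in a single pass through the cascade.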
32. Draw the structure of 4 X 4 barrel shifter. (April 2018)
33. What is the area constraint between carry lookahead adder and ripple carry adder?
• The area of a carry lookahead adder is larger than the area of a ripple carry adder.
• A carry lookahead adder computes its carries in parallel, which requires a larger number of
gates and therefore results in a larger area.
34. What is the drawback of carry lookahead adder?
• The carry lookahead adder needs a large area because the carries are computed in parallel,
and it consumes more power.
35. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 8 bit.
36. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 16 bit.
37. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 32 bit.
39. Determine propagation delay of n-bit carry select adder. (May 2016)
✓ The propagation delay of an N-bit (square-root) carry select adder is proportional to √(2N), where N is the number of bits.
40. Draw and list out the components of data path. (May 2017)
• Data path block consists of arithmetic operation, logical operation, shift operation and
temporary storage of operands.
42. What is latency? (Nov 2017)
• Clock latency (or clock insertion delay) is defined as the amount of time taken by the
clock signal in traveling from its source to the sinks.
56. Draw the schematic of dynamic edge-triggered register. (Dec. 2016)
57. Design a one transistor DRAM cell. (Nov 2013, April 2015)
Draw a 1-transistor Dynamic RAM cell. (April 2019) [Nov/Dec 2022]
61. Mention the different hardware architecture used for multiplier. (Nov 2019)
Hardware architectures for multipliers protected by (a) linear arithmetic codes, (b)
[|x|, |2x|] multilinear codes, and (c) multi-modulus multilinear codes.
62. Draw the dot diagram for Wallace tree multiplier. [May 2021]
64. State the need of a sense amplifier in a memory cell. (NOV 2021)[Apr/May 2022]
A sense amplifier senses the low-power signal on a bitline that represents the data bit (1 or 0)
stored in a memory cell and amplifies the small voltage swing to recognizable logic levels, so
that the data can be interpreted properly.
• The simplest case arises when two one bit numbers are to be added.
• With one bit, only the numbers 0 and 1 can be represented.
• All possible scenarios can be summarized by the following table:
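The four cases of adding two one-bit numbers can be enumerated with a short sketch. It models the sum bit as XOR and the carry bit as AND (a half adder); the function name is an illustrative choice.

```python
def half_adder(a, b):
    """Return (sum, carry) for two one-bit inputs."""
    return a ^ b, a & b

print("A B | Carry Sum")
for a in (0, 1):
    for b in (0, 1):
        s, c = half_adder(a, b)
        print(f"{a} {b} |   {c}    {s}")
```

Only the case 1 + 1 produces a carry, giving the two-bit result 10.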