Unit 4-Vlsi and Chip Design

Vlsi 4

Uploaded by

afrideemohammed
UNIT-IV –EC3552 VLSI AND CHIP DESIGN

UNIT – IV
INTERCONNECT, MEMORY ARCHITECTURE AND ARITHMETIC CIRCUITS

Interconnect Parameters – Capacitance, Resistance, and Inductance, Electrical Wire Models,


Sequential digital circuits: adders, multipliers, comparators, shift registers. Logic Implementation
using Programmable Devices (ROM, PLA, FPGA), Memory Architecture and Building Blocks,
Memory Core and Memory Peripherals Circuitry.

4.1. Design of Data path circuits:

Discuss about data path circuits.

• Data path circuits pass data from one segment to another for processing or storage.
• The datapath is the core of a processor, where all computations are performed.
• It is generally described in the context of a general digital processor, as shown in the figure.

Figure: General digital processor


• The data path and its communication can also be shown on their own.

• Here, data is applied at one port and the output is obtained at a second port.

• The data path block performs arithmetic, logical and shift operations and provides temporary storage of operands.
• Datapaths are arranged in a bit-sliced organization.
• Instead of operating on single-bit digital signals, the data in a processor are arranged in a word-based fashion.
• The bit slices are either identical or have a similar structure for all bits.

• The datapath consists of a number of bit slices (equal to the word length), each operating on a single bit; hence the term bit-sliced.

Figure: Bit-sliced datapath organization


******************************************************************************

4.2. Ripple Carry Adder:

 Draw the structure of ripple carry adder and explain its operation. (Nov 2017)
 Explain the operation of a basic 4 bit adder. (Nov 2016)
 Realize a 1-bit adder using static CMOS logic. Optimize the Boolean expressions of
sum and carryout and realize a 1-bit adder using static CMOS logic. Also realize a 1-
bit adder using transmission gate. Compare all the three cases from hardware
perspective. (Nov 2019)

Architecture of Ripple Carry Adder:


• AOI (AND-OR-INVERT) full adder circuit
• An AOI realization in static CMOS logic can be obtained from the equations
$c_{i+1} = a_i b_i + c_i (a_i + b_i)$
$s_i = (a_i + b_i + c_i)\,\overline{c_{i+1}} + a_i b_i c_i$

Figure: AOI Full adder


• Adding two n-bit numbers produces an n-bit sum and a carry-out c_n.

c_i = carry bit from the previous column.
• An n-bit ripple carry adder needs n full adders, each passing its carry-out c_{i+1} to the next stage.

Figure: Ripple carry adder


• The overall delay depends on the characteristics of the full adder circuit. Different CMOS implementations produce different delay paths.
• t_{di} is the worst-case delay through the i-th stage. The total delay of a 4-bit adder is
$t_{4b} = t_{d3} + t_{d2} + t_{d1} + t_{d0}$, with $t_{d0} = t_d(a_0, b_0 \to c_1)$
• t_{d0} is the time for the inputs to produce the carry-out bit. For the remaining stages,
$t_{d1} = t_{d2} = t_d(c_{in} \to c_{out})$
$t_{d3} = t_d(c_{in} \to s_3)$
$t_{4b} = t_d(c_{in} \to s_3) + 2\,t_d(c_{in} \to c_{out}) + t_d(a_0, b_0 \to c_1)$
• Extended to n bits, the worst-case delay is
$t_{n\text{-bit}} = t_d(c_{in} \to s_{n-1}) + (n-2)\,t_d(c_{in} \to c_{out}) + t_d(a_0, b_0 \to c_1)$
• The worst-case delay is linear in the number of bits:
$t_d = O(N)$
$t_{adder} = (N-1)\,t_{carry} + t_{sum}$
• The figure below shows a 4-bit adder/subtractor circuit.
• If add/sub = 0, the output is a + b. If add/sub = 1, the output is a - b.

Figure:4-bit adder/subtractor circuit
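The ripple-carry structure and the add/sub control can be sketched in a few lines of Python. This is only a behavioural model: the function names are hypothetical, and the use of XOR gates on the b inputs with add/sub also feeding the first carry-in is an assumption about the figure (it is the usual way such a circuit is built).

```python
def full_adder(a, b, c):
    """One-bit full adder using the AOI-style equations from the text."""
    carry = (a & b) | (c & (a | b))                 # c_{i+1} = a.b + c.(a + b)
    s = ((a | b | c) & (1 - carry)) | (a & b & c)   # s = (a+b+c).NOT(c_{i+1}) + a.b.c
    return s, carry


def ripple_add_sub(a, b, sub=0, n=4):
    """n-bit ripple-carry adder/subtractor; sub=1 computes a - b."""
    carry = sub                        # carry-in of 1 completes the two's complement of b
    result = 0
    for i in range(n):                 # the carry ripples from bit 0 up to bit n-1
        ai = (a >> i) & 1
        bi = ((b >> i) & 1) ^ sub      # XOR with add/sub inverts b for subtraction
        s, carry = full_adder(ai, bi, carry)
        result |= s << i
    return result, carry
```

For example, `ripple_add_sub(9, 3, sub=1)` returns the 4-bit difference together with the final carry-out.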


• The sum and carry expressions are implemented in static CMOS.
• This requires 28 transistors, which leads to a large area and a slow circuit.
Sum, $S = A B C_i + \overline{C_o}\,(A + B + C_i)$ and Carry, $C_o = AB + BC_i + C_i A$
Drawbacks:
• The circuit is slow.

• In a ripple carry adder, the carry bit is calculated along with the sum bit, so each bit must wait for the previous carry to be calculated.

Figure: Complementary static CMOS full adder


******************************************************************************
4.3. Carry Look Ahead Adder (CLA):

 Explain the operation and design of Carry lookahead adder (CLA). (May 2017, Nov
2016)[Apr/May 2022] [Nov/Dec 2022]
 How the drawback in ripple carry adder overcome by carry look ahead adder and
discuss. (Nov 2017)
 Explain the concept of carry lookahead adder and discuss its types. (April 2018)
 Derive the necessary expressions of a 4 bit carry look ahead adder and realize the carry
out expressions using dynamic CMOS logic. (April 2019-13M)

• A carry-lookahead adder (CLA) is a type of adder used in digital circuits.
• A carry-lookahead adder improves speed by reducing the time required to determine the carry bits.
• In a ripple carry adder, the carry bit is calculated along with the sum bit.
• Each bit must wait until the previous carry is calculated before it can begin calculating its own result and carry bits.
• The carry-lookahead adder calculates one or more carry bits before the sum, which reduces the wait time to calculate the result of the higher-order bits.
• A ripple-carry adder starts at the rightmost (LSB) digit position: the two corresponding digits are added and a result is obtained. There may be a carry out of this digit position.
• Accordingly, all digit positions other than the LSB need to take into account the possibility of adding an extra 1 from a carry that has come in from the next position to the right.
• Carry lookahead depends on two things:
• Calculating, for each digit position, whether that position is going to propagate a carry if one comes in from the right.
• Combining these calculated values to determine quickly whether, for each group of digits, that group is going to propagate a carry.
 Theory of operation:
 Carry lookahead logic uses the concept of generating and propagating carry.
 The addition of two 1-digit inputs A and B is said to generate if the addition will carry,
regardless of whether there is an input carry.
 Generate:
 In binary addition, A + B generates if and only if both A and B are 1.
 If we write G(A,B) to represent the binary predicate that is true if and only if A + B
generates, we have:
G(A,B) = A . B
 Propagate:
 The addition of two 1-digit inputs A and B is said to propagate if the addition will carry
whenever there is an input carry.
 In binary addition, A + B propagates if and only if at least one of A or B is 1.
 If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, we have:
$P(A, B) = A + B$
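• These adders are used to overcome the latency introduced by the rippling of the carry bits.
• The carry look-ahead expressions are written in terms of the generate g_i and propagate p_i signals. The general form of the carry signal c_{i+1} thus becomes
$c_{i+1} = a_i b_i + c_i (a_i \oplus b_i) = g_i + c_i p_i$
• If $a_i b_i = 1$, then $c_{i+1} = 1$, so the generate term is $g_i = a_i b_i$
• The propagate term is $p_i = a_i \oplus b_i$. The XOR form is used here because it can be reused for the sum; it gives the same carry as the OR form above, since g_i already covers the case a_i = b_i = 1.
• The sum and carry expressions are written as
$s_i = p_i \oplus c_i = a_i \oplus b_i \oplus c_i$
$c_1 = g_0 + p_0 c_0$
$c_2 = g_1 + p_1 c_1 = g_1 + p_1 g_0 + p_1 p_0 c_0$
$c_3 = g_2 + p_2 c_2 = g_2 + p_2 g_1 + p_2 p_1 g_0 + p_2 p_1 p_0 c_0$
$c_4 = g_3 + p_3 c_3 = g_3 + p_3 g_2 + p_3 p_2 g_1 + p_3 p_2 p_1 g_0 + p_3 p_2 p_1 p_0 c_0$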
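The expanded carry expressions can be checked with a small behavioural model. The sketch below is illustrative only (the function name `cla4` is hypothetical); it computes every carry directly from the g_i, p_i and c_0 terms, so no carry waits for the previous stage.

```python
def cla4(a, b, c0=0):
    """4-bit carry-lookahead adder: returns (sum, c4)."""
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]   # generate g_i = a_i.b_i
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]   # propagate p_i = a_i XOR b_i
    # Each carry is a two-level function of g, p and c0 only.
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0]) | (p[2] & p[1] & p[0] & c0)
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0]) | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    s = sum((p[i] ^ carries[i]) << i for i in range(4))       # s_i = p_i XOR c_i
    return s, c4
```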

Figure:Symbol and truth table of generate & propagate

Figure – Logic network for 4-bit CLA carry bits

Figure – Sum calculation using the CLA network


 The symmetry in the array is shown in mirror. It allows more structured layout at the
physical design level.


Figure – MODL carry circuit


• MODL: Multiple Output Domino Logic.
• MODL is a non-inverting logic family and a dynamic circuit technique.
• Its limitations are
i. Clocking is mandatory.
ii. The output is subject to charge leakage and charge sharing.
iii. Series-connected nFET chains can give long discharge times.

******************************************************************************

4.4. Manchester Carry Chain Adder:

Discuss about Manchester Carry Chain Adder.



• The Manchester carry chain is a variation of the carry-lookahead adder that uses shared logic to lower the transistor count.
• A Manchester carry chain generates the intermediate carries by tapping off nodes in the gate that calculates the most significant carry value.
• Dynamic logic can support shared logic, as can transmission gate logic.
• One of the major drawbacks of the Manchester carry chain is that the propagation delay increases with the length of the chain.
• For this reason, a Manchester carry chain section generally does not exceed 4 bits.
• In this adder, the basic equation is $c_{i+1} = g_i + c_i p_i$
where $p_i = a_i \oplus b_i$ and $g_i = a_i b_i$
• Carry kill bit $k_i = \overline{a_i + b_i} = \overline{a_i}\,\overline{b_i}$
• If k_i = 1, then p_i = 0 and g_i = 0, so no carry is generated or propagated. Hence k_i is known as the carry kill bit.
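The relationship between the propagate, generate and kill signals can be tabulated with a short script. This is a minimal sketch with hypothetical function names; it simply evaluates the three definitions above for every input combination.

```python
def pgk(a, b):
    """Propagate, generate and kill signals for one bit position."""
    p = a ^ b                  # p_i = a_i XOR b_i
    g = a & b                  # g_i = a_i . b_i
    k = (1 - a) & (1 - b)      # k_i = NOT(a_i + b_i)
    return p, g, k


def next_carry(a, b, c):
    """Carry out of one position: c_{i+1} = g_i + c_i . p_i."""
    p, g, _ = pgk(a, b)
    return g | (p & c)


if __name__ == "__main__":
    print("a b | p g k")
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, "|", *pgk(a, b))
```

For each input pair exactly one of p, g and k is 1, which is why the chain can be built from three mutually exclusive switches per bit.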

Table

Figure – switch level circuit


• In the circuit shown below, Ci is used as an input. If Pi = 0, then M3 is ON and M4 is OFF.
• If gi = 0, then M1 is ON and M2 is ON.
• If gi = 1, then M2 is OFF, M4 is ON and the output is equal to zero.
• If Pi = 1, the case is more complicated.


• In the dynamic circuit shown in the figure:

• If ϕ = 0, precharge occurs and the output is 1.
• If ϕ = 1, evaluation occurs.

Figure: Dynamic circuit

• The dynamic Manchester carry chain for the carry bits up to C4 is shown below. C1, C2, C3 and C4 can be taken out by using inverters. The carry input is given as C0.

******************************************************************************
4.4.1. HIGH SPEED ADDERS:

 Discuss about different types of high speed adders. (Apr. 2016)


 Describe the different approaches of improving the speed of the adder. (Nov 2016)

(i) Carry Skip(bypass) Adder:



Design a carry bypass adder and discuss its features. (May 2016)
Explain the carry-propagate adder and show how the generation and propagation
signals are framed. [May 2021]
• It is a high-speed adder. It consists of an adder, an AND gate and an OR gate.
• If P0P1P2P3 = 1, an incoming carry Ci,0 = 1 propagates through the complete adder chain and causes an outgoing carry Co,3 = 1.
• In other words, if (P0P1P2P3 = 1) then Co,3 = Ci,0; otherwise either a DELETE or a GENERATE occurred in the chain.
• This information can be used to speed up the operation of the adder, as shown in figure (b) below.

Figure: Carry Skip Adder.


• When BP = P0P1P2P3 = 1, the incoming carry is forwarded immediately to the next block.
• Hence the name carry bypass adder or carry skip adder.
• Idea: if (P0 and P1 and P2 and P3 = 1) then Co,3 = Ci,0, else "kill" or "generate".

Figure: (a) Carry propagation (b) Adding a bypass


• The figure below shows an N-bit carry skip adder.

Figure: N-bit carry bypass adder built from M-bit blocks (Bit 0-3, Bit 4-7, Bit 8-11, Bit 12-15). Each block has setup, carry propagation and sum stages; t_setup, t_bypass and t_sum are marked along the worst-case path.

$t_{adder} = t_{setup} + M\,t_{carry} + (N/M - 1)\,t_{bypass} + (M - 1)\,t_{carry} + t_{sum}$ (worst case)

t_setup: overhead time to create the G, P and D signals

Figure: Manchester carry-chain implementation of bypass adder
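The worst-case delay expressions for the ripple and bypass adders can be compared numerically. The sketch below uses hypothetical function names and unit delays that are chosen only for illustration; real values depend on the technology and the circuit style.

```python
def ripple_delay(n, t_carry, t_sum):
    """Worst-case ripple-carry delay: (N - 1) t_carry + t_sum."""
    return (n - 1) * t_carry + t_sum


def bypass_delay(n, m, t_setup, t_carry, t_bypass, t_sum):
    """Worst-case carry-bypass delay for N bits split into blocks of M bits."""
    blocks = n // m
    return (t_setup + m * t_carry          # carry generated in the first block
            + (blocks - 1) * t_bypass      # skipped over the intermediate blocks
            + (m - 1) * t_carry + t_sum)   # rippled and summed in the last block


if __name__ == "__main__":
    # Illustrative unit delays (assumed, not taken from the text).
    print(ripple_delay(16, 1, 1), bypass_delay(16, 4, 1, 1, 1, 1))
```

With these unit delays the 16-bit bypass adder with 4-bit blocks has a worst-case delay of 12 units against 16 for the ripple adder, and the gap widens as N grows.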

(ii) Carry Select Adder:

Design a carry select adder and discuss its features. (May 2016)

• A carry-select adder is a particular way to implement an adder, which is a logic element that computes the (n+1)-bit sum of two n-bit numbers.
• The carry-select adder is simple but rather fast, having a gate-level depth of O(√n).
• The carry-select adder generally consists of two ripple carry adders and a multiplexer.
• Adding two n-bit numbers with a carry-select adder is done with two adders, so that the calculation is performed twice:
• once with the assumption that the carry-in is zero, and once assuming that it is one.
• After the two results are calculated, the correct sum and the correct carry-out are selected with the multiplexer once the correct carry-in is known.
• The number of bits in each carry-select block can be uniform or variable.
• In the uniform case, the optimal delay occurs for a block size of √n.
• The O(√n) delay is derived from uniform sizing, where the ideal number of full-adder elements per block is equal to the square root of the number of bits being added.
• With variable (progressively larger) blocks, the number of stages P is approximately √(2N) for an N-bit adder, so the propagation delay is proportional to √(2N).
• Below is the basic building block of a carry-select adder, where the block size is 4.

 Two 4-bit ripple carry adders are multiplexed together, where the resulting carry and sum
bits are selected by the carry-in.

Figure: Building blocks of a carry-select adder


Uniform-sized adder:
 A 16-bit carry-select adder with a uniform block size of 4 can be created with three of
these blocks and a 4-bit ripple carry adder.
 Since carry-in is known at the beginning of computation, a carry select block is not
needed for the first four bits.
 The delay of this adder will be four full adder delays, plus three MUX delays.
• $t_{adder} = t_{setup} + M\,t_{carry} + (N/M)\,t_{mux} + t_{sum}$
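The uniform 16-bit organization described above can be modelled behaviourally. In this sketch the names are hypothetical, and each 4-bit ripple block is represented by ordinary integer addition (an assumption made to keep the example short); the point is the duplicated computation and the multiplexer selection.

```python
def ripple_block(x, y, cin, m):
    """Behavioural stand-in for an m-bit ripple-carry block: returns (sum, carry_out)."""
    total = x + y + cin
    return total & ((1 << m) - 1), total >> m


def carry_select_add(a, b, n=16, m=4, cin=0):
    """n-bit carry-select adder built from uniform m-bit blocks."""
    mask = (1 << m) - 1
    result, carry = 0, cin
    for pos in range(0, n, m):
        x, y = (a >> pos) & mask, (b >> pos) & mask
        if pos == 0:
            # The carry-in is known at the start, so the first block is a plain ripple adder.
            s, carry = ripple_block(x, y, carry, m)
        else:
            s0, c0 = ripple_block(x, y, 0, m)            # result assuming carry-in = 0
            s1, c1 = ripple_block(x, y, 1, m)            # result assuming carry-in = 1
            s, carry = (s1, c1) if carry else (s0, c0)   # multiplexer picks one
        result |= s << pos
    return result, carry
```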


Figure: general structure of 16 bit adder


Disadvantage: hardware cost is increased.

(iii) Carry Save Adder:


• A carry save adder is similar to a full adder. It is used when adding multiple numbers.
• All the bits of a carry save adder work in parallel.
• In a carry save adder the carry does not propagate, so it is faster than a carry propagate adder.
• It has three inputs and produces two outputs, a sum and a carry-out. The carry-out is saved rather than being used immediately to find the final sum value.


Figure: Carry Save Adder
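A carry save adder reduces three operands to a sum word and a saved carry word without any carry rippling between bit positions. The following sketch (hypothetical function names) shows the idea on whole words using bitwise operations, with one carry-propagate addition at the very end.

```python
def carry_save(x, y, z):
    """Reduce three operands to (sum_word, carry_word); every bit works in parallel."""
    s = x ^ y ^ z                             # per-bit sum without carry propagation
    c = ((x & y) | (y & z) | (x & z)) << 1    # saved carries, weighted one position higher
    return s, c


def add_many(values):
    """Add a list of numbers with a chain of carry-save stages."""
    s, c = 0, 0
    for v in values:
        s, c = carry_save(s, c, v)
    return s + c                              # single carry-propagate addition at the end
```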


******************************************************************************
4.5. ALUs (ACCUMULATOR):
Briefly discuss about ALUs (accumulators).

• The accumulator acts as a part of the ALU and is identified as register A. The result of an operation performed in the ALU is stored in the accumulator.
• It is used to hold the data for manipulation (arithmetic and logical).
• Arithmetic functions, for example multiplication, are very important in VLSI.
• A half adder circuit has two inputs and two outputs: $S = x \oplus y$, $C = x \cdot y$.

Figure: Half adder

• A full adder circuit has three inputs and two outputs.


Figure : Full adder and truth table


CPL --- Complementary Pass Logic

Figure : CPL Full adder design


******************************************************************************

4.6. MULTIPLIERS:

 Explain the design and operation of 4 x 4 multiplier circuit. (Apr. 2016, 2017, Nov 2016, 2018)
 Design a multiplier for 5 bit by 3 bit. Explain its operation and summarize the numbers of
adders. Discuss it over Wallace multiplier. (Nov 2017, April 2018)
 Design a 4 bit unsigned array multiplier and analyze its hardware complexity. (April 2019-
13M) (Nov 2019)
 Describe the hardware architecture of a 4-bit signed array multiplier. [Nov/Dec 2022]

 A study of computer arithmetic processes will reveal that the most common requirements
are for addition and subtraction.
 There is also a significant need for a multiplication capability.
 Basic operations in multiplication are given below.
0x0=0, 0x1=0, 1x0=0, 1x1=1

          1 0 1 0 1 0    Multiplicand
x             1 0 1 1    Multiplier
          1 0 1 0 1 0
        1 0 1 0 1 0
      0 0 0 0 0 0        Partial products
+   1 0 1 0 1 0
    1 1 1 0 0 1 1 1 0    Result

• If two different 4-bit numbers (x0, x1, x2, x3 and y0, y1, y2, y3) are multiplied, then


Multiplication by shifting:
• Let x = (0010)2 = (2)10.
• To multiply it by 2, shift x left by one bit: x = (0100)2 = (4)10.
• To divide it by 2, shift x right by one bit: x = (0001)2 = (1)10.
• So, a shift register can be used for multiplication or division by 2.

• A practical implementation is based on this sequence: the product is obtained by successive addition and shift-right operations.
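The add-and-shift-right sequence can be sketched as a register-level model. The register names (`acc` for the upper half of the product register, `q` for the lower half that initially holds the multiplier) are assumptions made for this illustration.

```python
def shift_add_multiply(x, y, n=4):
    """Multiply two unsigned n-bit numbers with n add-and-shift-right steps."""
    acc = 0                      # upper half of the 2n-bit product register
    q = y                        # lower half initially holds the multiplier
    for _ in range(n):
        if q & 1:                # examine the current multiplier LSB
            acc += x             # add the multiplicand into the upper half
        # Shift the combined (acc, q) register right by one bit.
        q = (q >> 1) | ((acc & 1) << (n - 1))
        acc >>= 1
    return (acc << n) | q
```

For the worked example above, `shift_add_multiply(0b101010, 0b1011, n=6)` gives 462, which is 111001110 in binary.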

(i) Array multiplier:

Figure: General block diagram of multiplier


• An array multiplier uses an array of cells for calculation.
• The multiplier circuit is based on a repeated addition and shifting procedure. Each partial product is generated by multiplying the multiplicand by one multiplier digit.
• The partial products are shifted according to their bit positions and then added.
• N-1 adders are required, where N is the number of multiplier bits.
• The method is simple, but when ripple carry adders are used the delay is high and the array consumes a large area. The product expression is given below.


Figure: 4 x 4 array multiplier


• This multiplier can accept all the inputs at the same time. An array multiplier for n-bit words needs n(n-2) full adders, n half adders and n² AND gates.
Figure: 4 x 4 array multiplier using full adders, half adders and AND gates (inputs X3 to X0 and Y3 to Y0, product bits Z7 to Z0; the first adder row is HA, FA, FA, HA and the next two rows are FA, FA, FA, HA).
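The AND-gate array and the hardware counts quoted above can be reproduced with a short model. The function names are hypothetical, and the adder rows are represented by a plain sum of the shifted partial products rather than gate by gate.

```python
def partial_products(x, y, n=4):
    """Row i is the multiplicand ANDed with multiplier bit y_i, shifted left by i."""
    rows = []
    for i in range(n):
        yi = (y >> i) & 1
        row = sum((((x >> j) & 1) & yi) << j for j in range(n))   # n AND gates per row
        rows.append(row << i)
    return rows


def array_multiply(x, y, n=4):
    """Product as the sum of the shifted partial-product rows."""
    return sum(partial_products(x, y, n))


def hardware_count(n):
    """Gate counts for an n x n array multiplier, using the formulas in the text."""
    return {"and_gates": n * n, "full_adders": n * (n - 2), "half_adders": n}
```

For n = 4 the counts are 16 AND gates, 8 full adders and 4 half adders, which matches the three adder rows in the figure.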

(ii) Booth (encoding) multiplier:
• Booth's algorithm lends itself to an efficient hardware implementation of a digital circuit that multiplies two binary numbers in two's complement notation.
• Booth multiplication is a fast technique that allows for smaller, faster multiplication circuits by recoding the numbers that are multiplied.
• Booth multipliers are widely used in ASIC-oriented products due to their higher computing speed and smaller area.
• In the binary number system, the digits, called bits, are drawn from the set {0, 1}.
• The result of multiplying any binary number by a single bit is either 0 or the original number.
• This makes the formation of partial products simple and efficient.
• Adding all these partial products, however, is the time-consuming task in any binary multiplier.
• The entire process consists of three steps: partial product generation, partial product reduction and addition of the partial products, as shown in the figure.

Figure: Block diagram of Booth multiplier

• In Booth multiplication, partial product generation is done based on a recoding scheme, e.g. radix-2 encoding.
• The bits of the multiplier are grouped, and the corresponding operation on the multiplicand is performed to generate each partial product.
• In radix-2 Booth multiplication, partial product generation follows the encoding given in the table.
Table: Booth encoding table with RADIX-2

• RADIX-2 PROCEDURE:
1) Append a 0 to the right of the LSB of the multiplier and pair the bits in overlapping groups of two from right to left, as shown in the figure.

Figure: 2-bit pairing as per Booth recoding using radix-2.


2) 00 and 11: do nothing, according to the encoding table.
3) 01: marks the end of a string of 1s; add the multiplicand to the partial product.
4) 10: marks the beginning of a string of 1s; subtract the multiplicand from the partial product.
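The radix-2 procedure can be sketched as a register-level model with an accumulator A, a multiplier register Q and the appended bit Q-1. The function name and the choice of n-bit registers are assumptions for this illustration; the sketch also assumes the multiplicand is not the most negative n-bit value, because its negation does not fit in n bits.

```python
def booth_radix2(m, q, n):
    """Multiply two n-bit two's-complement integers with radix-2 Booth recoding."""
    mask = (1 << n) - 1
    m &= mask
    a, q, q_1 = 0, q & mask, 0              # accumulator, multiplier register, appended 0
    for _ in range(n):
        pair = ((q & 1) << 1) | q_1         # current LSB of Q and the bit to its right
        if pair == 0b10:                    # beginning of a string of 1s: subtract
            a = (a - m) & mask
        elif pair == 0b01:                  # end of a string of 1s: add
            a = (a + m) & mask
        # Arithmetic shift right of the combined A, Q, Q-1 register.
        q_1 = q & 1
        q = (q >> 1) | ((a & 1) << (n - 1))
        a = (a >> 1) | (a & (1 << (n - 1)))
    product = (a << n) | q
    if product & (1 << (2 * n - 1)):        # reinterpret the 2n-bit result as signed
        product -= 1 << (2 * n)
    return product
```

Operands such as -10 and -11 need at least 5 bits in two's complement, so a call like `booth_radix2(-10, -11, 5)` uses n = 5.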

With suitable example and with detailed steps explain Radix-4 modified booth encoding for
an 8-bit signed multiplier. (Nov 2019)

Modified Booth Multiplier using Radix -4:


• The disadvantage of the radix-2 Booth multiplier is the large number of partial products.
• The radix-4 modified Booth multiplier halves the number of partial products.
• Modified Booth multiplication is a technique that allows for smaller, faster circuits by recoding the numbers that are multiplied.
• In radix-4, the multiplicand is encoded based on the multiplier bits, which are examined three at a time using an overlapping technique.
• Grouping starts from the LSB. The first block contains only two bits of the multiplier, and zero is assumed for the third bit.

Figure: Grouping of 3 bits as per Booth recoding

• Each group of binary digits is encoded according to the modified Booth encoding table as one of the numbers from the set {-2, -1, 0, 1, 2}.


Table: Booth encoding table with RADIX-4


 RADIX-4 PROCEDURE: [May 2021 (Model)]


1) Append a 0 to the right of the LSB of the multiplier.
2) Extend the sign bit by one position if necessary so that n is even.
3) According to the value of each 3-bit group, each partial product is 0, +1, -1, +2 or -2 times the multiplicand.
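The radix-4 recoding can be sketched as follows. The function names are hypothetical; each overlapping group of three bits (b_{2k+1}, b_{2k}, b_{2k-1}) is mapped to the digit b_{2k-1} + b_{2k} - 2 b_{2k+1}, which is the standard modified Booth rule and always falls in {-2, -1, 0, 1, 2}.

```python
def booth_radix4_digits(q, n):
    """Recode an n-bit two's-complement multiplier (n even) into n/2 signed digits."""
    if n % 2:
        raise ValueError("extend the sign bit by one position so that n is even")
    bits = [0] + [(q >> i) & 1 for i in range(n)]   # appended 0 to the right of the LSB
    digits = []
    for i in range(0, n, 2):
        low, mid, high = bits[i], bits[i + 1], bits[i + 2]   # overlapping 3-bit group
        digits.append(low + mid - 2 * high)
    return digits                                    # least significant digit first


def booth_radix4_multiply(m, q, n):
    """Sum of the n/2 partial products, each 0, +-1 or +-2 times the multiplicand."""
    return sum(d * m * 4 ** k for k, d in enumerate(booth_radix4_digits(q, n)))
```

An 8-bit multiplier therefore produces only four partial products instead of eight.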

(iii) Wallace tree Multiplier:

• A Wallace tree is an efficient hardware implementation of a digital circuit that multiplies two integer numbers.
• The Wallace tree multiplier has three steps:
(a) Multiply each bit of one of the arguments by each bit of the other, yielding n² results.
(b) Reduce the number of partial products to two by layers of full and half adders.
(c) Group the wires into two numbers and add them with a conventional adder.
• The second step works as follows:
(a) Take any three wires with the same weight and input them into a full adder. The result is an output wire of the same weight and an output wire with a higher weight for each three input wires.

(b) If there are two wires of the same weight left, input them into a half adder.
(c) If there is just one wire left, connect it to the next layer.
• The Wallace tree multiplier has a tree-style structure. It reduces the number of components and the area.
• The architecture of a 4 x 4 Wallace tree multiplier is shown in the figure.
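The three Wallace steps can be sketched with lists of wires grouped by weight. This is an illustrative model only (the function name and the column-list representation are assumptions); it applies rules (a) to (c) layer by layer until every column holds at most two wires and then performs the final addition.

```python
def wallace_multiply(x, y, n=4):
    """Unsigned n x n multiply: AND array, full/half-adder reduction, final adder."""
    # Step (a): n*n partial-product bits, stored per column (weight 2**w).
    cols = [[] for _ in range(2 * n)]
    for i in range(n):
        for j in range(n):
            cols[i + j].append(((x >> i) & 1) & ((y >> j) & 1))
    # Step (b): reduce in layers until every column holds at most two wires.
    while any(len(c) > 2 for c in cols):
        nxt = [[] for _ in range(len(cols) + 1)]
        for w, wires in enumerate(cols):
            k = 0
            while len(wires) - k >= 3:               # full adder on three wires
                a, b, c = wires[k:k + 3]
                nxt[w].append(a ^ b ^ c)
                nxt[w + 1].append((a & b) | (b & c) | (a & c))
                k += 3
            rest = wires[k:]
            if len(rest) == 2:                       # half adder on two leftover wires
                nxt[w].append(rest[0] ^ rest[1])
                nxt[w + 1].append(rest[0] & rest[1])
            elif len(rest) == 1:                     # single wire passes to the next layer
                nxt[w].append(rest[0])
        cols = nxt
    # Step (c): group the remaining wires into two numbers and add them.
    first = sum(c[0] << w for w, c in enumerate(cols) if len(c) >= 1)
    second = sum(c[1] << w for w, c in enumerate(cols) if len(c) == 2)
    return first + second
```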

 Apply radix-2 booth encoding to realize a 4-bit signed multiplier for (-10)*(-11).
(April 2019-15M) [Apr/May 2022][Nov/Dec 2022]
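Solution:
-10 and -11 do not fit in 4-bit two's complement (range -8 to +7), so 5-bit registers are used.
M = -10 = 10110, -M = +10 = 01010, Q = -11 = 10101
            A      Q      Q-1
Initial:    00000  10101  0
Step-I:     01010  10101  0   : last 2 bits are 10; A = A - M
            00101  01010  1   : arithmetic shift right
Step-II:    11011  01010  1   : last 2 bits are 01; A = A + M
            11101  10101  0   : arithmetic shift right
Step-III:   00111  10101  0   : last 2 bits are 10; A = A - M
            00011  11010  1   : arithmetic shift right
Step-IV:    11001  11010  1   : last 2 bits are 01; A = A + M
            11100  11101  0   : arithmetic shift right
Step-V:     00110  11101  0   : last 2 bits are 10; A = A - M
            00011  01110  1   : arithmetic shift right
Product = A Q = 00011 01110 = (110)10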

****************************************************************************
4.7. DIVIDERS

Explain in detail about the design and procedure for dividers.

 There are two types of dividers, Serial divider and Parallel divider. Serial divider is slow
and parallel divider is fast in performance.
• Generally, division is done by repeated subtraction. If 10/3 is to be performed, then
10 - 3 = 7 (the divisor is 3, the dividend is 10)
7 - 3 = 4
4 - 3 = 1
• Here, after 3 subtractions the remainder is 1, which is less than the divisor, so the subtraction is stopped. The quotient is 3.
• Consider an example of binary division using the 1's complement method:
1010 (10d) / 0011 (3d)
Step 1: Find the 1's complement of the divisor.
Step 2: Add it to the dividend.
Step 3: If the carry is 1, add it to the result (end-around carry) to get the difference.
Step 4: Repeat the same procedure until the carry is 0.
Step 5: Then the process is stopped.
1 0 1 0 (10)
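The five steps above can be sketched as a loop. The function name is hypothetical. One detail is an addition of this sketch rather than part of the listed steps: when the running remainder equals the divisor, the 1's complement sum is all ones with no carry, so an explicit check is included to count that last subtraction.

```python
def divide_ones_complement(x, y, n=4):
    """Divide n-bit x by n-bit y by repeated 1's-complement subtraction."""
    if y == 0:
        raise ZeroDivisionError("divisor must be non-zero")
    mask = (1 << n) - 1
    y_comp = ~y & mask                  # Step 1: 1's complement of the divisor
    quotient, remainder = 0, x & mask
    while True:
        total = remainder + y_comp      # Step 2: add to the current dividend
        if total >> n:                  # Step 3: carry = 1, add it back (end-around carry)
            remainder = (total & mask) + 1
            quotient += 1               # one more successful subtraction
        elif total == mask:             # assumed extra check: remainder equals divisor
            remainder = 0
            quotient += 1
        else:                           # Steps 4 and 5: carry = 0, stop
            break
    return quotient, remainder
```

For 1010 / 0011 the loop performs three subtractions (10 to 7, 7 to 4, 4 to 1) and stops with quotient 3 and remainder 1.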


• The basic building blocks of the serial divider are given below.


1. 4 bit adder
2. 4 bit binary up counter
3. 2:1 MUX (4 MUXs are used)
4. D flipflop

• Y0 Y1 Y2 Y3 are complemented and given to the 4-bit adder block (figure shown below).
• X0 X1 X2 X3 are given to the MUXs and the MUX outputs are given to the D flip-flops. The select signal of the MUXs is high; it is also connected to the clear input of the counter.
• The carry output of the adder is connected to the clock enable pin of the counter. The same signal is given to an OR gate, whose output drives the clock enable of the flip-flops.
• The other input of the OR gate is tied to the select signal of the MUXs.
• If X > Y, the carry output C0 of the adder is high.
• After the first subtraction, the counter output is incremented by 1.
• For each subsequent subtraction, the counter output is incremented again.
• If C0 of the adder is low, the clocks of the counter and flip-flops are disabled and counting stops.
• Q3 Q2 Q1 Q0 is the counter output (quotient).
• R3 R2 R1 R0 is the flip-flop output (remainder).


******************************************************************************
4.8. SHIFT REGISTERS:

Design 4 input and 4 output barrel shifter using NMOS logic. (NOV 2018, Nov 2019).
List the several commonly used shifters. Design the shifter that can perform all the
commonly used shifters. [May 2021, NOV 2021]
Elaborate in detail the design of a 4-bit barrel shifter. [Nov/Dec 2022]

• An n-bit rotation is specified by the control word R0-n, and the L/R bit selects left or right shifting.

• For example, let y3 y2 y1 y0 = a3 a2 a1 a0.
If it is rotated by 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3.
If it is rotated by 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1.

Barrel Shifter:
• A barrel shifter is a digital circuit that can shift a data word by a specified number of bits in one clock cycle.
• It can be implemented as a sequence of multiplexers (MUX); in such an implementation the output of one MUX is connected to the input of the next MUX in a way that depends on the shift distance.
• For example, take a four-bit barrel shifter with inputs A, B, C and D. The shifter can cycle the order of the bits ABCD as DABC, CDAB or BCDA; in this case, no bits are lost.
• That is, it can rotate all of the outputs by up to three positions to the right (and thus produce any cyclic combination of A, B, C and D).
• The barrel shifter has a variety of applications, including being a useful component in microprocessors (alongside the ALU).

Figure: 8 X 4 barrel shifter


• The general symbol for a barrel shifter is shown in the figure. The outputs are y3 y2 y1 y0; S0, S1, S2 and S3 are known as shift lines.
• A barrel shifter is often implemented as a cascade of parallel 2×1 multiplexers.
• For an 8-bit barrel shifter, two intermediate signals are used, which shift by four and two bits, or pass the same data, based on the values of S[2] and S[1].
• This signal is then shifted by one bit or passed unchanged by another multiplexer, which is controlled by S[0].
• A common use of a barrel shifter is in the hardware implementation of floating-point arithmetic.

Figure: Barrel Shifter

• A floating-point add or subtract operation requires aligning the two numbers: the smaller number is shifted to the right, increasing its exponent, until it matches the exponent of the larger number.
• This is done by using the barrel shifter to shift the smaller number to the right by the difference of the exponents, in one cycle.
• If a simple shifter were used, shifting by n bit positions would require n clock cycles.
• The disadvantages of the FET array barrel shifter are the threshold voltage drop problem and the parasitic-limited switching time.
• The figure shows an 8 x 4-bit barrel shifter circuit.
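A barrel rotator driven by one-hot shift lines can be modelled as an array of switches. In this sketch (hypothetical function name) the switch at output i and shift line s connects output bit y_i to input bit a_((i+s) mod n), which gives a right rotation by s; the exact wiring of the figure is not reproduced, so this mapping is an assumption.

```python
def barrel_rotate_right(a, shift_lines, n=4):
    """Rotate the n-bit word a right; shift_lines = [S0, S1, ...] with exactly one line active."""
    if len(shift_lines) != n or sum(shift_lines) != 1:
        raise ValueError("exactly one of the n shift lines must be active")
    y = 0
    for i in range(n):                                # output bit y_i
        for s, active in enumerate(shift_lines):
            if active:                                # the switch at (i, s) conducts
                y |= ((a >> ((i + s) % n)) & 1) << i
    return y
```

With S1 active, the word a3 a2 a1 a0 becomes a0 a3 a2 a1, matching the 1-bit right rotation shown earlier, and the result is available in a single step regardless of the shift amount.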

Logarithmic Shifter:
• A shifter with a maximum shift width of M consists of $\log_2 M$ stages, where the i-th stage either shifts over $2^i$ bits or passes the data unchanged.
• A shifter with a maximum shift value of seven bits is shown in the figure. To shift over five bits, the first stage is set to shift mode, the second to pass mode and the last again to shift mode.
• The speed of the logarithmic shifter depends on the shift width in a logarithmic way, since an M-bit shifter requires $\log_2 M$ stages.
• The series connection of pass transistors slows the shifter down for larger shift values.
• The advantage of the logarithmic shifter is that it is more effective for larger shift values in terms of both area and speed.
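The staged operation of the logarithmic shifter can be sketched directly from the binary encoding of the shift amount. The function name and the returned trace list are assumptions added for illustration.

```python
def log_shift_right(value, amount, max_shift=7):
    """Logarithmic right shifter: stage i shifts by 2**i bits or passes the data through."""
    if not 0 <= amount <= max_shift:
        raise ValueError("shift amount out of range")
    stages = max_shift.bit_length()       # 3 stages cover shift amounts 0..7
    trace = []
    for i in range(stages):
        if (amount >> i) & 1:             # control bit i puts stage i in shift mode
            value >>= 1 << i
            trace.append("shift")
        else:
            trace.append("pass")
    return value, trace
```

A shift by five (binary 101) sets the stages to shift, pass, shift, as described in the text.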

******************************************************************************
4.9. SPEED AND AREA TRADE OFF:
Discuss the details about speed and area trade off. (May 2017)
Discuss trade-off between speed Vs area. [Nov/Dec 2022]

Adder:
• The trade-off in terms of power and performance is shown below.
• The performance is represented in terms of the delay (speed).
• The area estimates for each delay are given on the basis that area is related to power consumption.
• The area of a carry lookahead adder is larger than the area of a ripple carry adder for a particular delay.

 This is because the computations performed in a carry lookahead adder are parallel,
which requires a larger number of gates and also results in a larger area.
CLA: Carry lookahead adder; RC, R: Ripple carry adder

Figure: Area vs. delay for 8-bit adder
Figure: Area vs. delay for 16-bit adder
Figure: Area vs. delay for 32-bit adder
Figure: Area vs. delay for 64-bit adder
Figure: Delay vs. area for all adders
Figure: Area vs. delay for all multipliers
• The above figures show that the delay of the ripple carry adder increases much faster than that of the carry lookahead adder as the number of bits is increased.
• In the carry lookahead adder, the cost is in terms of area because the computations are done in parallel, and therefore more power is consumed for a specific delay.


4.10 Memory Architecture and Memory Control Circuits:

Discuss Memory classification and its architecture and building blocks.

4.10.1 Memory Classification:


• The parameters used to characterize a memory device are area, power and speed.
• Area: area is important for physical implementation in VLSI technology. The smaller the area per bit, the more devices can be accommodated, so the cost per bit is reduced.
• Speed: the speed of operation plays a very important role. The memory must be able to communicate with the processor at its speed.
• Power: power is important because MOS memories are used in many battery-operated portable systems, so the power dissipation of the memory plays an important role. Memory devices should consume as little power as possible.
Classification based on operation mode:
1. ROM
2. RAM
Classification based on data storage mode:
This refers to how the data is stored and how long it remains there.
1. Volatile
2. Non-volatile
• Volatile memory devices store information as long as power is on.
• As soon as power is turned off, the information is lost. Static RAM and dynamic RAM belong to the category of volatile memory.
• EPROM and mask-programmable ROM are non-volatile memory devices.
• If the power is turned off, the information is not lost.

Classification based on access method:


1. Random access
2. Non-random access
4.10.2 Memory Architecture and Building Blocks:
Explain the memory architecture and its control circuits in detail. (April 2018)
Illustrate the building blocks of Memory architectures and memory peripheral circuitry
adapted to operate for non-volatile memory. [May 2021]
When an n x m memory is implemented, the n memory words are arranged in a linear fashion.
One word is selected at a time by using a select line.
• To implement an 8 x 8 memory, n = 8 (number of words) and m = 8 (number of bits per word).
• We then need 8 select signals (one for each word).
• By using a decoder, we can reduce the number of select signals.
• With a 3-to-8 decoder, 3 address inputs given to the decoder produce the 8 select signals.
• If n = 2^20, only 20 inputs need to be given to the decoder.
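The saving in select signals follows from the decoder needing only log2(n) address inputs. A minimal sketch, with hypothetical function names, that computes the number of address bits and models the one-hot decoder outputs:

```python
def address_bits(n_words):
    """Number of decoder inputs needed to select one of n_words (a power of two)."""
    if n_words < 1 or n_words & (n_words - 1):
        raise ValueError("n_words must be a power of two")
    return n_words.bit_length() - 1


def decode(address, k):
    """k-to-2**k decoder: returns the one-hot list of word select signals."""
    return [1 if i == address else 0 for i in range(1 << k)]
```

For example, `address_bits(8)` is 3 and `address_bits(2**20)` is 20, while `decode(5, 3)` activates only the select line of word 5.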


Figure: Array structured memory organization

• If the basic storage cell is approximately square, a linear arrangement of the words gives an array that is far taller than it is wide, and the design becomes extremely slow: the vertical wires that connect the storage cells to the I/O become excessively long.
• So, memory arrays are organized in such a way that the vertical and horizontal dimensions are approximately the same.
• Multiple words are stored in a single row, and these words are selected simultaneously.
• The column decoder is used to route the correct word to the I/O terminals.
• The row address is used to select one row of the memory, and the column address is used to select a particular word from that selected row.
• Word line: the horizontal select line that is used to select a single row of cells is known as the word line.
• Bit line: the wire that connects the cells in a single column to the input/output circuit is known as the bit line.
• Sense amplifier: it amplifies the small internal signal swing to full rail-to-rail amplitude.
• Block address: the memory is divided into various small blocks.
• The address that is used to select one of the small blocks to be read or written is known as the block address.
 Advantages:
1. Access time is fast
2. Power saving is good, because blocks not activated are in power saving mode.
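The way one address is divided into block, row and column fields can be sketched as follows. This is a minimal illustration; the field order (block, then row, then column, most significant first), the field widths and the name `split_address` are assumptions, not taken from the figure.

```python
def split_address(address: int, block_bits: int, row_bits: int, col_bits: int):
    """Split a flat address into (block, row, column) fields.

    Assumed field order, most significant first: block | row | column.
    """
    total = block_bits + row_bits + col_bits
    if not 0 <= address < (1 << total):
        raise ValueError("address does not fit in the given field widths")
    column = address & ((1 << col_bits) - 1)             # selects the word in the row
    row = (address >> col_bits) & ((1 << row_bits) - 1)  # selects the word line
    block = address >> (col_bits + row_bits)             # selects the active block
    return block, row, column

# Hypothetical 2-bit block, 3-bit row and 2-bit column address
print(split_address(0b10_101_11, block_bits=2, row_bits=3, col_bits=2))
```

Only the block picked out by the block field needs to be activated, which is where the power saving listed above comes from.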

Figure: Hierarchical memory architecture

4.10.3 Memory Core


Discuss Memory core its types in detail.

4.10.3.1 Read Only Memory (ROM):


ROM is a memory where code is written only one time.
Diode ROM:
 It is simple where presence of diode in between bit line and word line is considered as logic 1
and absence of diode as logic 0.
 Disadvantage: There is no isolation between word line and bit line, so it is suitable only for small memories.

Figure: Diode ROM


MOS ROM:
Q: Draw the NOR and NAND implementation of 4-word, 4-bit ROM. (NOV 2021)
 Diode is replaced by gate source connection of nMOS. Drain is connected to VDD.
 The charging and discharging of word line capacitance has been taken care by the word line
driver.
 In the NOR ROM, the absence of a transistor between word line and bit line means logic 1 is stored, and
the presence of a transistor means logic 0 is stored.
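The NOR ROM coding rule can be illustrated with a small behavioural sketch. The transistor map below is a made-up example, not the contents of the figure, and the name `read_nor_rom` is an assumption.

```python
# Hypothetical 4-word x 4-bit NOR ROM: 1 marks a transistor at the
# (word line, bit line) intersection, 0 marks no transistor.
TRANSISTOR_MAP = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 0],
    [0, 0, 0, 0],
]

def read_nor_rom(word: int) -> list[int]:
    """Return the bits seen on the bit lines when one word line is raised.

    A present transistor pulls its bit line low (logic 0);
    an absent transistor leaves the bit line high (logic 1).
    """
    return [0 if present else 1 for present in TRANSISTOR_MAP[word]]

for w in range(4):
    print(w, read_nor_rom(w))
```

Each stored word is simply the bitwise complement of the corresponding row of the transistor map.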


Figure: 4 x 4 OR ROM cell array
Figure: 4 x 4 MOS NOR ROM

Programming ROM
 The transistor in the intersection of row and column is OFF when the associated word line is
LOW. In this condition, we get logic 1 output.

Figure: 4 x 4 MOS NAND ROM


Advantage: The basic cell consists only of a transistor. No connection to any of the supply
voltages is needed.
Disadvantage: As it uses pseudo-nMOS, it is ratioed logic and consumes static power.
To overcome this, a precharged MOS NOR ROM logic circuit is used.
 This eliminates the static dissipation and the ratioed logic requirement.

4.10.3.2 Non-Volatile READ-WRITE Memory:


 It consists of array of transistors. We can write the program by enabling or disabling these
devices selectively.
 To reprogram, the programmed values must first be erased; then the new programming is started.

Floating gate transistor:


 It is mostly used in all the reprogrammable memories.
 In the floating gate transistor, an extra polysilicon strip, known as the floating gate, is inserted between the gate and the channel.
 The floating gate doubles the gate oxide thickness, hence the device transconductance is reduced
and the threshold voltage is increased.
 The threshold voltage is programmable.
 If a high voltage (>10 V) is applied between the source and the gate-drain terminals,
then a high electric field is generated. So, avalanche injection occurs.
 After acquiring energy, electrons become hot and traverse the first oxide insulator.
They get trapped on the floating gate.
 The floating gate transistor is known as floating gate avalanche injection MOS or FAMOS.
Disadvantage: A high programming voltage is needed.

Figure (a) floating gate transistor (b) symbol


EPROM – Erasable Programmable Read Only Memory:
 Erasing is done by shining UV light on the cells through a transparent window in the package.
 This process takes from a few seconds to several minutes.
 It depends on the intensity of the UV source. Programming takes 5-10 microseconds/word.
 During programming, the chip is removed from the board and placed in an EPROM programmer.
Advantages: The cell is simple, and large memories can be fabricated at low cost.
Disadvantages:
 The number of erase/program cycles is limited to about 1000.
 Reliability is not good.
 The threshold voltage of the device may vary with repeated programming.

EEPROM – E2PROM:
 Electrically Erasable Programmable ROM. Here the floating gate tunneling oxide (FLOTOX) transistor is
used.
 It is similar to the floating gate device except that a portion of the dielectric separating the floating gate from the
channel is reduced in thickness to about 10 nm or less.
 If about 10 V is applied, electrons travel to and from the floating gate through Fowler-Nordheim
tunneling.
 Erasing can be done by reversing the voltage applied during writing.


Figure: FLOTOX transistor


Advantage: High versatility; about 10^5 erase/write cycles are possible.
Disadvantages: Larger than the FAMOS transistor, costly, and repeated programming causes a drift in
threshold voltage.

Flash Memory – Flash Electrically Erasable Programmable ROM


 It is a combination of density of EPROM and versatility of EEPROM.
 Avalanche hot electron injection mechanism is used.
 Erasing can be done by Fowler-Nordheim tunneling concept. Here erasing is done in bulk.

Figure: ETOX device


 It is similar to the FAMOS gate.
 A very thin tunneling oxide layer (about 10 nm thick) is used.
 Erase operation: Erasing is performed with the gate connected to ground and the
source connected to 12 V.
 Write operation: A high voltage pulse is applied to the gate of the selected device. Logic 1 is
applied to the drain and hot electrons are injected into the floating gate.
 Read operation: To select a cell, its word line is raised to 5 V. This causes a conditional
discharge of the bit line.

Figure: (a) Erase (b) Write (c) Read operation of NOR flash memory

4.10.3.3 RAM – Random Access Memory
Explain about static and dynamic RAM.
Construct 6T based SRAM cell. Explain its read and write operations. (NOV 2018)
[Nov/Dec 2022]

4.10.3.3.1 Static RAM:
 An SRAM cell needs 6 transistors per bit.
 The access transistors M5 and M6 are shared between read and write operations.
 The bit line (BL) and inverse bit line signals are both used to improve the noise margin during read
and write operations.
Read operation:
 Let us assume logic 1 is stored at Q, and BL and inverse BL are precharged to 2.5 V before
starting the read operation.
 The read cycle is started by asserting the word line, which enables transistors M5 and M6.
 After a small initial word line delay, the values stored at Q and inverse Q are transferred
to the bit lines: BL is left at 2.5 V and inverse BL is discharged through M1 and M5.

Figure: CMOS SRAM cell


Write operation:
 Assume that Q = 1 and a logic 0 is to be written into the cell.
 Then inverse BL is set to 1 and BL is set to 0.
 The gate of M1 is at VDD and the gate of M4 is at ground as long as the switching has not
commenced.
 Inverse Q is not pulled high enough to ensure the writing of logic 1.
 Its voltage is kept below 0.4 V, so the new value of the cell is written through M6.
4.10.3.3.2 Dynamic RAM:
Three transistors DRAM
 The stored charge leaks away with time, so the content of the cell must be periodically rewritten. This is called the refresh
operation.
 A refresh is needed every 1 to 4 ms. Every dynamic memory needs a refresh operation.
 For example, to write logic 1, BL1 is driven high and the write word line (WWL)
is asserted.
 This data is retained as charge on the capacitor once WWL goes low.

Figure: Three transistor dynamic memory cell


 To read the cell, the read word line (RWL) is raised. Transistor M2 is either ON or OFF
depending upon the stored value.
 Bit line BL2 is either tied to VDD or precharged to VDD or VDD-Vt.
 When logic 1 is stored, the series combination of M2 and M3 pulls the BL2 line low.
 If logic 0 is stored, then the BL2 line stays high.
 To refresh the cell, the stored data is first read, then the inverse of the value read on BL2 is placed on BL1 and the WWL
line is asserted.
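The inverting read and the refresh sequence of the three-transistor cell can be summarised in a behavioural sketch. Timing and leakage are ignored, and the class and method names are assumptions made for this example.

```python
class ThreeTransistorCell:
    """Behavioural model of a 3T DRAM cell (no timing, no leakage)."""

    def __init__(self) -> None:
        self.stored = 0  # logic value held on the storage capacitor

    def write(self, bl1: int) -> None:
        """Assert WWL: the value on BL1 is copied onto the capacitor."""
        self.stored = bl1

    def read(self) -> int:
        """Raise RWL: BL2 starts high and is pulled low if a 1 is stored."""
        bl2 = 1                # precharged bit line
        if self.stored == 1:   # M2 and M3 both conduct
            bl2 = 0
        return bl2

    def refresh(self) -> None:
        """Read the cell, invert BL2, and write the result back through BL1."""
        self.write(1 - self.read())

cell = ThreeTransistorCell()
cell.write(1)
print(cell.read())   # BL2 carries the inverse of the stored value
```

Because the read is inverting, the refresh has to invert the value seen on BL2 before writing it back.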
One transistor DRAM:
 In this cell, to write logic 1, the value is placed on the bit line and the word line is asserted high.
 The capacitor is charged or discharged depending upon the data. Before performing a read
operation, the bit line is precharged.
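Reading the one-transistor cell relies on charge sharing between the storage capacitor and the precharged bit line. The relation used below is the standard charge-conservation result and is not stated in these notes; the symbol names and all numeric values are assumptions chosen only for illustration, and the threshold drop across the access transistor is ignored.

```python
def bitline_swing(v_bit: float, v_pre: float, c_s: float, c_bl: float) -> float:
    """Bit-line voltage change after the access transistor turns on.

    Charge conservation: dV = (v_bit - v_pre) * c_s / (c_s + c_bl).
    """
    return (v_bit - v_pre) * c_s / (c_s + c_bl)

# Assumed example values: 30 fF cell, 300 fF bit line, 2.5 V supply,
# bit line precharged to 1.25 V.
swing_one = bitline_swing(v_bit=2.5, v_pre=1.25, c_s=30e-15, c_bl=300e-15)
swing_zero = bitline_swing(v_bit=0.0, v_pre=1.25, c_s=30e-15, c_bl=300e-15)
print(swing_one, swing_zero)
```

With these assumed values the swing is only about a tenth of a volt in either direction, which is why a sense amplifier is needed on every bit line.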

Figure: One transistor DRAM

4.10.3.3.3 CAM – Content Addressable or Associative Memory

Explain about CAM.

 It supports 3 operating modes,


 Read
 Write
 Match
 In this memory, it is possible to compare all the stored data in parallel with the incoming
data. It is not power efficient.
 The figure shows a possible implementation of a CAM array.
 The cell combines a traditional 6T RAM storage cell (M4-M9) with additional circuitry
to perform a 1-bit digital comparison (M1-M3).
 When the cell is to be written, complementary data is forced onto the bit lines, while the word
line is enabled as in a standard SRAM cell.

 In the compare mode, the stored data are compared with the incoming data on the bit lines. The match line is
connected to all CAM blocks in a row, and it is initially precharged to VDD.
 If a bit matches, the internal node of that cell is discharged and the match line is not affected. If even one bit in a row is
mismatched, then the match line is pulled low.
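The match operation can be sketched behaviourally as below. The function name `match_lines` and the stored words are assumptions made for the example; the sketch models only the logic, not the circuit.

```python
def match_lines(stored_words: list[list[int]], key: list[int]) -> list[int]:
    """Return one match-line value per row (1 = match, 0 = mismatch).

    Each match line starts precharged high and is pulled low as soon as
    any bit in the row differs from the corresponding key bit.
    """
    lines = []
    for word in stored_words:
        line = 1                      # precharged to VDD
        for stored_bit, key_bit in zip(word, key):
            if stored_bit != key_bit:
                line = 0              # a single mismatch discharges the line
        lines.append(line)
    return lines

# Hypothetical contents: rows 0 and 2 hold the search key, row 1 does not
print(match_lines([[1, 0, 1], [1, 1, 1], [1, 0, 1]], [1, 0, 1]))
```

In hardware all rows are compared at the same time, which is the source of both the speed and the high power consumption of a CAM.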

Figure: CAM cell

*****************************************************************************

4.11 Memory peripheral (control) Circuits:


Explain the memory architecture and its control circuits in detail. (April 2018)
Illustrate the building blocks of Memory architectures and memory peripheral circuitry
adapted to operate for non-volatile memory. [May 2021]

(i) Address & Block Decoders:


Row Decoder:
 Row and column address decoder are used to select the particular memory location in an
array.
 Row decoder is used to drive NOR ROM array. It selects one of 2^n word lines.
 Dynamic 2 to 4 decoder reduces the number of transistors and propagation delay.
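The logic function of a 2-to-4 NOR decoder can be written out as a short sketch. It models only the static truth table, not the dynamic (precharge and evaluate) behaviour, and the function names are assumptions.

```python
def nor(*inputs: int) -> int:
    """Logical NOR of any number of 0/1 inputs."""
    return 0 if any(inputs) else 1

def nor_decoder_2to4(a1: int, a0: int) -> list[int]:
    """Word lines WL0..WL3 of a 2-to-4 decoder built from NOR gates."""
    na1, na0 = 1 - a1, 1 - a0
    return [
        nor(a1, a0),    # WL0 is high only for address 00
        nor(a1, na0),   # WL1 is high only for address 01
        nor(na1, a0),   # WL2 is high only for address 10
        nor(na1, na0),  # WL3 is high only for address 11
    ]

for address in range(4):
    print(address, nor_decoder_2to4(address >> 1, address & 1))
```

Each word line is the NOR of the address literals that must all be 0 for that word to be selected, so exactly one output is high for every address.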

Figure: Symbol and truth table of the dynamic 2-to-4 NOR decoder

Column Decoder
 It should match the bit line pitch of the memory array.

 In column decoder, decoder outputs are connected to nMOS pass transistors.
 By using this circuit, we can selectively drive one out of m pass transistors.
 Only one nMOS pass transistor is ON at a time.

Figure: Four-input pass-transistor-based column decoder using a NOR predecoder

(ii) Sense Amplifier


 Sense amplifiers play a major role in the functionality, performance and reliability
of memory circuits.
 The basic differential sense amplifier circuit is shown in the figure below.
 It performs the following functions:

Amplification:
 In memory structures such as the 1T DRAM, amplification is required for proper
functionality.
Delay Reduction:

The amplifier compensates for the restricted fan-out driving capability of the memory cell by
detecting and amplifying small transitions on the bit line to large signal output
swings.
Power reduction:

Reducing the signal swing on the bit lines can eliminate a large part of the power
dissipation related to charging and discharging the bit lines.
(iii) Drivers/ Buffers
 The length of word and bit lines increases with increasing memory sizes.
 A large portion of the read and write access time can be attributed to the
wire delays.
 A major part of the memory-periphery area is allocated to the drivers (address
buffers and I/O drivers).
******************************************************************************

4.12: Low Power Memory design:
 Discuss about Low power memory design. [Apr/May 2022]
 Elucidate in detail low power SRAM circuit. (April 2019-13M) (Nov 2019)

(i) Active Power Reduction:


 Reducing the voltage while maintaining data integrity requires an increase in the size of the storage capacitor
and/or a noise reduction.
Techniques for power reduction:
• Half-VDD precharge:
 The bit lines are precharged to VDD/2. This helps to reduce the active power dissipation in
DRAM memories by a factor of about 2.
• Boosted word line:
 Raising the value of the word line above VDD during a write operation eliminates
the threshold drop over the access transistor, yielding a substantial increase in
stored charge.
• Increased capacitor area or value:
 Keeping the "ground" plate of the storage capacitor at VDD/2 reduces the
maximum voltage over Cs, making it possible to use thinner oxides.
• Increasing the cell size:
 Ultra-low-voltage DRAM memory operation might require a sacrifice in area
efficiency.
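The factor of about 2 quoted for half-VDD precharge can be checked with a first-order estimate. The sketch assumes that the energy drawn from the supply to move a bit line is C x VDD x (voltage swing) and that the VDD/2 level itself is reached without drawing supply energy; the capacitance and supply values are illustrative assumptions, not data from these notes.

```python
def bitline_energy(c_bl: float, vdd: float, swing: float) -> float:
    """Energy drawn from the supply to move a bit line by `swing` volts."""
    return c_bl * vdd * swing

C_BL = 300e-15   # assumed bit-line capacitance (farads)
VDD = 2.5        # assumed supply voltage (volts)

full = bitline_energy(C_BL, VDD, VDD)       # bit line swings over the full rail
half = bitline_energy(C_BL, VDD, VDD / 2)   # bit line swings from VDD/2 to a rail
print(full / half)
```

Under these assumptions the ratio is 2, independent of the particular capacitance and supply values chosen.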
(ii) Retention current Reduction:
 Ideally, an SRAM array should not have any static power dissipation. But the leakage current of the
transistors is a major problem, and it is the main source of the retention current.
 The retention current can be reduced by the following techniques.
1. Turn OFF unused memory blocks.
2. Apply a negative bias voltage to the cells which are not active, thus reducing the leakage current.
3. If a low threshold voltage transistor is inserted between VDD and the SRAM array, leakage
reduces.
4. Leakage is a function of VDD; thus, if the supply rail is lowered, the leakage current is
reduced.

Figure: (a) Insertion of low threshold device (b) Reducing supply Voltage
******************************************************************************
Programmable devices (Programmable ASIC):
 Programmable devices can be divided into three areas
1. Programmable logic structure
2. Programmable interconnect
3. Reprogrammable Gate array
 A programmable logic device (PLD) is an electronic component used to build
reconfigurable digital circuits.
 Unlike a logic gate, which has a fixed function, a PLD has an undefined function at the
time of manufacture.
1. Programmable Logic Structure:
 Describe in detail the chip with programmable logic structures. (Nov 2009)
(a) Programmable Logic Array:
 A programmable logic array (PLA) is a type of fixed-architecture logic device with a
programmable AND array followed by a programmable OR array.
 The logic array is the structural unit which can be programmed to perform various functions.
 A PLA can be implemented as an AND-OR plane device.
 The structure of the AND-OR PLA is shown below.

Figure: Programmable logic array


 PLA is used to implement a complex combinational circuit.
 The AND and OR gates inside the PLA are initially fabricated with fuses among them.
 The specific Boolean functions are implemented in sum of products (SOP) form by
blowing appropriate fuses and leaving the desired connections.
 For an example, the Boolean expressions are,


Figure: PLA with three inputs, four product terms and two outputs
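The AND-plane / OR-plane idea can be sketched in a few lines. The product terms and output connections below are hypothetical; they only match the size of the figure (three inputs, four product terms, two outputs) and are not claimed to be the functions drawn in it.

```python
# Hypothetical PLA with inputs A, B, C, four product terms and two outputs.
# Each product term maps an input name to the required value (1 = true
# literal, 0 = complemented literal); unlisted inputs are don't-cares.
AND_PLANE = [
    {"A": 1, "B": 0},          # A.B'
    {"A": 1, "C": 1},          # A.C
    {"B": 1, "C": 1},          # B.C
    {"A": 0, "B": 1, "C": 0},  # A'.B.C'
]
# Each output lists the product terms connected to its OR gate.
OR_PLANE = {"F1": [0, 1, 3], "F2": [1, 2]}

def evaluate_pla(inputs: dict[str, int]) -> dict[str, int]:
    """Evaluate the AND plane, then OR the selected product terms."""
    terms = [int(all(inputs[name] == value for name, value in term.items()))
             for term in AND_PLANE]
    return {out: int(any(terms[i] for i in idx))
            for out, idx in OR_PLANE.items()}

print(evaluate_pla({"A": 1, "B": 0, "C": 0}))
```

Programming the PLA corresponds to choosing the entries of `AND_PLANE` and `OR_PLANE`; note that the product term A.C is shared by both outputs.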
(b) PAL (Programmable Array Logic) Architecture:
 The PAL is a programmable logic device with a fixed OR array and a programmable
AND array.

Figure: Programmable Array Logic


 Because only the AND gates are programmable, the PAL is easier to program than
the PLA, but it is not as flexible.


Figure: Example of PAL circuit


Reprogrammable Gate array:
 A field programmable gate array (FPGA) is a VLSI circuit that can be programmed
at the user’s location.
 A typical FPGA consists of an array of millions of logic blocks, surrounded by
programmable input and output blocks and connected together via programmable
interconnections.
 There is a wide variety of internal configurations within this group of devices.
 The performance of each type of device depends on the circuit contained in its logic
blocks and the efficiency of its programmed interconnections.


Figure: Programmable-logic-device approaches: (a) CPLD (b) FPGA.


 A typical FPGA logic block consists of lookup tables, multiplexers, gates, and flip-
flops.
 A lookup table is a truth table stored in an SRAM and provides the combinational
circuit functions for the logic block.
 The combinational logic section, along with a number of programmable
multiplexers, is used to configure the input equations for the flip-flop and the
output of the logic block.
 The advantage of using RAM instead of ROM to store the truth table is that the table
can be programmed by writing into memory.
 The disadvantage is that the memory is volatile and presents the need for the lookup
table’s content to be reloaded in the event that power is disrupted.
 The program can be downloaded either from a host computer or from an onboard
PROM.
 The program remains in SRAM until the FPGA is reprogrammed or the power is
turned off. The device must be reprogrammed every time power is turned on.
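A lookup table used as a logic function can be modelled as a small writable memory addressed by the inputs. The sketch below is illustrative only; the class name `LookupTable` and its methods are assumptions, not a vendor API.

```python
class LookupTable:
    """k-input LUT: the truth table is held in writable (SRAM-like) storage."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.table = [0] * (1 << k)   # in a real device, lost at power-off

    def program(self, truth_table: list[int]) -> None:
        """Load a new truth table, as configuration at power-up would."""
        if len(truth_table) != 1 << self.k:
            raise ValueError("truth table must have 2**k entries")
        self.table = list(truth_table)

    def evaluate(self, *inputs: int) -> int:
        """Use the inputs as an address into the stored truth table."""
        address = 0
        for bit in inputs:            # first input is the most significant bit
            address = (address << 1) | bit
        return self.table[address]

lut = LookupTable(2)
lut.program([0, 1, 1, 0])             # 2-input XOR
print(lut.evaluate(0, 1), lut.evaluate(1, 1))
```

Reprogramming the same LUT with a different table, for example `[0, 0, 0, 1]` for AND, changes the logic function without changing any hardware.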
******************************************************************************

Programming technology used in FPGA :

Discuss the different types of programming technology used in FPGA design. (NOV 2016)

 There are three types of programming technology.


 Fusible link programming (Anti fuse)
 SRAM Programming
 EPROM and EEPROM programming
5.6.1: Fusible link programming:
 In this type, a metal such as platinum or titanium-tungsten is used to form the link.
 The link is blown when the current through the fuse exceeds a certain value. A higher voltage is applied to the
device to blow the fuses.

Draw and explain the operation of metal-metal antifuse and EPROM transistor. (June 2012)


ANTIFUSE:
 In an FPGA, the device is programmed by changing the characteristic of a switching element;
in this way the routing itself is programmed.
 Programmable routing can be explained using the products of companies such as Actel and
QuickLogic.
 In Actel devices, the interconnect is programmed with the PLICE antifuse.
 PLICE means Programmable Low Impedance Circuit Element.
 An antifuse normally has a high resistance (>100 MΩ), which is changed into a low resistance (200-500 Ω) by
applying a programming voltage.
 It consists of an ONO (Oxide-Nitride-Oxide) layer which is sandwiched between a
polysilicon layer and an n+ diffusion.
 Antifuses separate interconnect wires on the FPGA chip, and the programmer blows an
antifuse to make a permanent connection.
 Once an antifuse is programmed, the process can’t be reversed. This is a one-time programmable (OTP)
technology.

In-system programming (ISP):


 It is the possibility to program the chip after it has been assembled on the PCB.
 In QuickLogic devices, the programmable interconnect is provided by the ViaLink (metal-metal
antifuse).

Figure: Metal-metal anti-fuse
Advantages of metal-metal antifuse:
 Advantages of metal-metal antifuse over poly diffusion antifuse are:
1. The connections are direct to metal wiring layers.
2. It is easier to use larger programming currents to reduce the antifuse resistance.

UV-Erasable programming:

Find the reason for referring EPROM technology as floating gate avalanche MOS. (Dec. 2013)
EPROM programming:
 In this type, a floating gate transistor is used.
 The device can be reprogrammed after erasing it with UV light.
 A high electric field causes electrons flowing towards the drain to move across the insulating
gate oxide, where they are trapped on the bottom, floating gate.
 These energetic electrons are hot, and this effect is known as hot-electron injection (or)
avalanche injection.
 This is why EPROM technology is sometimes called floating-gate avalanche MOS (FAMOS).

Figure: EPROM transistor


 (a) With a high (>12 V) programming voltage, VPP, applied to the drain, electrons gain
enough energy to jump onto the floating gate (gate1).
 (b) Electrons stuck on gate1 raise the threshold voltage so that the transistor is always off
for normal operating voltages.
 (c) Ultraviolet light provides enough energy for electrons stuck on gate1 to jump back to
the bulk, allowing the transistor to operate normally.

EEPROM programming:
 Electrically erasable programming is the most popular CMOS technology.
 A very thin oxide between the floating gate and the drain allows electrons to tunnel to or
from the floating gate (the gate is charged or discharged).
 This enables the writing and erasing operations.
Advantages:
 The advantages of EEPROM technology are:
 faster than using a UV lamp
 chips do not have to be removed from the system
 if the system contains circuits to generate both program and erase voltages, it may
use ISP


SRAM Programming
 SRAM programming is shown in figure.
 SRAM configuration cell is constructed from two cross-coupled inverters and uses a
standard CMOS process.
 The configuration cell drives the gates of other transistors on the chip, turning pass
transistors or transmission gates on to make a connection or off to break a connection.
 The cell is programmed using the WRITE and DATA lines.

Figure: SRAM programming


Advantages:
 Designers can reuse chips during prototyping.
 Designers can update or change a system on the fly in reconfigurable hardware.
Disadvantage:
 Need to keep power supply for retaining the connection information.
******************************************************************************
******
 Explain the reprogrammable device architecture with neat diagrams.
 With neat diagram explain the functional blocks in PDA (Programmable Device
Architecture). (AU:June 2015, June 2016)
 With neat sketch explain the CLB, IOB and Programmable interconnects of an
FPGA device. (May 2016)
 Explain about building block architecture of FPGA. (April 2017, 2018, NOV 2018)
 Elucidate in detail the basic FPGA architecture. (April 2019-13M)
 Describe in detail FPGA architecture and explain the main building blocks of
FPGA. (Nov 2019)[Nov/Dec 2022]
 Illustrate the basic building block architectures of FPGA. [May 2021]

Re-Programmable Devices Architecture (FPGA)


 FPGAs provide the next generation of programmable logic devices.
 "Field programmable" refers to the ability of the gate array to be programmed for a specific function by the
user.
 The word "array" indicates a series of columns and rows of gates that can be
programmed by the end user.
 As compared to standard gate arrays, field programmable gate arrays are larger
devices.
 The basic cell structure of an FPGA is more complicated than the basic cell structure of a standard
gate array.
 The programmable logic blocks of an FPGA are called Configurable Logic Blocks (CLBs).
 The FPGA architecture consists of three types of configurable elements:
(i) IOBs – Input/output blocks
(ii) CLBs – Configurable logic blocks
(iii) Resources for interconnection
 The IOBs provide a programmable interface between the internal array of logic blocks
(CLBs) and the device’s external package pins.
 CLBs perform user-specified logic functions.
 The interconnect resources carry signals among the blocks.
 A configuration program is stored in internal static memory cells.
 The configuration program determines the logic functions and the interconnections.
 The configuration data is loaded into the device at power-up or by a reprogramming function.
 FPGA devices are customized by loading configuration data into the internal memory cells.

The structure of FPGA:


The basic elements of the FPGA structure:

1.Logic blocks
 Based on memories (Flip-flop & LUT – Lookup Table) Xilinx
 Based on multiplexers (Multiplexers)-Actel
 Based on PAL/PLA - Altera
 Transistor Pairs
2. Interconnection Resources
 Symmetrical FPGA-s
 Row-based FPGA-s
 Sea-of-gates type of FPGA-s
 Hierarchical FPGA-s (CPLD)
3. Input-output cells (I/O Cell)
 Possibilities for programming :
a. Input
b. Output
c. Bidirectional

RE-PROGRAMMABLE DEVICE ARCHITECTURE:

Figure: FPGA building blocks structure


 The figure shows the general structure of FPGA chip.
 It consists of a large number of programmable logic blocks surrounded by programmable
I/O block.
Configurable Logic Block:

Figure: Various configurable logic blocks


 The programmable logic blocks of FPGA are smaller and less capable than a PLD, but an
FPGA chip contains a lot more logic blocks to make it more capable.
 As shown in figure the logic blocks are distributed across the entire chip.
 These logic blocks can be interconnected with programmable inter connections.
 The programmable logic blocks of FPGAs are called Configurable Logic Blocks (CLBs).
 CLBs contain LUT, FF, logic gates and Multiplexer to perform logic functions.
 The CLB contains RAM memory cells and can be programmed to realize any function of
five variables or any two functions of four variables.
 The functions are stored in the truth table form, so the number of gates required to realize
the functions is not important.
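One way to see how a block holding two four-variable tables can also realize a single five-variable function is Shannon expansion: the fifth variable selects between the two tables through a multiplexer. The sketch below illustrates that idea only; it is not claimed to be the exact internal wiring of any particular CLB, and all names are assumptions.

```python
def lut4(table: list[int], a: int, b: int, c: int, d: int) -> int:
    """Four-input lookup: (a, b, c, d) form the address, a is the MSB."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]

def lut5(table_e0: list[int], table_e1: list[int],
         a: int, b: int, c: int, d: int, e: int) -> int:
    """Five-variable function from two 4-input tables and a 2:1 multiplexer."""
    low = lut4(table_e0, a, b, c, d)    # value of f when e = 0
    high = lut4(table_e1, a, b, c, d)   # value of f when e = 1
    return high if e else low

# Hypothetical example: 5-input odd parity. With e = 0 it equals the parity
# of (a, b, c, d); with e = 1 it equals the complement of that parity.
PARITY4 = [bin(i).count("1") % 2 for i in range(16)]
NOT_PARITY4 = [1 - p for p in PARITY4]

print(lut5(PARITY4, NOT_PARITY4, 1, 0, 0, 0, 1))
```

Since the tables store the complete truth table, even a gate-hungry function such as parity costs no more than any other function of the same number of variables.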

Interconnection resources:

Figure: Types of interconnection resources

(a) Symmetrical Arrays


 It consists of logic elements (CLBs) arranged in rows and columns of a matrix, with
interconnect laid out between them.
 This symmetrical matrix is surrounded by I/O blocks which connect it to the outside world.
(b) Row based architecture:
 It consists of alternating rows of logic modules and programmable interconnect tracks.
 Input/output blocks are located on the periphery of the rows.
 One row may be connected to adjacent rows via vertical interconnect.
(c) Hierarchical CPLD:
 This architecture is designed in a hierarchical manner, with the top level containing only logic
blocks and interconnects.
1. Connections within macrocells
2. Local connection resource within the logic block
3. Global connection resource (switch matrix)
(d) Sea of gates structure:
 It consists of logic elements (CLBs) arranged in rows and columns of a matrix in the
channel-less gate array module.
I/O cells(Blocks):
 User-configurable input/output blocks (IOBs) provide the interface between external
package pins and the internal logic.
 Each IOB controls one package pin and can be configured for input, output, or
bidirectional signals.
 Figure shows a three-state bidirectional output buffer.

 When the output enable, OE is ‘1’ the output section is enabled and drives the I/O pad.
 When OE is ‘0’ the output buffer is placed in a high-impedance state.

Figure: A three-state bidirectional output buffer


 We can limit the number of I/O drivers that can be attached to any one VDD and GND
pad.
 This arrangement allows the same pad to be used for both input and output (bidirectional I/O).
 When we want to use the pad as an input, we set OE low and take the data from DATAin.
 We can also build output-only or input-only pads.
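The behaviour of the three-state bidirectional pad can be captured in a small model. High impedance is represented by `None`; the function names and the `external` argument (whatever an off-chip source drives onto the pad) are assumptions made for this sketch.

```python
from typing import Optional

def output_buffer(data_out: int, oe: int) -> Optional[int]:
    """Three-state output buffer: returns None to represent high impedance."""
    return data_out if oe == 1 else None

def pad_value(data_out: int, oe: int, external: Optional[int]) -> Optional[int]:
    """Value on the I/O pad: the buffer drives it when OE = 1,
    otherwise an external source (if any) determines it."""
    driven = output_buffer(data_out, oe)
    return driven if driven is not None else external

def data_in(data_out: int, oe: int, external: Optional[int]) -> Optional[int]:
    """The input path always observes whatever is on the pad."""
    return pad_value(data_out, oe, external)

print(pad_value(1, 1, None))   # output mode: the chip drives the pad
print(data_in(1, 0, 0))        # input mode: the external value is read
```

Setting OE low releases the pad so that the external value appears on DATAin, which is the input mode described above.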
******************************************************************************
FPGA (PROGRAMMABLE ASIC) interconnect routing procedures (Architectures):

 Give short notes on FPGA interconnect routing procedures. (May 2016, May 2021)
 Describe FPGA interconnect routing resources with neat diagram. (April 2019-13M)
 Give a note on standard cell design and FPGA interconnecting resources. (Nov 2019)
[Apr/May 2022]

 A routing architecture comprises programmable switches and many wires.

 Routing provides connections between I/O blocks and logic blocks, and between one logic
block and another logic block.
 The type of routing architecture decides the area consumed by routing as well as the density of
logic blocks.
 The routing technique decides the amount of area used by wire segments and programmable
switches as compared to the area consumed by logic blocks.

Types of FPGA interconnect routing procedures:


 Hierarchical Routing Architecture
 Island-Style Routing Architecture
 Xilinx Routing Architecture
 Altera Routing Architecture
 Actel Routing Architecture
(a) Hierarchical Routing Architecture:
 Hierarchical routing architectures separate FPGA logic blocks into distinct groups.

 Connections between the logic blocks within a group can be made using wire segments at
the lowest level of the routing hierarchy.
 Connections between the logic blocks in distant groups require the traversal of one or
more levels of routing segments.
 As shown in Figure, only one level of routing directly connects to the logic blocks.
 Programmable connections are represented with the crosses and circles.

Figure: Example of Hierarchical FPGA


(b) Xilinx Routing Architecture:
 In Xilinx routing, connections are made from logic block into the channel through a
connection block.
 As SRAM technology is used to implement Lookup Tables, connection sites are large.
 A logic block is surrounded by connection blocks on all four sides.


Figure: Xilinx Routing Architecture


 They connect logic block pins to wire segments.
 Pass transistors are used to implement connection for output pins, while use of
multiplexers for input pins saves the number of SRAM cells required per pin.
 The logic block pins connecting to connection blocks can then be connected to any
number of wire segments through switching blocks.
 Figure shows the Xilinx routing architecture.
 There are four types of wire segments available:
 General purpose segments that pass through switches in the switch block.
 Direct interconnect connects logic block pins to four surrounding connecting
blocks
 Long line: high fan out uniform delay connections
 Clock lines: clock signal provider which runs all over the chip.

(c) Altera Routing Architecture :


 The Altera routing architecture has a two-level hierarchy.
 At the first level of the hierarchy, 16 or 32 of the logic blocks are grouped into a
Logic Array Block (LAB).
 The channel here is a set of wires that run vertically along the length of the FPGA.
 The figure shows the Altera MAX 5000 routing architecture.
 Tracks are used for four types of connections:
 Connections from output of all logic blocks in LAB.
 Connection from logic expanders.
 Connections from output of logic blocks in other LABs
 Connections to and from Input output pads


Figure: Altera Max 5000 Routing Architecture


 All four types of tracks connect to every logic block in the array block.
 Any track can connect to any input, which makes this routing simple.
 Advantage: It allows tight and efficient packing.
 Disadvantage: A large number of switches is required, which adds to the capacitive load.

(d) Island-Style Routing Architecture:


 As shown in the figure, in island-style FPGAs the logic blocks are arranged in a two-dimensional
mesh with the routing resources evenly distributed throughout the mesh.
 An island-style global routing architecture typically has the routing channels on all four
sides of the logic blocks.
 The number of wires contained in the channel, W, is pre-set during fabrication, and is one
of the key choices made by the architect.
 It employs wire segments of different lengths in each channel to provide the most
appropriate length for each given connection.

Figure: Island-Style Routing Architecture


(e) Actel Routing Architecture:
 Actel's design has more wire segments in horizontal direction than in vertical direction.
 The input pins connect to all tracks of the channel that is on the same side as the pin.
 The output pins extend across two channels above the logic block and two channels
below it.
 Output pin can be connected to all 4 channels that it crosses.
 The switch blocks are distributed throughout the horizontal channels.
 All vertical tracks can make a connection with every incident horizontal track.
 This allows for the flexibility that a horizontal track can switch into a vertical track, thus
allowing for horizontal and vertical routing of same wire.
 The drawback is more switches are required which add up to more capacitive load.

Figure: Actel Routing Architecture

INTERCONNECT:

Explain in detail about the interconnect.


Interconnects in an integrated circuit are the physical connections between two transistors and/or
the external surroundings.

➢ An electronic circuit designer has multiple choices in realizing the interconnections between
the various devices that make up the circuit.
➢ State-of-the-art processes offer multiple layers of aluminium or copper, and at least
one layer of polysilicon. Even the heavily doped n+ and p+ diffusion layers typically used for
the realization of source and drain regions can be employed for wiring purposes. These wires
appear in the schematic diagrams of electronic circuits as simple lines with no apparent impact
on the circuit performance.

The wiring of an integrated circuit forms a complex geometry that introduces the following
parasitics:

1. Capacitive Parasitics
2. Resistive Parasitics and
3. Inductive Parasitics

The capacitive, resistive and inductive parasitics have multiple effects on the circuit's
behaviour, i.e.

➢ They all cause an increase in propagation delay, or equivalently, a drop in performance.


➢ They all have an impact on the energy dissipation and the power distribution.
➢ They all cause the introduction of extra noise sources, which affect the reliability of the circuit.

(FOR UNDERSTANDING- NO NEED TO DRAW)


SCHEMATIC AND PHYSICAL VIEWS OF WIRING OF BUS-NETWORK

It is important that the designer has a clear insight in the parasitic wiring effects, their relative
importance, and their models. This is best illustrated with the simple example as shown above.
Each wire in a bus network connects a transmitter (or transmitters) to a set of receivers and is
implemented as a link of wire segments of various lengths and geometries. Assume that all
segments are implemented on a single interconnect layer, isolated from the silicon substrate
and from each other by a layer of dielectric material. Be aware that the reality may be far more
complex.
Analyzing the behavior of this schematic, which only models a small part of the circuit, is slow
and cumbersome. Fortunately, substantial simplifications can often be made, some of which are
enumerated below,
➢ Inductive effects can be ignored if the resistance of the wire is substantial (this is, for
instance, the case for long aluminum wires with a small cross-section) or if the rise and fall
times of the applied signals are slow.
➢ When the wires are short, the cross-section of the wire is large, or the interconnect material
used has a low resistivity, a capacitance-only model can be used (see the figure for wire
parasitics with capacitance only, shown below).
➢ When the separation between neighbouring wires is large, or when the wires only run together
for a short distance, inter-wire capacitance can be ignored, and all the parasitic capacitance
can be modeled as capacitance to ground. Obviously, the latter problems are the easiest to model,
analyze, and optimize.

WIRE MODELS FOR PARASITICS: (a) WIRE PARASITICS (WITH THE EXCEPTION OF INTER-WIRE RESISTANCE
AND MUTUAL INDUCTANCE); (b) WIRE PARASITICS WITH CAPACITANCE ONLY

For each interconnect parameter listed below we estimate typical values, give simple models to
evaluate its impact, and give a set of rules of thumb for deciding when and where a particular
model or effect should be considered:
1. Capacitance Parameter
2. Resistance Parameter
3. Inductance Parameter

CAPACITANCE INTERCONNECT PARAMETER:

The capacitance of such a wire is a function of its shape, its environment, its distance to the
substrate, and the distance to surrounding wires. An accurate modeling of the wire
capacitance(s) in a state-of-the-art integrated circuit is a non-trivial task and is even today the
subject of advanced research.
In capacitance parameter there are two types of capacitance occurring i.e.
1. Parallel plate Capacitance and
2. Fringe Capacitance
Parallel plate Capacitance:
Consider first a simple rectangular wire placed above the semiconductor substrate, as shown in
figure below. If the width of the wire is substantially larger than the thickness of the insulating
material, it may be assumed that the electrical-field lines are orthogonal to the capacitor plates,
and that its capacitance can be modeled by the parallel-plate capacitance model. Under those
circumstances, the total capacitance of the wire can be approximated as

$$ C_{int} = \frac{\varepsilon_{di}}{t_{di}} \, W L $$

where W and L are respectively the width and length of the wire, and tdi and εdi represent the
thickness of the dielectric layer and its permittivity. SiO2 is the dielectric material of choice in
integrated circuits, although some materials with lower permittivity, and hence lower
capacitance, are coming in use.
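
As a rough illustration of the parallel-plate model, the sketch below evaluates C = (εdi / tdi) · W · L for a wire over SiO2. The wire dimensions are hypothetical values chosen only for the example; the relative permittivity of SiO2 is taken as 3.9.

```python
EPS0 = 8.854e-12   # vacuum permittivity in F/m
EPS_R_SIO2 = 3.9   # relative permittivity of SiO2


def parallel_plate_capacitance(width_m, length_m, t_di_m, eps_r=EPS_R_SIO2):
    """Parallel-plate estimate C = (eps_di / t_di) * W * L, in farads."""
    return eps_r * EPS0 / t_di_m * width_m * length_m


# Hypothetical wire: 1 um wide, 1 mm long, 1 um of SiO2 underneath.
c_wire = parallel_plate_capacitance(1e-6, 1e-3, 1e-6)
print(f"C = {c_wire * 1e15:.2f} fF")
```

Because this estimate ignores the side walls of the wire, it should be read as a lower bound on the real capacitance.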

PARALLEL-PLATE CAPACITANCE MODEL OF INTERCONNECT WIRE


In actuality, this model is too simplistic. To minimize the resistance of the wires while scaling
technology, it is desirable to keep the cross-section of the wire (W×H) as large as possible. On
the other hand, small values of W lead to denser wiring and less area overhead. As a result, we
have over the years witnessed a steady reduction in the W/H- ratio, such that it has even dropped
below unity in advanced processes.
Fringe/ Fringing Capacitance:
Under these circumstances, the capacitance between the side-walls of the wires and the
substrate, called the fringing capacitance, can no longer be ignored and contributes to the
overall capacitance, as shown below.

THE FRINGING-FIELD CAPACITANCE MODEL DECOMPOSES THE CAPACITANCE INTO TWO CONTRIBUTIONS: A
PARALLEL-PLATE CAPACITANCE, AND A FRINGING CAPACITANCE MODELED BY A CYLINDRICAL WIRE WITH A
DIAMETER EQUAL TO THE THICKNESS OF THE WIRE
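Therefore, the parallel-plate capacitance and the fringing capacitance together constitute the
overall capacitance per unit length, which is given as

$$ c_{wire} = c_{pp} + c_{fringe} = \frac{w \, \varepsilon_{di}}{t_{di}} + \frac{2 \pi \varepsilon_{di}}{\log(t_{di}/H)} $$

with w = W - H/2 a good approximation for the width of the parallel-plate capacitor.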

CAPACITANCE COUPLING/ CAPACITANCE COUPLING EFFECT:

The assumption that a wire is completely isolated from its surrounding structures and is only
capacitively coupled to ground becomes untenable in multi-layer interconnect structures. This is
illustrated in the figure, where the capacitance components of a wire embedded in an
interconnect hierarchy are identified. Each wire is coupled not only to the grounded substrate,
but also to the neighbouring wires on the same layer and on adjacent layers. The main difference
is that not all of its capacitive components terminate at the grounded substrate; a large number
of them connect to other wires, which have dynamically varying voltage levels. These floating
capacitors cause crosstalk and so have a negative effect on the circuit.


CAPACITIVE COUPLING BETWEEN WIRES IN INTERCONNECT HIERARCHY

INTERCONNECT CAPACITANCE AS A FUNCTION OF DESIGN RULES

Inter-wire capacitances become a dominant factor in multi-layer interconnect structures. This
effect is more important for wires in the higher interconnect layers, as these wires are farther
away from the substrate. The increasing contribution of the inter-wire capacitance to the total
capacitance with decreasing feature sizes is illustrated by the graph shown, which plots the
capacitive components of a set of parallel wires routed above a ground plane. It is assumed that
the dielectric and wire thickness are held constant while all other dimensions are scaled. When
W becomes smaller than 1.75 H, the inter-wire capacitance starts to dominate.

Wiring Capacitances for 0.25 µm CMOS Technology:
The table rows represent the top plate of the capacitor, the columns the bottom plate. The area
capacitances are expressed in aF/µm², while the fringe capacitances (given in the shaded rows)
are in aF/µm.
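
A minimal sketch of how such table entries are typically combined into a total wire capacitance is given below. The area and fringe values used here (30 aF/µm² and 40 aF/µm) are assumed illustrative numbers, not values taken from the table, and the factor of two assumes that the fringe contribution is counted once for each side of the wire.

```python
def wire_capacitance_af(width_um, length_um, area_cap_af_per_um2, fringe_cap_af_per_um):
    """Total capacitance in aF: area term plus fringe term on both sides."""
    area_term = area_cap_af_per_um2 * width_um * length_um
    fringe_term = 2 * fringe_cap_af_per_um * length_um  # two side walls (assumption)
    return area_term + fringe_term


# Hypothetical wire: 1 um wide and 10 cm (1e5 um) long.
total_af = wire_capacitance_af(1.0, 1e5, 30.0, 40.0)
print(f"C_total = {total_af * 1e-6:.1f} pF")
```

With these assumed numbers the fringe term is larger than the area term, which is consistent with the observation that the parallel-plate model alone is too simplistic for narrow wires.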

Inter-Wire Capacitance per unit wire length for different interconnect layers of 0.25
µm CMOS Technology Process:
The capacitances are expressed in aF/µm, and are for minimally-spaced wires.

RESISTANCE INTERCONNECT PARAMETER:

The resistance of a wire is proportional to its length L and inversely proportional to its
cross-section A. The resistance of a rectangular conductor, as shown in the figure below, can be
expressed as

$$ R = \frac{\rho L}{A} = \frac{\rho L}{H W} $$

where ρ is the resistivity of the material and A = HW is the cross-sectional area of the
rectangular wire. Since H is constant for a given technology, this can be rewritten as

$$ R = R_{\square} \frac{L}{W}, \qquad R_{\square} = \frac{\rho}{H} $$

with R□ the sheet resistance of the material, having units of Ω/sq. If L = W, i.e. for a square
of resistive material, R = R□. This expresses that the resistance of a square conductor is
independent of its absolute size.

To obtain the resistance of a wire, simply multiply the sheet resistance by its ratio (L/W).
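
The sheet-resistance rule can be sketched in a few lines. The sheet resistance of 0.075 Ω/sq used below is an assumed illustrative value for an aluminum layer, and the wire dimensions are hypothetical.

```python
def wire_resistance(sheet_res_ohm_per_sq, length_um, width_um):
    """R = R_sheet * (L / W): sheet resistance times the number of squares."""
    return sheet_res_ohm_per_sq * length_um / width_um


R_SHEET_AL = 0.075  # ohm/sq, assumed value for illustration only

# Hypothetical wire: 10 cm (1e5 um) long and 1 um wide, i.e. 1e5 squares.
print(wire_resistance(R_SHEET_AL, 1e5, 1.0), "ohm")

# Any square has the same resistance, regardless of its absolute size.
print(wire_resistance(R_SHEET_AL, 5.0, 5.0), wire_resistance(R_SHEET_AL, 50.0, 50.0))
```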

Resistivity of Commonly used Conductors/ Interconnect Resistance:

Aluminum is the interconnect material most often used in integrated circuits because of its low
cost and its compatibility with the standard integrated-circuit fabrication process.
Unfortunately, it has a large resistivity compared to materials such as Copper. With
ever-increasing performance targets, this is rapidly becoming a liability, and top-of-the-line
processes are now increasingly using Copper as the conductor of choice.

Typical values of the Sheet Resistance of various Interconnect Materials using 0.25 µm
CMOS Technology:

From the table, we conclude that Aluminum is the preferred material for the wiring of long
interconnections. Polysilicon should only be used for local interconnect. Although the sheet
resistance of the diffusion layer (n+, p+) is comparable to that of polysilicon, the use of
diffusion wires should be avoided due to its large capacitance and the associated RC delay.

INDUCTANCE INTERCONNECT PARAMETER:

The inductance L of a section of a circuit is defined by the voltage drop ΔV that a changing
current passing through it generates:

$$ \Delta V = L \frac{di}{dt} $$

The consequences of on-chip inductance include ringing and overshoot effects, reflections of
signals due to impedance mismatch, inductive coupling between lines, and switching noise due to
L di/dt voltage drops.

It is possible to compute the inductance of a wire directly from its geometry and its
environment. A simpler approach relies on the fact that the capacitance c and the inductance l
(per unit length) of a wire are related by the following expression,

$$ c \, l = \varepsilon \mu $$

with ε and µ respectively the permittivity and permeability of the surrounding dielectric.

Other interesting relations, obtained from Maxwell's laws, can be pointed out. The constant
product of permeability and permittivity also defines the speed at which an electromagnetic
wave can propagate through the medium,

$$ \nu = \frac{1}{\sqrt{l c}} = \frac{1}{\sqrt{\varepsilon \mu}} = \frac{c_0}{\sqrt{\varepsilon_r \mu_r}} $$

where c0 equals the speed of light (30 cm/nsec) in a vacuum.
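
The relation between dielectric constant and propagation speed can be checked numerically. The sketch below assumes a relative permeability of 1 and uses εr = 3.9 for SiO2 as an example value.

```python
import math

C0_CM_PER_NS = 30.0  # speed of light in vacuum, cm/ns


def wave_speed_cm_per_ns(eps_r, mu_r=1.0):
    """Propagation speed v = c0 / sqrt(eps_r * mu_r)."""
    return C0_CM_PER_NS / math.sqrt(eps_r * mu_r)


print(f"vacuum: {wave_speed_cm_per_ns(1.0):.1f} cm/ns")
print(f"SiO2 (eps_r = 3.9): {wave_speed_cm_per_ns(3.9):.1f} cm/ns")
```

A signal in SiO2 therefore travels at roughly half the vacuum speed of light, which sets a lower bound on the flight time across a long wire.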

Considering a lumped RLC model, the series impedance is Z_RL = R + jωL, where ω = 2πf.

If R ≫ ωL, the inductance effect is not important.

If R ≪ ωL, the inductance effect becomes observable in the signal; this happens when the wire
resistance is reduced.

Dielectric constants and wave-propagation speeds for various materials used in electronic
circuits (the relative permeability µr of most dielectrics is approximately equal to 1):

INTERCONNECT MODELING:

Describe about interconnect modelling.


As we know the parasitic elements have an impact on the electrical behaviour of the circuit and
influence its delay, power dissipation, and reliability. To study these effects requires the
introduction of electrical models that estimate and approximate the real behaviour of the wire as
a function of its parameters. These models vary from very simple to very complex depending
upon the effects that are being studied and the required accuracy.

The types of interconnect modelling are:


1. Lumped Model
2. Lumped RC Model- The Elmore Delay
3. Distributed RC line Model/ Distributed rc line Model
4. Transmission Line Model

Lumped Model:

The circuit parasitics of a wire are distributed along its length and are not lumped into a
single position. Yet, when only a single parasitic component is dominant, when the interaction
between the components is small, or when looking at only one aspect of the circuit behaviour, it
is often useful to lump the different fractions into a single circuit element. The advantage of this
approach is that the effects of the parasitic then can be described by an ordinary differential
equation.

As long as the resistive component of the wire is small and the switching frequencies are in the
low to medium range, it is meaningful to consider only the capacitive component of the wire,
and to lump the distributed capacitance into a single capacitor as shown in figure. It is observed
that in this model the wire still represents an equipotential region, and that the wire itself does
not introduce any delay. The only impact on performance is introduced by the loading effect of
the capacitor on the driving gate. This capacitive lumped model is simple, yet effective, and is
the model of choice for the analysis of most interconnect wires in digital integrated circuits.

DISTRIBUTED VERSUS LUMPED CAPACITANCE MODEL OF WIRE. CLUMPED = L×CWIRE, WITH L THE
LENGTH OF THE WIRE AND CWIRE THE CAPACITANCE PER UNIT LENGTH. THE DRIVER IS
MODELED AS A VOLTAGE SOURCE AND A SOURCE RESISTANCE RDRIVER

The operation of this simple RC network is described by the following ordinary differential
equation,

$$ C_{lumped} \frac{dV_{out}}{dt} + \frac{V_{out} - V_{in}}{R_{driver}} = 0 $$
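
For a step input, a first-order network of this kind has the familiar exponential response with time constant τ = Rdriver × Clumped. The sketch below evaluates that response and the 50% delay; the driver resistance and wire capacitance are hypothetical values.

```python
import math


def lumped_rc_step(t, r_driver, c_lumped, v_step=1.0):
    """Output voltage at time t after a step of height v_step at the input."""
    tau = r_driver * c_lumped
    return v_step * (1.0 - math.exp(-t / tau))


def delay_50_percent(r_driver, c_lumped):
    """Time for the output to reach 50% of the step: ln(2) * R * C."""
    return math.log(2) * r_driver * c_lumped


# Hypothetical driver of 10 kOhm loaded by an 11 pF lumped wire capacitance.
R_DRIVER, C_LUMPED = 10e3, 11e-12
t50 = delay_50_percent(R_DRIVER, C_LUMPED)
print(f"tau = {R_DRIVER * C_LUMPED * 1e9:.0f} ns, t50 = {t50 * 1e9:.1f} ns")
```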

Lumped RC Model/ The Elmore Delay:

On-chip metal wires of over a few mm length have a significant resistance. The equipotential
assumption, presented in the lumped-capacitor model, is no longer adequate, and a resistive-
capacitive model has to be adopted.

A first approach lumps the total wire resistance of each wire segment into one single R and
similarly combines the global capacitance into a single capacitor C. This simple model, called
the lumped RC model, is pessimistic and inaccurate for long interconnect wires, which are more
adequately represented by a distributed rc-model. Yet, before analyzing the distributed model, it
is worthwhile to spend some time on the analysis and the modeling of lumped RC networks for
the following reasons:
1. The distributed rc-model is complex and no closed form solutions exist. The behaviour of the
distributed rc- line can be adequately modeled by a simple RC network.
2. A common practice in the study of the transient behavior of complex transistor-wire networks
is to reduce the circuit to an RC network. Having a means to analyze such a network effectively
and to predict its first-order response adds a great asset to the designer's toolbox.

TREE- STRUCTURED RC NETWORK

An interesting result of this particular circuit topology is that there exists a unique resistive path
between the source node s and any node i of the network. The total resistance along this path is
called the path resistance Rii. For example, the path resistance between the source node s and
node 4 equals

$$ R_{44} = R_1 + R_3 + R_4 $$

The definition of the path resistance can be extended to address the shared path resistance Rik,
which represents the resistance shared among the paths from the root node s to nodes k and i:

$$ R_{ik} = \sum R_j, \qquad R_j \in [\mathrm{path}(s \to i) \cap \mathrm{path}(s \to k)] $$

Here,
Ri4 = R1 + R3 while Ri2 = R1

Assume now that each of the N nodes of the network is initially discharged to GND, and that a
step input is applied at node s at time t = 0. The Elmore delay at node i is then given by the
following expression:

$$ \tau_{Di} = \sum_{k=1}^{N} C_k R_{ik} $$

Therefore, the Elmore delay is equivalent to the first-order time constant of the network (or the
first moment of the impulse response). The designer should be aware that this time- constant
represents a simple approximation of the actual delay between source node and node i. Yet in
most cases this approximation has proven to be quite reasonable and acceptable. It offers the
designer a powerful mechanism for providing a quick estimate of the delay of a complex
network.
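The RC delay of a tree-structured network is given as

$$ \tau_{Di} = \sum_{k=1}^{N} C_k R_{ik} $$

i.e. using

$$ R_{ik} = \sum R_j, \qquad R_j \in [\mathrm{path}(s \to i) \cap \mathrm{path}(s \to k)] $$

we can compute the Elmore delay for node i:

$$ \tau_{Di} = R_1 C_1 + R_1 C_2 + (R_1 + R_3) C_3 + (R_1 + R_3) C_4 + (R_1 + R_3 + R_i) C_i $$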
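
The Elmore computation for a tree can be sketched as follows. The topology encoded here is an assumption chosen to agree with Ri4 = R1 + R3 and Ri2 = R1: node 1 hangs off the source, nodes 2 and 3 hang off node 1, and nodes 4 and i hang off node 3. The component values are arbitrary illustrative numbers.

```python
def path_to_source(node, parent):
    """Nodes visited when walking from `node` back to the source."""
    path = []
    while node is not None:
        path.append(node)
        node = parent[node]
    return path


def shared_resistance(i, k, parent, res):
    """R_ik: resistance common to the paths source->i and source->k."""
    on_path_i = set(path_to_source(i, parent))
    return sum(res[n] for n in path_to_source(k, parent) if n in on_path_i)


def elmore_delay(i, parent, res, cap):
    """tau_Di = sum over all nodes k of C_k * R_ik."""
    return sum(cap[k] * shared_resistance(i, k, parent, res) for k in cap)


# parent[n] is the node upstream of n (None means the source s);
# res[n] is the resistor between n and its parent.
parent = {"1": None, "2": "1", "3": "1", "4": "3", "i": "3"}
res = {"1": 1, "2": 2, "3": 3, "4": 4, "i": 5}   # arbitrary units
cap = {"1": 1, "2": 1, "3": 1, "4": 1, "i": 1}   # arbitrary units

print("R_i4 =", shared_resistance("i", "4", parent, res))
print("tau_Di =", elmore_delay("i", parent, res, cap))
```

Storing each resistor under the name of its downstream node keeps the tree description to two small dictionaries; a larger network would likely precompute the paths instead of rebuilding them for every pair.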

RC Chain/ The Elmore RC Chain Delay:


As a special case of the RC tree network, let us consider the simple, non-branched RC chain (or
ladder) shown in the figure. This network is worth analyzing because it is a structure that is
often encountered in digital circuits, and also because it represents an approximative model of a
resistive-capacitive wire. The Elmore delay of this chain network can be derived with the aid of
the Elmore delay expression for the RC tree, as

$$ \tau_{DN} = \sum_{i=1}^{N} C_i \sum_{j=1}^{i} R_j = \sum_{i=1}^{N} C_i R_{ii} $$

RC CHAIN MODEL

The component of node 1 consists of C1R1, with R1 the total resistance between the node and the
source, while the contribution of node 2 equals C2(R1 + R2). The equivalent time
constant at node 2 equals C1R1 + C2(R1 + R2). The time constant τi of node i can be derived in
a similar way.
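
A minimal sketch of the chain calculation follows: each capacitor is multiplied by the total resistance between it and the source. The example also splits a wire of total resistance R and capacitance C into N equal segments (an assumption made for illustration), for which the sum works out to RC(N + 1)/(2N) and approaches RC/2 as N grows.

```python
def chain_elmore_delay(resistances, capacitances):
    """Elmore delay at the last node of a non-branched RC chain."""
    delay = 0.0
    r_path = 0.0
    for r, c in zip(resistances, capacitances):
        r_path += r          # resistance from the source up to this node
        delay += c * r_path  # this node's contribution
    return delay


# Two-node chain: C1*R1 + C2*(R1 + R2) with arbitrary values.
print(chain_elmore_delay([1, 2], [3, 4]))

# Uniform wire with total R = C = 1, split into N equal segments.
for n in (1, 10, 100):
    print(n, chain_elmore_delay([1.0 / n] * n, [1.0 / n] * n))
```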

Thus, the Elmore delay formula has proven to be extremely useful. Besides making it possible
to analyze wires, the formula can also be used to approximate the propagation delay of complex
transistor networks. The evaluation of the propagation delay is then reduced to the analysis of
the resulting RC network. More precise minimum and maximum bounds on the voltage
waveforms in an RC tree have further been established.

Distributed RC line Model/ Distributed rc line Model:


A distributed rc line model, shown below, is a more appropriate model. Here r and c stand for
the resistance and capacitance per unit length.

DISTRIBUTED RC LINE MODEL

SCHEMATIC SYMBOL FOR DISTRIBUTED RC LINE

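The voltage at node i of this network can be determined by solving the following set of partial
differential equations:

$$ c \, \Delta L \, \frac{\partial V_i}{\partial t} = \frac{(V_{i+1} - V_i) + (V_{i-1} - V_i)}{r \, \Delta L} $$

The correct behavior of the distributed rc line is then obtained by reducing ΔL
asymptotically to 0. For ΔL → 0, the above equation becomes the well-known diffusion
equation:

$$ r c \, \frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2} $$
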
Where V is the voltage at a particular point in the wire, and x is the distance between this point
and the signal source. No closed form solution exists for this equation, but approximative
expressions such as the formula written below can be derived:

The graph below shows the response of a wire to a step input, plotting the waveforms at
different points in the wire as a function of time. It is observable how the step waveform
“diffuses” from the start to the end of the wire, and the waveform rapidly degrades, resulting in
a considerable delay for long wires. Driving these rc lines and minimizing the delay and signal
degradation is one of the trickiest problems in modern digital integrated circuit design.
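
The diffusion behaviour can be explored numerically without a closed-form solution. The sketch below is only an illustration under stated assumptions: the wire is approximated by a ladder of N equal RC segments with total R and C normalised to 1, the source is an ideal unit step, and the node voltages are advanced with a simple explicit Euler step. The function name and parameters are hypothetical.

```python
def far_end_delay(n_segments=20, dt=1e-4, threshold=0.5):
    """Time (in units of total RC) for the far end to cross `threshold`."""
    r_seg = 1.0 / n_segments  # total R normalised to 1
    c_seg = 1.0 / n_segments  # total C normalised to 1
    v = [0.0] * n_segments
    t = 0.0
    while v[-1] < threshold:
        new_v = v[:]
        for i in range(n_segments):
            left = 1.0 if i == 0 else v[i - 1]  # unit step at the source
            i_in = (left - v[i]) / r_seg
            # The last node is open-ended, so no current leaves it.
            i_out = (v[i] - v[i + 1]) / r_seg if i < n_segments - 1 else 0.0
            new_v[i] = v[i] + dt * (i_in - i_out) / c_seg
        v = new_v
        t += dt
    return t


print("one lumped RC section:", round(far_end_delay(1), 3), "RC")
print("20-segment ladder:    ", round(far_end_delay(20), 3), "RC")
```

With a single section the result matches the lumped value of ln(2)·RC, while the finely segmented ladder reaches 50% noticeably earlier, which is why the lumped RC model is called pessimistic for long wires. The time step must stay well below 1/(2N²) in these units for the explicit update to remain stable.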

SIMULATED STEP RESPONSE OF RESISTIVE-CAPACITIVE WIRE AS A FUNCTION OF TIME AND PLACE

rc delays should only be considered when the rise (fall) time at the line input is smaller
than RC, the rise (fall) time of the line, that is, when

$$ t_{rise} < RC $$

with R and C the total resistance and capacitance of the wire. When this condition is not met,
the change in signal is slower than the propagation delay of the wire, and a lumped capacitive
model suffices.

SIMULATION π AND T MODELS FOR DISTRIBUTED RC LINE

Step response of lumped and distributed RC networks- Points of Interest:

The Transmission Line Model:


Similar to the resistance and capacitance of an interconnect line, the inductance is distributed
over the wire. A distributed rlc model of a wire, known as the transmission line model, becomes
the most accurate approximation of the actual behaviour.

The transmission line has the prime property that a signal propagates over the interconnection
medium as a wave. This is in contrast to the distributed rc model, where the signal diffuses from
the source to the destination governed by the diffusion equation, i.e.

$$ r c \, \frac{\partial V}{\partial t} = \frac{\partial^2 V}{\partial x^2} $$

In the wave mode, a signal propagates by alternately transferring energy from the electric to
the magnetic fields, or equivalently from the capacitive to the inductive modes.

A LOSSY TRANSMISSION LINE

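Consider the point x along the transmission line of the figure shown above at time t. The
following set of equations holds:

$$ \frac{\partial v}{\partial x} = -r i - l \frac{\partial i}{\partial t}, \qquad \frac{\partial i}{\partial x} = -g v - c \frac{\partial v}{\partial t} $$

Assuming that the leakage conductance g equals 0, which is true for most insulating materials,
and eliminating the current i yields the wave propagation equation,

$$ \frac{\partial^2 v}{\partial x^2} = r c \frac{\partial v}{\partial t} + l c \frac{\partial^2 v}{\partial t^2} $$

where r, c, and l are the resistance, capacitance, and inductance per unit length respectively.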

COPING WITH INTERCONNECT:

Until now we have concentrated on the growing impact of interconnect parasitics on all design
metrics of digital integrated circuits. As mentioned, interconnect introduces three types of
parasitic effects, i.e. capacitive, resistive, and inductive, all of which influence the signal
integrity and degrade the performance of the circuit. While so far we have concentrated on the
modeling aspects of the wire, we now analyze how interconnect affects the circuit operation,
and we present a collection of design techniques to cope with these effects for each type of
parasitic. This is referred to as coping with interconnect.

CAPACITIVE PARASITICS:
➢ Capacitance reliability and Cross talk:
An unwanted coupling from a neighbouring signal wire to a network node introduces an
interference that is generally called cross talk. The resulting disturbance acts as a noise source
and can lead to hard-to-trace intermittent errors, since the injected noise depends upon the
transient value of the other signals routed in the neighbourhood. In integrated circuits, this inter
signal coupling can be both capacitive and inductive.

Capacitive cross talk is the dominant effect at current switching speeds, although inductive
coupling forms a major concern in the design of the input-output circuitry of mixed-signal
circuits. The potential impact of capacitive crosstalk is influenced by the impedance of the line
under examination. If the line is floating, the disturbance caused by the coupling persists and
may be worsened by subsequent switching on adjacent wires. If the wire is driven, on the other
hand, the signal returns to its original level.

o Floating Lines:

We first consider capacitive coupling to a floating line.

CAPACITIVE COUPLING TO A FLOATING LINE

Consider the circuit shown above, where line X is coupled to wire Y by a parasitic
capacitance CXY. Line Y sees a total capacitance to ground equal to CY. Assume that the
voltage at node X experiences a step change equal to ΔVX. This step appears on node Y
attenuated by the capacitive voltage divider:

$$ \Delta V_Y = \frac{C_{XY}}{C_Y + C_{XY}} \, \Delta V_X $$
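
The capacitive divider, ΔVY = CXY / (CY + CXY) × ΔVX, is easy to evaluate for a candidate layout. The capacitance and swing values below are hypothetical and only meant to show the order of magnitude of the injected noise.

```python
def floating_line_noise(c_xy, c_y, delta_vx):
    """Voltage step coupled onto a floating line Y by a step on line X."""
    return c_xy / (c_y + c_xy) * delta_vx


# Hypothetical values: 0.5 fF coupling, 6 fF to ground, 2.5 V aggressor swing.
noise = floating_line_noise(0.5e-15, 6e-15, 2.5)
print(f"disturbance on Y = {noise:.2f} V")
```

Even a coupling capacitance that is a small fraction of CY produces a disturbance of a few hundred millivolts here, which matters for low-swing precharged nodes.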

Circuits that are particularly susceptible to capacitive cross talk are networks with low-swing
precharged nodes located adjacent to full-swing wires (with ΔVX = VDD).

Examples of these are dynamic memories, low-swing on-chip busses and some dynamic logic
families. To address the cross talk issue, level-restoring devices or keepers are a must in
dynamic logic.

o Driven Lines:

CAPACITIVE COUPLING TO A DRIVEN LINE AND ITS VOLTAGE RESPONSE

As seen from the figure, if the line Y is driven with a resistance RY, a step on line X results in
a transient on line Y. The transient decays with a time constant τXY = RY (CXY + CY ). The
actual impact on the victim line is a strong function of the rise- fall time of the interfering
signal.

If the rise time is comparable to or larger than the time constant, the peak value of the
disturbance is diminished. This can be observed in the response figure.

Obviously, keeping the driving impedance of a wire, and hence τXY, low goes a long way towards
reducing the impact of capacitive cross talk. The keeper transistor added to a dynamic gate or
precharged wire is an excellent example of how impedance reduction helps to control noise.

Therefore, the impact of cross talk on the signal integrity of driven nodes is rather limited. The
resulting glitches may cause malfunctioning of connecting sequential elements, and should
therefore be carefully monitored. The most important effect is an increase in delay.

Design Techniques to Deal with Capacitive Cross talk:

1. Avoid floating nodes where possible. Nodes sensitive to cross talk problems, such as
precharged busses, should be equipped with keeper devices to reduce the impedance.
2. Sensitive nodes should be well separated from full-swing signals.
3. Make the rise and fall times as large as possible, subject to timing constraints.
4. Use differential signalling in sensitive low-swing wiring networks. This turns the cross talk
signal into a common-mode noise source that does not impact the operation of the circuit.
5. To keep the cross talk to a minimum, do not allow the capacitance between the two signal
wires to grow too large.
6. If necessary, provide a shielding wire (GND or VDD) between the two signals, as shown below.
This effectively turns the interwire capacitance into a capacitance to ground and eliminates
interference. An adverse effect of shielding is the increased capacitive load.

CROSS SECTION OF ROUTING LAYERS ILLUSTRATING THE USE OF SHIELDING TO REDUCE CAPACITIVE CROSS TALK

7. The interwire capacitance between signals on different layers can be further reduced by
addition of extra routing layers.

➢ Impact of Cross talk on Propagation Delay (With respect to CMOS):

IMPACT OF CROSS TALK ON PROPAGATION DELAY

The circuit schematic illustrates how capacitive cross talk may result in a data-dependent
variation of the propagation delay. Assume that the inputs to the three parallel wires X, Y, and Z
experience simultaneous transitions. Wire Y (called the victim wire) switches in a direction that
is opposite to the transitions of its neighbouring signals X and Z. The coupling capacitances
experience a voltage swing that is double the signal swing, and hence represent an effective
capacitive load that is twice as large as Cc- the by now well known Miller effect.
Since the coupling capacitance represents a large fraction of the overall capacitance in the deep-
submicron dense wire structures, this increase in capacitance is substantial, and has a major
impact on the propagation delay of the circuit. Observe that this is a worst-case scenario. If all
inputs experience a simultaneous transition in the same direction, the voltage over the coupling
capacitances remains constant, resulting in a zero contribution to the effective load capacitance.

The total load capacitance CL of gate Y hence depends upon the data activities on the
neighbouring signals and varies between the following bounds:

$$ C_{GND} \le C_L \le C_{GND} + 4 C_c $$

with CGND the capacitance of node Y to ground, including the diffusion and fan out capacitances.
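
The data dependence of the load can be sketched by weighting each coupling capacitance according to what the neighbour does. The weights below follow the reasoning in the text (no contribution when a neighbour switches in the same direction, a doubled contribution when it switches in the opposite direction); the weight of 1 for a quiet neighbour and the capacitance values are assumptions added for illustration.

```python
# Effective multiplier on the coupling capacitance Cc for each neighbour.
MILLER_FACTOR = {"same": 0, "quiet": 1, "opposite": 2}


def effective_load(c_gnd, c_c, neighbours):
    """Load seen by the victim driver, given the activity of each neighbour."""
    return c_gnd + sum(MILLER_FACTOR[activity] * c_c for activity in neighbours)


C_GND, C_C = 20.0, 10.0  # hypothetical values in fF

print("best case :", effective_load(C_GND, C_C, ["same", "same"]), "fF")
print("quiet     :", effective_load(C_GND, C_C, ["quiet", "quiet"]), "fF")
print("worst case:", effective_load(C_GND, C_C, ["opposite", "opposite"]), "fF")
```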

Design Techniques for Circuit Fabrics with Predictable delay:

With cross talk making wire delay more and more unpredictable, a designer can choose
between a number of different methodology options to address the issue, some of which are:

1. Evaluate and improve: After detailed extraction and simulation, the bottlenecks in
delay are identified, and the circuit is appropriately modified.
2. Constructive layout generation: Wire routing programs take into account the effects of
the adjacent wires, ensuring that the performance requirements are met.
3. Predictable structures: By using predefined, known, or conservative wiring structures,
the designer is assured that the circuit will meet the specifications and that cross talk will
not be a show stopper.

➢ Capacitive Load (With respect to CMOS):


The increasing values of the interconnect capacitances, especially those of the global wires,
emphasize the need for effective driver circuits that can (dis)charge capacitances with sufficient
speed. This need is further highlighted by the fact that in complex designs a single gate often
has to drive a large fan-out and hence has a large capacitive load.

Typical examples of large on-chip loads are busses, clock networks, and control wires. The
latter include, for instance, reset and set signals. These signals control the operation of a large
number of gates, so fan-out is normally high. Other examples of large fan-outs are encountered
in memories where a large number of storage cells is connected to a small set of control and
data wires.

The capacitance of these nodes is easily in the multi-picofarad range. The worst case occurs
when signals go off-chip. In this case, the load consists of the package wiring, the printed circuit
board wiring, and the input capacitance of the connected ICs or components.

Typical off-chip loads range from 20 to 50 pF, which is several thousand times larger than a
standard on-chip load. Driving those nodes with sufficient speed becomes one of the most
crucial design problems.

The main secrets to the efficient driving of large capacitive loads are:
1. Adequate transistor sizing is instrumental when dealing with large loads.
2. Partitioning drivers into chains of gradually-increasing buffers helps to deal with large
fan-out factors.
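
One common way to build such a chain is to make every stage larger than the previous one by the same factor f, so that f to the power N equals the overall ratio of load to input capacitance. The sketch below assumes a target taper of about 4 per stage, which is a frequently quoted rule of thumb rather than a value stated in these notes; the capacitances are hypothetical.

```python
import math


def tapered_chain(c_in, c_load, target_taper=4.0):
    """Return (number of stages, per-stage taper, input capacitance of each stage)."""
    overall = c_load / c_in
    n_stages = max(1, round(math.log(overall) / math.log(target_taper)))
    taper = overall ** (1.0 / n_stages)
    stage_inputs = [c_in * taper ** k for k in range(n_stages)]
    return n_stages, taper, stage_inputs


# Hypothetical case: a 10 fF minimum-size input driving a 20 pF off-chip load.
n, f, caps = tapered_chain(10e-15, 20e-12)
print(f"{n} stages, taper {f:.2f}")
print([f"{c * 1e15:.0f} fF" for c in caps])
```

Each stage then sees the same electrical effort, which is what spreads the delay evenly along the chain.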

RESISTIVE PARASITICS:

➢ Resistance and Reliability- Ohmic Voltage Drop:


Current flowing through a resistive wire results in an ohmic voltage drop that degrades the
signal levels. This is especially important in the power distribution network, where current
levels can easily reach amperes, as shown below.

EVOLUTION OF POWER SUPPLY CURRENT AND SUPPLY VOLTAGE

OHMIC VOLTAGE DROP ON THE SUPPLY REDUCES NOISE MARGIN

Consider a 2 cm long VDD or GND wire with a current of 1 mA per µm width. This current is
about the maximum that can be sustained by an aluminum wire due to electromigration.
Assuming a sheet resistance of 0.05 Ω/sq, the resistance of this wire (per µm width) equals
1 kΩ. A current of 1 mA/µm would then result in a voltage drop of 1 V. The altered value of the
voltage supply reduces noise margins and changes the logic levels as a function of the distance
from the supply terminals. This is demonstrated by the circuit shown above, where an inverter
placed far from the power and ground pins connects to a device closer to the supply.
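
The numbers in this example can be reproduced with the sheet-resistance rule. The function below is a small sketch using the values quoted in the text (2 cm length, 0.05 Ω/sq, 1 mA per µm of width); the function and parameter names are illustrative.

```python
def supply_wire_drop(sheet_res, length_um, width_um, current_ma_per_um):
    """Return (wire resistance in ohm, IR drop in volt) for a supply rail."""
    resistance = sheet_res * length_um / width_um
    current_a = current_ma_per_um * width_um * 1e-3
    return resistance, resistance * current_a


# 2 cm = 20000 um long rail, evaluated per um of width.
r, v = supply_wire_drop(0.05, 20000.0, 1.0, 1.0)
print(f"R = {r:.0f} ohm, IR drop = {v:.2f} V")

# Widening the rail at constant current density leaves the drop unchanged.
print(supply_wire_drop(0.05, 20000.0, 10.0, 1.0))
```

The second call shows why widening alone does not help when the current drawn scales with the width: the drop only falls if the current per µm falls or the distance to the supply pins is reduced.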

The difference in logic levels caused by the IR voltage drop over the supply rails might partially
turn on transistor M1. This can result in an accidental discharging of the precharged,
dynamic node X, or cause static power consumption if the connecting gate is static. In short, the
current pulses from the on-chip logic, memories and I/O pins cause voltage drops over the
power-distribution network and are the major source of on-chip power supply noise. Beyond
causing a reliability risk, IR drops on the supply network also impact the performance of the
system. A small drop in the supply voltage may cause a significant increase in delay.

The most obvious solution to this problem is to reduce the maximum distance between the supply
pins and the circuit supply connections, which is most easily accomplished through a structured
layout of the power distribution network. A number of on-chip power distribution network
topologies with peripheral bonding can be used for this purpose.

➢ Electromigration:
The current density (current per unit area) in a metal wire is limited due to an effect called
electromigration. A direct current in a metal wire running over a substantial time period,
causes a transport of the metal ions. Eventually, this causes the wire to break or to short
circuit to another wire. This type of failure will only occur after the device has been in use for
some time.

Line Open Failure Open Failure in Contact Plug
ELECTROMIGRATION RELATED FAILURE MODES

The rate of the electromigration depends upon the temperature, the crystal structure, and the
average current density. The latter is the only factor that can be effectively controlled by the
circuit designer. Keeping the current below 0.5 to 1 mA/ µm normally prevents migration. This
parameter can be used to determine the minimal wire width of the power and ground network.
Signal wires normally carry an ac- current and are less susceptible to migration. The
bidirectional flow of the electrons tends to anneal any damage done to the crystal structure.
Most companies impose a number of strict wire-sizing guidelines on their designers, based on
measurements and past experience.
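
The current limit quoted above translates directly into a minimum wire width for a supply rail. The sketch below assumes the limit is expressed in mA per µm of wire width, as in the text; the 100 mA average current is a hypothetical figure.

```python
def min_wire_width_um(avg_current_ma, limit_ma_per_um=0.5):
    """Smallest width that keeps the average current below the migration limit."""
    return avg_current_ma / limit_ma_per_um


# Hypothetical supply branch carrying 100 mA on average.
print("conservative (0.5 mA/um):", min_wire_width_um(100.0), "um")
print("aggressive   (1.0 mA/um):", min_wire_width_um(100.0, 1.0), "um")
```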

Electromigration effects are proportional to the average current flow through the wire, while IR
voltage drops are a function of the peak current.

From a design point of view, a number of precautions can be taken at the technology level
to reduce the migration risk, i.e.
1. To add alloying elements (such as Cu or Tu) to the aluminum to prevent the movement of
the Al ions.
2. To control the granularity of the ions.
3. The introduction of new interconnect materials is a big help as well. For instance, the use of
Copper interconnect increases the expected lifetime of a wire by a factor of 100 over Al.

➢ Resistance and Performance- RC Delay:


The delay of a wire grows quadratically with its length. Doubling the length of a wire
increases its delay by a factor of four. The signal delay of long wires therefore tends to be
dominated by the RC effect. This is becoming an ever larger problem in modern technologies,
which feature an increasing average length of the global wires, at the same time that the average
delay of the individual gates is going down. This leads to the rather bizarre situation that it
may take multiple clock cycles to get a signal from one side of a chip to its opposite end.

Providing accurate synchronization and correct operation becomes a major challenge under
these circumstances. The different design techniques to cope with the delay imposed
by the resistance of the wire are as follows.

o Better Interconnect Materials:


Use better interconnect materials when they are available and appropriate. The introduction of
silicides and Copper has helped to reduce the resistance of polysilicon (and diffused) and
metal wires, respectively, while the adoption of dielectric materials with a lower permittivity
lowers the capacitance. Both Copper and low-permittivity dielectrics have become common in
advanced CMOS technologies.

Here the designer should be aware that these new materials only provide a temporary respite of
one or two generations, and do not solve the fundamental problem of the delay of long wires.
Innovative design techniques are often the only way of coping with the latter.

Sometimes, it is hard to avoid the use of long polysilicon wires. A good example of such a
circumstance is the address lines in memories, which must connect to a large number of
transistor gates. Keeping the wires in polysilicon increases the memory density substantially by
avoiding the overhead of the extra metal contacts. The polysilicon-only option unfortunately
leads to an excessive propagation delay. One possible solution is to drive the word line from
both ends, as shown in the figure. This effectively reduces the worst-case delay by a factor of
four, because the longest path is halved and the delay grows quadratically with length.
Another option is to provide an extra metal wire, called a bypass, which runs parallel to the
polysilicon one and connects to it every k cells, as shown in the figure. The delay is now
dominated by the much shorter polysilicon segments between the contacts. Providing contacts
only every k cells helps to preserve the implementation density.


APPROACHES TO REDUCE THE WORD LINE DELAY

o Better Interconnect Strategies:
The length of the wire is a prime factor in both the delay and the energy consumption of an
interconnect wire, so any approach that helps to reduce the wire length is bound to have an
essential impact.
There are two wiring strategies: Manhattan-style routing and diagonal-style routing.

➢ In Manhattan-style routing, interconnections are first routed along one of the preferred
directions, followed by a connection in the other direction, as shown.

➢ In diagonal-style routing, wires may also run along 45° lines, which reduces the wire length
by up to 29% in the best case compared to Manhattan routing. The main issues of diagonal
routing are its complexity, impact on tools and masking concerns.

MANHATTAN VS DIAGONAL ROUTING
LAYOUT EXAMPLE OF 45° LINES (FOR UNDERSTANDING- NO NEED TO DRAW)

Earlier, Manhattan routing was preferred because of the issues of diagonal routing, in spite of
its advantages. Now diagonal routing is preferred for its shorter wire length; its issues of
complexity, impact on tools and masking concerns are handled by modern CAD (Computer Aided
Design) tools such as Cadence. The impact on wiring is quite tangible: a reduction of 20% in
wire length, resulting in higher performance, lower power dissipation and smaller chip area.
o Introducing Repeaters/ Buffer Insertion for very long wires:
The most popular design approach to reducing the propagation delay of long wires is to
introduce intermediate buffers, also called repeaters, in the interconnect line as shown below.

REDUCING RC INTERCONNECT DELAY BY USING REPEATERS

Making an interconnect line m times shorter reduces its propagation delay quadratically, and is
sufficient to offset the extra delay of the repeaters when the wire is sufficiently long. Assuming
that the repeaters have a fixed delay tpbuf , we can derive the delay of the partitioned wire.

The optimal number of buffers that minimizes the overall delay can be found by setting
$\partial t_p / \partial m = 0$,

yielding a minimum delay of,

and is obtained when the delay of the individual wire segments is made equal to that of a
repeater.
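
The repeater trade-off can be sketched numerically. The model below is an assumption, not a formula quoted from this section: a wire of length L split into m segments has delay t_p(m) = 0.38·r·c·L²/m + m·t_pbuf. Setting the derivative with respect to m to zero then gives m_opt = L·sqrt(0.38·r·c / t_pbuf) and a minimum delay of 2·sqrt(t_wire · t_pbuf), where t_wire is the delay of the unbuffered wire. All numeric values are hypothetical.

```python
import math

K = 0.38  # distributed-RC delay constant (assumed model)

def buffered_delay(r, c, length, m, t_buf):
    """Delay of a wire cut into m segments, each followed by a repeater of delay t_buf."""
    return m * (K * r * c * (length / m) ** 2) + m * t_buf

def optimal_segments(r, c, length, t_buf):
    """Continuous optimum of m, obtained from d t_p / d m = 0."""
    return length * math.sqrt(K * r * c / t_buf)

def minimum_delay(r, c, length, t_buf):
    """2 * sqrt(t_wire * t_buf), with t_wire the unbuffered wire delay."""
    t_wire = K * r * c * length ** 2
    return 2.0 * math.sqrt(t_wire * t_buf)

# Hypothetical numbers: 10 cm wire, 0.1 ns repeater delay.
r, c, length, t_buf = 75e3, 110e-12, 0.1, 0.1e-9
m_opt = optimal_segments(r, c, length, t_buf)
segment_delay = K * r * c * (length / m_opt) ** 2   # comes out equal to t_buf
```

In practice m must be an integer, so the neighbouring integers of m_opt are compared and the faster one is chosen.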

o Optimizing the Interconnect Architecture:


Even with buffer insertion, the delay of a resistive wire cannot be reduced below the minimum
dictated by Equation,

Long wires hence often exhibit a delay that is longer than the clock period of the design. For
instance, the 10 cm long Al wire comes with a minimum delay of 4.7 nsec, even after optimal
buffer insertion and sizing, while the 0.25 µm CMOS process featured in this text can sustain
clock speeds in excess of 1 GHz (that is, clock periods below 1 nsec). The wire delay by
itself hence becomes the limiting factor on the performance achievable by the integrated circuit.
The only way to address this bottleneck is to tackle it at the system architecture level.

Wire pipelining is a popular performance-improvement technique in this category. Pipelining
improves the throughput of logic modules with long critical paths, and a similar approach can
be used to increase the throughput of a wire, as illustrated in the figure below.

WIRE PIPELINING IMPROVES THE THROUGHPUT OF A WIRE

The wire is partitioned into k segments by inserting registers or latches. While this does not
reduce the delay through the wire (it takes k clock cycles for a signal to proceed through
it), it helps to increase its throughput, as the wire is handling k signals
simultaneously at any point in time. The delay of the individual wire segments can further be
optimized by repeater insertion, and should be below a single clock period.
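
A minimal behavioural sketch of wire pipelining, assuming one register at the end of each of the k segments and one value launched per clock cycle (the function name and the use of None for "no data yet" are illustrative choices):

```python
def pipelined_wire(samples, k):
    """Return the value visible at the far end of the wire after each clock edge."""
    regs = [None] * k                      # one register per wire segment
    seen_at_output = []
    # k extra cycles with no new input flush the values still in flight
    for value in list(samples) + [None] * k:
        regs = [value] + regs[:-1]         # clock edge: every value advances one segment
        seen_at_output.append(regs[-1])
    return seen_at_output

trace = pipelined_wire([1, 2, 3], k=3)
```

The first value needs k clock edges to arrive (latency is unchanged or worse), but after that one value arrives per cycle, because k values are in flight on the wire at once.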

This is only one example of the many techniques that the chip architect has at her disposal to
deal with the wire delay problem. The most important lesson is that the wires have to
be considered early on in the design process, and can no longer be treated as an afterthought as
was most often the case in the past.
INDUCTIVE PARASITICS:
Interconnect wires also exhibit an inductive parasitic. An important source of parasitic
inductance is introduced by the bonding wires and chip packages. Even for intermediate-speed
CMOS designs, the current through the input-output connections can experience fast transitions
that cause voltage drops as well as ringing and overshooting, phenomena not found in RC
circuits. At higher switching speeds, wave propagation and transmission line effects can come
into the picture.
➢ Inductance and Reliability- L di/dt Voltage Drop:
During each switching action, a transient current is sourced from (or sunk into) the supply rails
to charge (or discharge) the circuit capacitances as shown. Both VDD and VSS connections are
routed to the external supplies through bonding wires and package pins and possess a
non-negligible series inductance. Hence, a change in the transient current creates a voltage
difference between the external and internal (V’DD, GND’) supply voltages. This situation is especially
severe at the output pads, where the driving of the large external capacitances generates large
current surges. The deviations on the internal supply voltages affect the logic levels and result in
reduced noise margins.

INDUCTIVE COUPLING BETWEEN EXTERNAL AND INTERNAL SUPPLY VOLTAGES

In an actual circuit, a single supply pin serves a large number of gates or output drivers. A
simultaneous switching of those drivers causes even worse current transients and voltage drops.
As a result, the internal supply voltages deviate in a substantial way from the external ones. For
instance, the simultaneous switching of the 16 output drivers of an output bus would cause a
voltage drop of at least 1.1 V if the supply connections of the buffers were connected to the
same pin on the package. Improvements in packaging technologies are leading to ever-
increasing numbers of pins per package. Packages with up to 1000 pins are currently available.
Simultaneous switching of a substantial number of those pins results in huge spikes on the
supply rails that are bound to disturb the operation of the internal circuits as well as other
external components connected to the same supplies.
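
The size of the supply disturbance follows from v = L·di/dt. The sketch below assumes a linear current ramp and that all drivers sharing a pin switch at the same time; the inductance and current values are hypothetical and are not meant to reproduce the 1.1 V figure quoted above.

```python
def supply_bounce(inductance_h, delta_i_a, delta_t_s, n_drivers=1):
    """Voltage across the pin inductance, v = L * di/dt, for n drivers switching together."""
    return inductance_h * n_drivers * delta_i_a / delta_t_s

# Hypothetical values: 2.5 nH per pin, 5 mA change per driver within 1 ns.
one_driver = supply_bounce(2.5e-9, 5e-3, 1e-9)
sixteen_drivers = supply_bounce(2.5e-9, 5e-3, 1e-9, n_drivers=16)
```

The drop scales linearly with the number of drivers per pin, which is the motivation for spreading the I/O drivers over several supply pins.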

Design techniques to address the L di/dt problem:

1. Separate pins for I/O pads and chip core. Since the I/O drivers require the largest switching
currents, they also cause the largest current changes. Therefore, it is wise to isolate the core of
the chip where most of the logic action occurs, from the drivers by providing different power
and ground pins.
2. Multiple power and ground pins. In order to reduce the di/dt per supply pin, we can restrict
the number of I/O drivers connected to a single supply pin.
3. Careful selection of positions of the power and ground pins on the package. The inductance of
pins located at the corners of the package is substantially higher as shown below.

THE INDUCTANCE OF A BONDING WIRE/ PIN COMBINATION DEPENDS UPON THE PIN
POSITIONS

4. Schedule current-consuming transitions so that they do not occur simultaneously.
5. Increase the rise and fall times of the off-chip signals to the maximum extent allowable.
Decoupling capacitance placed on the chip should be distributed all over the chip, especially
under the data busses.
6. Use advanced packaging technologies such as surface-mount or hybrids that come with a
substantially reduced capacitance and inductance per pin.
7. Add decoupling capacitances on the board. These capacitances, which should be added for
every supply pin, act as local supplies and stabilize the supply voltage seen by the chip. They
separate the bonding-wire inductance from the inductance of the board interconnect as shown
below. The bypass capacitor, combined with the inductance, actually acts as a low-pass network
that filters away the high-frequency components of the transient voltage spikes on the supply
lines.

DECOUPLING CAPACITORS ISOLATE THE BOARD INDUCTANCE FROM THE BONDING WIRE AND
PIN INDUCTANCE

➢ Inductance and Performance- Transmission Line Effects:
When an interconnection wire becomes sufficiently long or when the circuits become
sufficiently fast, the inductance of the wire starts to dominate the delay behaviour, and
transmission line effects must be considered. This is more precisely the case when the rise and
fall times of the signal become comparable to the time of flight of the signal waveform across
the line as determined by the speed of light. As advancing technology increases line lengths and
switching speeds, this situation is gradually becoming common in the fastest CMOS circuits,
and transmission-line effects are bound to become a concern of the CMOS designer.
Some of the techniques to minimize the impact of the transmission line behaviour are:
o Termination:
To avoid the negative effects of transmission-line behaviour such as ringing or slow
propagation delays, the line should be terminated, either at the source (series termination), or at
the destination (parallel termination) with a resistance matched to its characteristic impedance
Z0.

MATCHED TERMINATION SCENARIOS FOR WIRES BEHAVING AS TRANSMISSION LINES: (a)
SERIES TERMINATION AT THE SOURCE, (b) PARALLEL TERMINATION AT THE DESTINATION

The two scenarios, series and parallel termination, are depicted in the figure. Series
termination requires that the impedance of the signal source is matched to the connecting wire.
This approach is appropriate for many CMOS designs, where the destination load is purely
capacitive. The impedance of the driver inverter can be matched to the line by careful transistor
sizing.
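
Why matching matters can be seen from the reflection coefficient of a resistive termination, ρ = (R − Z0) / (R + Z0). This is the standard transmission-line relation and is an addition here, not a formula given in this section; the 50 Ω line impedance is a hypothetical value.

```python
def reflection_coefficient(r_term, z0):
    """Fraction of an incident voltage wave reflected by a resistive termination."""
    return (r_term - z0) / (r_term + z0)

Z0 = 50.0                                        # hypothetical characteristic impedance
matched = reflection_coefficient(50.0, Z0)       # matched termination: no reflection
shorted = reflection_coefficient(0.0, Z0)        # short circuit: full inverted reflection
mismatched = reflection_coefficient(150.0, Z0)   # half of the wave is reflected
```

A matched termination gives ρ = 0, so no wave returns along the line and the ringing described above does not build up.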

o Shielding
If we want to control the behaviour of a wire behaving as a transmission line, we should
carefully plan and manage how the return current flows. A good example of a well-defined
transmission line is the coaxial cable, where the signal wire is surrounded by a cylindrical
ground plane. To accomplish similar effects on a board or on a chip, designers often surround
the signal wire with ground (supply) planes and shielding wires. Adding
shielding makes the behaviour and the delay of an interconnection a lot more predictable. Yet
even with these precautions, powerful extraction and simulation tools will be needed in the
future for the high-performance circuit designer.

UNIT – IV
INTERCONNECT, MEMORY ARCHITECTURE AND ARITHMETIC CIRCUITS
TWO MARK QUESTIONS AND ANSWERS
1. What is meant by data path circuits? (APR 2016)
• Data path circuits are meant for passing the data from one segment to other segment for
processing or storing.
• The data path is the core of processors, where all computations are performed.

2. What is ripple carry adder?


• When two n-bit numbers are added, the result is an n-bit sum and a carry-out Cn. Ci is the
carry-in bit from the previous column. An n-bit ripple carry adder needs n full adders, each
passing its carry-out bit Ci+1 to the next stage.

3. Draw the circuit for 4 bit ripple carry adder. (NOV 2018)

4. Write the equation for total delay in 4 bit ripple carry adder.
The total delay using the following equation,
t4b = td (cin → S3) +2td (cin → cout) + td (a0, b0 → c1)

5. Write the equation for worst case delay in 4 bit ripple carry adder.
If it is extended to n bits, then the worst case delay is
tn-bit = td(cin → Sn-1) + (n-2)td(cin → cout) +td(a0,b0 → c1)
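
A bit-level sketch of the ripple carry adder and of the worst-case delay expression above. Bit lists are least significant bit first; the function names and the unit gate delays in the example are illustrative assumptions.

```python
def full_adder(a, b, cin):
    """One full adder stage: returns (sum, carry_out)."""
    s = a ^ b ^ cin
    cout = (a & b) | (cin & (a ^ b))
    return s, cout

def ripple_carry_add(a_bits, b_bits, c0=0):
    """Add two equal-length bit lists (LSB first); the carry ripples stage by stage."""
    carry = c0
    sum_bits = []
    for a, b in zip(a_bits, b_bits):
        s, carry = full_adder(a, b, carry)
        sum_bits.append(s)
    return sum_bits, carry

def worst_case_delay(n, t_in_to_carry, t_carry, t_carry_to_sum):
    """t_n = td(a0,b0 -> c1) + (n-2) * td(cin -> cout) + td(cin -> S_{n-1})."""
    return t_in_to_carry + (n - 2) * t_carry + t_carry_to_sum

# 11 + 6 = 17 on 4 bits: sum bits 0001 (LSB first 1,0,0,0) with carry-out 1
sum_bits, carry_out = ripple_carry_add([1, 1, 0, 1], [0, 1, 1, 0])
```

With n = 4 the delay function reduces to the 4-bit expression given in the previous answer.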

6. What is meant by Carry Lookahead Adder (CLA)?


• A carry-lookahead adder (CLA) or fast adder is a type of adder used in digital logic.
• A carry-lookahead adder improves speed by reducing the amount of time required to
determine carry bits.
• The carry-lookahead adder calculates one or more carry bits before the sum, which
reduces the wait time to calculate the result of the larger value bits.

7. Write the general expression for carry signal in CLA.


Write the full adder output in terms of propagate and generate. (April 2018)
• We can write carry look-ahead expressions in terms of the generate $g_i$ and propagate $p_i$
signals. The general form of the carry signal thus becomes
$c_{i+1} = a_i \cdot b_i + c_i \cdot (a_i \oplus b_i) = g_i + c_i \cdot p_i$
If $a_i \cdot b_i = 1$, then $c_{i+1} = 1$.
8. Write the equation for generate term in CLA.
• In the case of binary addition, A + B generates if and only if both A and B are 1. If we
write G(A,B) to represent the binary predicate that is true if and only if A + B generates,
the generate term is $g_i = a_i \cdot b_i$.

9. Write the equation for propagates term in CLA.
• In the case of binary addition, A + B propagates if and only if at least one of A or B is 1.
If we write P(A,B) to represent the binary predicate that is true if and only if A + B
propagates, the propagate term in the XOR form used here is $p_i = a_i \oplus b_i$ (the OR
form $a_i + b_i$ gives the same carries).

10. What are the two factors that Carry lookahead adder depends on?
• Carry lookahead depends on two things:
o Calculating, for each digit position, whether that position is going to propagate a
carry if one comes in from the right.
o Combining these calculated values to be able to deduce quickly whether, for each
group of digits, that group is going to propagate a carry that comes in from the
right.
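
A small sketch of carry lookahead built from the generate and propagate terms defined above. Each carry is expanded directly in terms of g, p and c0 rather than waiting for the previous carry, which is the standard lookahead expansion c_{i+1} = g_i + p_i·g_{i-1} + ... + p_i···p_0·c_0. The function name and list layout (LSB first) are illustrative.

```python
def cla_add(a_bits, b_bits, c0=0):
    """Carry lookahead addition on bit lists (LSB first). Returns (sum_bits, carries)."""
    g = [a & b for a, b in zip(a_bits, b_bits)]   # generate terms
    p = [a ^ b for a, b in zip(a_bits, b_bits)]   # propagate terms (XOR form)
    carries = [c0]
    for i in range(len(g)):
        c_next = 0
        # terms g_j * p_{j+1} * ... * p_i for every j <= i
        for j in range(i + 1):
            term = g[j]
            for k in range(j + 1, i + 1):
                term &= p[k]
            c_next |= term
        # term c0 * p_0 * ... * p_i
        tail = c0
        for k in range(i + 1):
            tail &= p[k]
        carries.append(c_next | tail)
    sum_bits = [p[i] ^ carries[i] for i in range(len(p))]
    return sum_bits, carries

# 11 + 6 on 4 bits, LSB first
cla_sum, cla_carries = cla_add([1, 1, 0, 1], [0, 1, 1, 0])
```

In hardware every carry expression is a two-level AND-OR network evaluated in parallel; the nested loops here only enumerate the product terms.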

11. Write the generalized equation for CLA.

12. Name the limitations of MODL.


• MODL has the following limitations:
i. Clocking is mandatory.
ii. The output is subject to charge leakage and charge sharing.
iii. Series connected nFET chains can give long discharge times.

13. What is called Manchester Carry Chain Adder?


• The Manchester carry chain is a variation of the carry-lookahead adder that uses shared
logic to lower the transistor count.
• As seen in CLA implementation section, the logic for generating each carry contains all
of the logic used to generate the previous carries.
• A Manchester carry chain generates the intermediate carries by tapping off nodes in the
gate that calculates the most significant carry value.

14. Write the basic equation for Manchester Carry Chain Adder?
Define kill term, propagate and generate term in a carry look ahead adder. (April
2019)
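• In this adder, the basic equation is $c_{i+1} = g_i + c_i \cdot p_i$,
where $p_i = a_i \oplus b_i$ and $g_i = a_i \cdot b_i$.
• Carry kill bit: $k_i = \overline{a_i + b_i} = \overline{a_i} \cdot \overline{b_i}$
• If $k_i = 1$, then $p_i = 0$ and $g_i = 0$, so no carry leaves the stage. Hence, $k_i$ is
known as the carry kill bit.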
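
The three signals can be checked exhaustively. A minimal sketch, assuming the XOR form of propagate and the NOR form of kill given above (function names are illustrative):

```python
def carry_status(a, b):
    """Return (generate, propagate, kill) for one bit position."""
    g = a & b
    p = a ^ b
    k = (1 - a) & (1 - b)      # NOR of a and b
    return g, p, k

def next_carry(a, b, c):
    """c_{i+1} = g_i + c_i * p_i"""
    g, p, _ = carry_status(a, b)
    return g | (c & p)

all_cases = [carry_status(a, b) for a in (0, 1) for b in (0, 1)]
```

For every input pair exactly one of g, p and k is 1, which is what lets the Manchester chain use one switch per signal on each carry node.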

15. Draw the switch level circuit for Manchester carry chain adder.

The switch level circuit is given as

16. What are high (wide) adders?


• Adders with more than 4 bits are known as wide or high speed adders. A brute-force
approach can be used to design an 8-bit adder.

17. What are the types of high speed adders?


Types of high speed adders are
1. Carry Skip adder
2. Carry Select adder
3. Carry Save adder.

18. What is Carry skip adder?


• Carry skip adder is one of the high speed adders.
• When the block propagate BP = P0·P1·P2·P3 = 1, the incoming carry is forwarded immediately
to the next block.
• Hence the name carry bypass adder or carry skip adder.
• Idea: if (P0 and P1 and P2 and P3) = 1 then Co,3 = C0, else the block either “kills” or
“generates” the carry.

19. What is Carry Select adder?


Write the principle of any one fast multiplier. (NOV 2016)
• Adding two n-bit numbers with a carry-select adder is done with two adders that perform
the calculation twice, once assuming a carry-in of 0 and once assuming a carry-in of 1.
• After the two results are calculated, the correct sum, as well as the correct carry-out, is
then selected with a multiplexer once the correct carry-in is known.
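
A behavioural sketch of the carry-select principle, assuming 4-bit blocks built from ripple adders (the block size, helper names and bit order, LSB first, are illustrative assumptions):

```python
def to_bits(value, n):
    """Unsigned integer to a list of n bits, LSB first."""
    return [(value >> i) & 1 for i in range(n)]

def from_bits(bits):
    return sum(bit << i for i, bit in enumerate(bits))

def ripple(a_bits, b_bits, cin):
    carry, out = cin, []
    for a, b in zip(a_bits, b_bits):
        out.append(a ^ b ^ carry)
        carry = (a & b) | (carry & (a ^ b))
    return out, carry

def carry_select_add(a_bits, b_bits, cin=0, block=4):
    carry, out = cin, []
    for start in range(0, len(a_bits), block):
        a_blk = a_bits[start:start + block]
        b_blk = b_bits[start:start + block]
        sum0, cout0 = ripple(a_blk, b_blk, 0)   # result assuming carry-in = 0
        sum1, cout1 = ripple(a_blk, b_blk, 1)   # result assuming carry-in = 1
        # multiplexer: the real carry-in picks one of the two precomputed results
        chosen, carry = (sum1, cout1) if carry else (sum0, cout0)
        out.extend(chosen)
    return out, carry

result_bits, result_carry = carry_select_add(to_bits(200, 8), to_bits(100, 8))
```

In hardware both candidate results of every block are computed at the same time, so only the multiplexer delay is added per block once the carry-in arrives.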

20. What is Carry save adder?


• In a carry save adder, the carry does not propagate, so it is faster than a carry propagate
adder.
• It has three inputs and produces two outputs, a sum and a carry. The carry-out is saved
rather than being used immediately to find the final sum value.
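
A word-level sketch of the carry save idea on Python integers: three operands are reduced to a sum word and a carry word with no carry propagation between bit positions. The function name is illustrative.

```python
def carry_save(x, y, z):
    """3:2 compression: returns (sum_word, carry_word) with x + y + z == sum_word + carry_word."""
    sum_word = x ^ y ^ z                               # bitwise sum without carries
    carry_word = ((x & y) | (y & z) | (x & z)) << 1    # saved carries, weighted one bit higher
    return sum_word, carry_word

s_word, c_word = carry_save(5, 6, 7)
```

A single carry propagate adder is still needed at the very end to add the sum word and the carry word.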

21. What are accumulators?


• Accumulator acts as a part of ALU and it is identified as register A. The result of an
operation performed in the ALU is stored in the accumulator.
• It is used to hold the data for manipulation (arithmetic and logical)

22. What are multipliers?
• Multiplier is used in computation process, which multiplies two binary numbers.
• Basic operations in multiplication are given below.
0 x 0 = 0, 0 x 1 = 0, 1 x 0 = 0, 1 x 1 = 1
23. Draw the truth table of multiplier.
The truth table of multiplier is

24. Mention the steps involved in multiplying by shifting.


• Let x = (0010)2 = (2)10.
• To multiply it by 2, shift x left by one bit: x = (0100)2 = (4)10.
• To divide it by 2, shift x right by one bit: x = (0001)2 = (1)10.
• So, a shift register can be used for multiplication or division by 2.
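
The same steps written with Python shift operators, plus one illustrative extension (multiplying by a constant as a sum of shifts) that is an assumption beyond the answer above:

```python
def times_two(x):
    return x << 1          # shift left by one bit

def half(x):
    return x >> 1          # shift right by one bit (integer division by 2)

def times_ten(x):
    # 10 * x = 8 * x + 2 * x, built from two left shifts and one addition
    return (x << 3) + (x << 1)
```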

25. Write the delay equation for array multiplier.


The equation for array multiplier is

26. State radix-2 booth encoding table. (April 2019)


In radix-2 booth multiplication partial product generation is done based on encoding
which is as given by Table.

Table: Booth encoding table with RADIX-2
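
A minimal sketch of radix-2 Booth recoding, assuming the usual encoding in which the bit pair (x_i, x_{i-1}) selects the operation: 00 and 11 give 0, 01 gives +multiplicand, 10 gives −multiplicand, with x_{-1} = 0. The names below are illustrative.

```python
# (x_i, x_{i-1}) -> recoded digit
BOOTH_RADIX2 = {(0, 0): 0, (0, 1): +1, (1, 0): -1, (1, 1): 0}

def booth_digits(multiplier, n_bits):
    """Recode an n-bit two's complement multiplier; digits are returned LSB first."""
    digits = []
    prev = 0                                  # implicit x_{-1} = 0
    for i in range(n_bits):
        cur = (multiplier >> i) & 1
        digits.append(BOOTH_RADIX2[(cur, prev)])
        prev = cur
    return digits

def booth_multiply(multiplicand, multiplier, n_bits):
    """Sum the partial products selected by the recoded digits."""
    digits = booth_digits(multiplier, n_bits)
    return sum(d * (multiplicand << i) for i, d in enumerate(digits))
```

For example, the multiplier 0011 is recoded to the digits −1, 0, +1, 0 (LSB first), that is 4 − 1 = 3, so a run of ones costs one subtraction and one addition.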


27. What is meant by divider circuit?
• A divider circuit performs the arithmetic division operation in digital circuits. Division is
carried out by repeated subtraction and addition.
28. What are the types of dividers available in VLSI?
There are two types of dividers. They are serial divider and parallel divider.

29. Compare serial divider and parallel divider.
• Serial divider is slow and parallel divider is fast in performance. Array divider is fast
compared with the serial divider. But hardware requirement is increased.
30. What is shift register?
• An n-bit rotation is specified by using the control word R0-n, and the L/R bit defines a left
or right shift.
• For example, let y3 y2 y1 y0 = a3 a2 a1 a0.
If it is rotated 1 bit to the left, we get y3 y2 y1 y0 = a2 a1 a0 a3.
If it is rotated 1 bit to the right, we get y3 y2 y1 y0 = a0 a3 a2 a1.
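
The rotation example can be reproduced with list slicing. Words are written most significant bit first, as in the example above; the function names are illustrative.

```python
def rotate_left(word, n=1):
    """Rotate a list [a3, a2, a1, a0] left by n positions."""
    n %= len(word)
    return word[n:] + word[:n]

def rotate_right(word, n=1):
    """Rotate a list [a3, a2, a1, a0] right by n positions."""
    n %= len(word)
    return word[-n:] + word[:-n]

word = ["a3", "a2", "a1", "a0"]
left_once = rotate_left(word)
right_once = rotate_right(word)
```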
31. What is meant by Barrel shifter?
• A barrel shifter is a digital circuit that can shift a data word by a specified number of bits
in one clock cycle.
• It can be implemented as a sequence of multiplexers (MUX). The output of one MUX is
connected to the input of the next MUX in a way that depends on the shift distance.
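
A behavioural sketch of a barrel rotator built as a chain of 2:1 multiplexer stages, where stage s rotates by 2^s when bit s of the shift amount is set. This logarithmic arrangement, the 8-bit width and the names used are assumptions for illustration; other multiplexer arrangements are possible.

```python
def mux(select, d0, d1):
    """2:1 multiplexer."""
    return d1 if select else d0

def barrel_rotate_left(word, shift, width=8):
    """Rotate `word` left by `shift` bits using log2(width) multiplexer stages."""
    mask = (1 << width) - 1
    word &= mask
    stage = 0
    while (1 << stage) < width:
        amount = 1 << stage
        rotated = ((word << amount) | (word >> (width - amount))) & mask
        # each stage either passes the word through or rotates it by 2**stage
        word = mux((shift >> stage) & 1, word, rotated)
        stage += 1
    return word

rotated_word = barrel_rotate_left(0b10010110, 3)
```

All stages are combinational, so any shift distance is completed in a single pass, which is the one-cycle behaviour described above.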
32. Draw the structure of 4 X 4 barrel shifter. (April 2018)

33. What is the area constraint between carry lookahead adder and ripple carry adder?
• The area of a carry lookahead adder is larger than the area of a ripple carry adder.
• Carry lookahead adders compute the carries in parallel, which requires a larger number of
gates and therefore results in a larger area.
34. What is the drawback of carry lookahead adder?
• The carry lookahead adder needs a large area because the computations are done in parallel,
and it consumes more power.

35. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 8 bit.

36. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 16 bit.

37. Draw the graph between area Vs delay of carry lookahead and ripple carry adder
for 32 bit.

38. What is meant by bit – sliced data path organization?(May 2016)


• Datapaths are arranged in a bit sliced organization.
• Instead of operating on single bit digital signals, the data in a processor are arranged in a
word based fashion. Bit slices are either identical or resemble a similar structure for all bits.

39. Determine propagation delay of n-bit carry select adder. (May 2016)
✓ The propagation delay of an N-bit (square-root) carry select adder is proportional to
$\sqrt{2N}$, where N is the number of bits of the adder.

40. Draw and list out the components of data path. (May 2017)
• Data path block consists of arithmetic operation, logical operation, shift operation and
temporary storage of operands.

41. Mention the application of Barrel shift register.


Why is barrel shifter very useful in the designing of arithmetic circuits? (NOV 2016)
• A common usage of a barrel shifter is in the hardware implementation of floating-
point arithmetic.
• A floating-point add or subtract operation requires aligning the two numbers by shifting the
smaller number to the right.
• This is done by using the barrel shifter to shift the smaller number to the right by the
difference of the exponents, in one cycle.

42. What is latency? (Nov 2017)
• Clock latency (or clock insertion delay) is defined as the amount of time taken by the
clock signal in traveling from its source to the sinks.

43. Give the applications of high speed adder. (May 2017)


• CMOS high speed adders are used in processors, data processing applications and
data path applications with low power consumption.

44. What is meant by booth multiplier?


• Booth’s algorithm multiplies two signed binary numbers in two’s complement notation and
has an efficient hardware implementation.
• Booth multiplication allows for smaller, faster multiplication circuits by recoding the
numbers that are multiplied.

45. What is meant by array multiplier?


• Array multiplier uses an array of cells for calculation.
• Multiplier circuit is based on repeated addition and shifting procedure. Each partial
product is generated by the multiplication of the multiplicand with one multiplier
digit.
• N-1 adders are required where N is the number of multiplier bits.
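
A behavioural sketch of the array multiplier idea for unsigned operands: each multiplier bit gates a shifted copy of the multiplicand (the partial product), and the N partial products are combined with N − 1 additions. The function name and the adder counter are illustrative.

```python
def array_multiply(multiplicand, multiplier, n_bits):
    """Unsigned multiply via AND-gated partial products. Returns (product, adders_used)."""
    partials = [
        (multiplicand if (multiplier >> i) & 1 else 0) << i   # partial product for bit i
        for i in range(n_bits)
    ]
    total = partials[0]
    adders_used = 0
    for partial in partials[1:]:
        total += partial          # one adder row per remaining partial product
        adders_used += 1
    return total, adders_used

product, adders = array_multiply(13, 11, 4)
```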
46. What is Wallace tree multiplier? And give its advantages.
The Wallace tree multiplier sums its partial products in a tree structure. It reduces the
number of components and reduces the area.
47. What are parameters used to characterize the memory?
Parameters used to characterize a memory device are area, power and speed.
48. How can you classify memory based on operation mode?
Classifications of memory based on operation mode are 1. ROM , 2. RAM
49. How can you classify memory based on data storage mode?
Classifications of memory based on data storage mode are 1. Volatile, 2. Non-volatile
50. Define ROM. Give some examples.
ROM (read-only memory) is a memory where the code is written only one time. Example
applications are washing machines, calculators, games, etc.
51. What are advantage and disadvantages of programming ROM?
Advantage: basic cell only consists of transistor. No need of connection to any of the
supply voltage.
Disadvantage: As it has pseudo nMOS, it is ratioed logic and consumes static power.

52. What is meant by non-volatile memory?


Non-volatile memory consists of an array of transistors placed on a word line – bit line
grid. It is programmed by selectively enabling or disabling these devices.

53. What is floating gate transistor?


Floating gate transistor is mostly used in all the reprogrammable memories. In floating
gate transistor, extra polysilicon strip is used in between the gate and the channel known
as floating gate.


54. What is RAM? And give types of RAM.


RAM is read and write memory. Types are static and dynamic RAM.

55. Distinguish Static and dynamic RAM.[Nov/Dec 2022]


Static RAM:
SRAM cell needs 6 transistors per bit. Bit line (BL) and inverse Bit Line signals are used
to improve the noise margin during read and write operations.
Dynamic RAM:
The content of the cell is periodically rewritten through a resistive load, which is called the
refresh operation. Here the cell content is read, followed by a write operation.

56. Draw the schematic of dynamic edge –triggered register. (Dec. 2016)

57. Design a one transistor DRAM cell. (Nov 2013, April 2015)
Draw a 1-transistor Dynamic RAM cell. (April 2019) [Nov/Dec 2022]

58. Design a three transistors DRAM cell.

59. State the merits of barrel shifter. (Nov 2019)


• It has small area and does not require a decoder
• Logarithmic shifter is more effective for larger shift values in terms of both area
and speed.

60. How to design a high speed adder? (Nov 2017)


• Design of high speed adder using CMOS and transmission gates in submicron.
• Design of high speed adder using parallel adder manner.


61. Mention the different hardware architecture used for multiplier. (Nov 2019)

Hardware architectures for multipliers protected by (a) linear arithmetic codes, (b)
[jxj ; j2xj ] multilinear codes, and (c) multi-modulus multilinear codes.

62. Draw the dot diagram for Wallace tree multiplier. [May 2021]

63. List the categories of memory arrays. [May 2021]


Classification based on operation mode:
1. ROM 2. RAM
Classification based on data storage mode:
It means on how it is stored and how long it remains there.
1. Volatile 2. Non-volatile
Classification based on access method:
1. Random access 2. Non-random access

64. State the need of a sense amplifier in a memory cell. (NOV 2021)[Apr/May 2022]
It senses the low-power signal from a bitline that represents a data bit (1 or 0) stored in a
memory cell, and amplifies the small voltage swing to recognizable logic levels so the data can be
interpreted properly.


65. What are the building blocks of digital architecture?


These blocks include arithmetic circuits, counters, shift registers, memory arrays, and logic
arrays.
66. Mention the steps for single bit addition.

1-Bit Adder (Half Adder)

• The simplest case arises when two one bit numbers are to be added.
• With one bit, only the numbers 0 and 1 can be represented.
• All possible scenarios can be summarized by the following table:
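
A minimal sketch that enumerates the four input cases, assuming the usual half adder definitions sum = a XOR b and carry = a AND b (the function and variable names are illustrative):

```python
def half_adder(a, b):
    """Add two one-bit numbers: returns (sum, carry)."""
    return a ^ b, a & b

# Rows of the form (a, b, sum, carry)
truth_table = [(a, b, *half_adder(a, b)) for a in (0, 1) for b in (0, 1)]
```

Only the case a = b = 1 produces a carry, because 1 + 1 = (10)2 does not fit in one bit.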
