AVLSI MOD1
MODULE-1
Introduction to ASICs: Full custom, Semi-custom and Programmable ASICs, ASIC Design flow, ASIC cell libraries.
CMOS Logic: Data path Logic Cells: Data Path Elements, Adders: Carry skip, Carry bypass,
Carry save, Carry select, Conditional sum, Multiplier (Booth encoding), Data path Operators, I/O cells,
Cell Compilers.
In a full-custom ASIC an engineer designs some or all of the logic cells, circuits, or layout specifically
for one ASIC.
This means the designer abandons the approach of using pretested and pre-characterized cells for all or
part of that design. It makes sense to take this approach only if there are no suitable existing cell libraries
available that can be used for the entire design.
This might be because existing cell libraries are not fast enough, or the logic cells are not small enough
or consume too much power. You may need to use full-custom design if the ASIC technology is new or so
specialized that there are no existing cell libraries or because the ASIC is so specialized that some circuits must
be custom designed.
Fewer and fewer full-custom ICs are being designed because of the problems with these special parts of
the ASIC. There is one growing member of this family, though, the mixed analog/digital ASIC.
Bipolar technology has historically been used for precision analog functions. There are some
fundamental reasons for this. In all integrated circuits the matching of component characteristics between chips
is very poor, while the matching of characteristics between components on the same chip is excellent.
Suppose we have transistors T1, T2, and T3 on an analog/digital ASIC. The three transistors are all the
same size and are constructed in an identical fashion. Transistors T1 and T2 are located adjacent to each other
and have the same orientation. Transistor T3 is the same size as T1 and T2 but is located on the other side of the
chip from T1 and T2 and has a different orientation. ICs are made in batches called wafer lots.
(A wafer lot is a group of silicon wafers that are all processed together. Usually there are between 5 and
30 wafers in a lot. Each wafer can contain tens or hundreds of chips depending on the size of the IC and the
wafer.)
If we were to make measurements of the characteristics of transistors T1, T2, and T3 we would find the
following:
1) Transistor T1 will have virtually identical characteristics to T2 on the same IC. We say that the transistors
match well or the tracking between devices is excellent.
2) Transistor T3 will match transistors T1 and T2 on the same IC very well, but not as closely as T1 matches
T2 on the same IC.
3) Transistors T1, T2, and T3 will match fairly well with transistors T1, T2, and T3 on a different IC on the
same wafer. The matching will depend on how far apart the two ICs are on the wafer.
Prof. MANJUNATH E., Dept. of ECE, Dr. TTIT, KGF Page No: 1
Advanced VLSI (21EC71)
4) Transistors on ICs from different wafers in the same wafer lot will not match very well.
5) Transistors on ICs from different wafer lots will match very poorly.
For many analog designs the close matching of transistors is crucial to circuit operation. For these circuit
designs pairs of transistors are used, located adjacent to each other. Device physics dictates that a pair of bipolar
transistors will always match more precisely than CMOS transistors of a comparable size. Bipolar technology
has historically been more widely used for full-custom analog design because of its improved precision. Despite
its poorer analog properties, the use of CMOS technology for analog functions is increasing. There are two
reasons for this. The first reason is that CMOS is now by far the most widely available IC technology. Many
more CMOS ASICs and CMOS standard products are now
1.2 Semi-Custom ASIC Design:
A semi-custom ASIC (Application-Specific Integrated Circuit) design strikes a balance between the
flexibility of fully custom ASICs and the cost-effectiveness of standard ASIC designs.
What is Semi-Custom ASIC Design?
Semi-Custom ASICs use pre-designed building blocks or modules, known as standard cells, which are
then customized to meet specific requirements. This approach leverages predefined circuit elements and allows
for modifications and integrations to create a chip that meets particular needs without the high cost and time
associated with fully custom designs.
1.2.1 Standard-Cell Based ASICs:
A cell-based ASIC uses predesigned logic cells (AND gates, OR gates, multiplexers, and flip-flops, for
example) known as standard cells. We could apply the term CBIC to any IC that uses cells, but it is generally
accepted that a cell-based ASIC, or CBIC, means a standard-cell based ASIC.
The standard-cell areas (also called flexible blocks) in a CBIC are built of rows of standard cells like a
wall built of bricks. The standard-cell areas may be used in combination with larger predesigned cells, perhaps
microcontrollers or even microprocessors, known as mega cells. Mega cells are also called mega functions, full-
custom blocks, system-level macros (SLMs), fixed blocks, cores, or Functional Standard Blocks (FSBs).
The ASIC designer defines only the placement of the standard cells and the interconnect in a CBIC.
However, the standard cells can be placed anywhere on the silicon; this means that all the mask layers of a
CBIC are customized and are unique to a customer. The advantage of CBICs is that designers save time and
money and reduce risk by using a predesigned, pretested, and precharacterized standard-cell library. In addition,
each standard cell can be optimized individually. During the design of the cell library each transistor in every
standard cell can be chosen to maximize speed or minimize area.
Figure 1.2.1.1 shows a CBIC.
The important features of this type of ASIC are as follows:
1) All mask layers are customized: transistors and interconnect.
Power supplies (labelled VDD and GND) run horizontally inside a standard cell on a metal layer that lies above
the transistor layers. Each different shaded and labelled pattern represents a different layer. This standard cell
has centre connectors (the three squares, labelled A1, B1, and Z) that allow the cell to connect to others. The
layout was drawn using ROSE, a symbolic layout editor developed by Rockwell and Compass, and then
imported into Tanner Research’s L-Edit.
Fig. 1.2.1.3 Two feedthroughs: one in cell A.14 and one in cell A.23
A connection that needs to cross over a row of standard cells uses a feedthrough. The term feedthrough
can refer either to the piece of metal that is used to pass a signal through a cell or to a space in a cell waiting to
be used as a feedthrough, which is very confusing. Figure 1.2.1.3 shows two feedthroughs: one in cell A.14 and one in
cell A.23.
In both two-level and three-level metal technology, the power buses (VDD and GND) inside the
standard cells normally use the lowest (closest to the transistors) layer of metal (metal1). The width of each row
of standard cells is adjusted so that they may be aligned using spacer cells.
The power buses, or rails, are then connected to additional vertical power rails using row-end cells at the
aligned ends of each standard-cell block. If the rows of standard cells are long, then vertical power rails can also
be run in metal2 through the cell rows using special power cells that just connect to VDD and GND. Usually the
designer manually controls the number and width of the vertical power rails connected to the standard-cell
blocks during physical design.
1.2.2 Gate-Array Based ASICs:
In a gate array (sometimes abbreviated to GA) or gate-array based ASIC the transistors are predefined
on the silicon wafer. The predefined pattern of transistors on a gate array is the base array, and the smallest
element that is replicated to make the base array (tiles on a floor) is the base cell (sometimes called a primitive
cell). Only the top few layers of metal, which define the interconnect between transistors, are defined by the
designer using custom masks. To distinguish this type of gate array from other types of gate array, it is often
called a masked gate array (MGA). The designer chooses from a gate-array library of predesigned and
precharacterized logic cells.
The logic cells in a gate-array library are often called macros. The reason for this is that the base-cell
layout is the same for each logic cell, and only the interconnect (inside cells and between cells) is customized,
so that there is a similarity between gate-array macros and a software macro. Inside IBM, gate-array macros are
known as books (so that books are part of a library), but unfortunately this descriptive term is not very widely
used outside IBM.
There are the following different types of MGA or gate-array based ASICs:
1) Channelled gate arrays.
2) Channelless gate arrays.
3) Structured gate arrays.
1.2.2.1 Channelled Gate Array:
Fig. 1.2.2.1 A channelled gate-array die. The spaces between rows of the base cells are set aside for interconnect
• Only some (the top few) mask layers are customized: the interconnect.
• Manufacturing lead time is between two days and two weeks.
1.2.2.2 Channelless Gate Array:
Fig. 1.2.2.2 A channelless gate-array or sea-of-gates (SOG) array die. The core area of the die is filled with an array of base cells.
The key difference between a channelless gate array and a channelled gate array is that there are no
predefined areas set aside for routing between cells on a channelless gate array. Instead we route over the top of
the gate-array devices.
We can do this because we customize the contact layer that defines the connections between metal-1, the
first layer of metal, and the transistors. When we use an area of transistors for routing in a channelless array, we
do not make any contacts to the devices lying underneath; we simply leave the transistors unused.
The logic density (the amount of logic that can be implemented in a given silicon area) is higher for
channelless gate arrays than for channelled gate arrays. This is usually attributed to the difference in structure
between the two types of array. In fact, the difference occurs because the contact mask is customized in a
channelless gate array but is not usually customized in a channelled gate array.
This leads to denser cells in the channelless architectures. Customizing the contact layer in a channelless
gate array allows us to increase the density of gate-array cells because we can route over the top of unused
contact sites.
1.2.2.3 Structured Gate Array:
An embedded gate array or structured gate array (also known as master slice or master image) combines
some of the features of CBICs and MGAs. One of the disadvantages of the MGA is the fixed gate-array base
cell. This makes the implementation of memory, for example, difficult and inefficient.
In an embedded gate array, we set aside some of the IC area and dedicate it to a specific function. This
embedded area either can contain a different base cell that is more suitable for building memory cells, or it can
contain a complete circuit block, such as a microcontroller.
Figure 1.2.2.3 shows an embedded gate array. The important features of this type of MGA are the
following:
Fig. 1.2.2.3 A structured or embedded gate-array die showing an embedded block in the upper left corner (a static random-access
memory, for example). The rest of the die is filled with an array of base cells.
An embedded gate array gives the improved area efficiency and increased performance of a CBIC but
with the lower cost and faster turnaround of an MGA. One disadvantage of an embedded gate array is that the
embedded function is fixed.
For example, if an embedded gate array contains an area set aside for a 32 k-bit memory, but we only
need a 16 k-bit memory, then we may have to waste half of the embedded memory function. However, this may
still be more efficient and cheaper than implementing a 32 k-bit memory using macros on a SOG array.
ASIC vendors may offer several embedded gate array structures containing different memory types and
sizes as well as a variety of embedded functions.
ASIC companies wishing to offer a wide range of embedded functions must ensure that enough
customers use each different embedded gate array to give the cost advantages over a custom gate array or CBIC
(the Sun Microsystems SPARCstation 1 described in Section 1.3 made use of LSI Logic embedded gate arrays
and the 10K and 100K series of embedded gate arrays were two of LSI Logic’s most successful products).
1.3 Programmable ASICs
Programmable ASICs, also known as Programmable Logic Devices (PLDs), are integrated circuits that
can be configured or programmed to perform specific tasks or functions after manufacturing. This
programmability provides flexibility and customization options for various applications.
Examples: FPGA, CPLD, PSoC, PLD, etc.
Programmable ASICs offer a powerful solution for many applications where flexibility, rapid
prototyping, and cost-effectiveness are important. They allow designers to implement custom digital functions
without the need for custom silicon, making them valuable tools in a wide range of industries.
The first choice, using an ASIC-vendor library, requires you to use a set of design tools approved by the
ASIC vendor to enter and simulate your design. You have to buy the tools, and the cost of the cell library is
folded into the NRE. Some ASIC vendors (especially for MGAs) supply tools that they have developed in-house.
An ASIC vendor library is normally a phantom library: the cells are empty boxes, or phantoms, but
contain enough information for layout. After you complete layout you hand off a netlist to the ASIC vendor,
who fills in the empty boxes (phantom instantiation) before manufacturing your chip.
The second and third choices require you to make a buy-or-build decision. If you complete an ASIC
design using a cell library that you bought, you also own the masks (the tooling) that are used to manufacture
your ASIC. This is called customer-owned tooling (COT). A library vendor normally develops a cell library
using information about a process supplied by an ASIC foundry.
An ASIC foundry (in contrast to an ASIC vendor) only provides manufacturing, with no design help. If
the cell library meets the foundry specifications, we call this a qualified cell library.
These cell libraries are normally expensive (possibly several hundred thousand dollars), but if a library
is qualified at several foundries this allows you to shop around for the most attractive terms. This means that
buying an expensive library can be cheaper in the long run than the other solutions for high-volume production.
The third choice is to develop a cell library in-house. Many large computer and electronics companies
make this choice. Most of the cell libraries designed today are still developed in-house despite the fact that the
process of library development is complex and very expensive.
However created, each cell in an ASIC cell library must contain the following:
➢ A physical layout
➢ A behavioral model
➢ A Verilog/VHDL model
➢ A detailed timing model
➢ A test strategy
➢ A circuit schematic
➢ A cell icon
➢ A wire-load model
➢ A routing model
For MGA and CBIC cell libraries we need to complete cell design and cell layout. The ASIC designer
may not actually see the layout if it is hidden inside a phantom, but the layout will be needed eventually.
In a programmable ASIC the cell layout is part of the programmable ASIC design. The ASIC designer
needs a high-level, behavioral model for each cell because simulation at the detailed timing level takes too long
for a complete ASIC design. For a NAND gate a behavioral model is simple. A multiport RAM model can be
very complex. The designer may require Verilog and VHDL models in addition to the models for a particular
logic simulator. ASIC designers also need a detailed timing model for each cell to determine the performance of
the critical pieces of an ASIC. It is too difficult, too time-consuming, and too expensive to build every cell in
silicon and measure the cell delays. Instead library engineers simulate the delay of each cell, a process known as
characterization. Characterizing a standard-cell or gate-array library involves circuit extraction from the full-
custom cell layout for each cell. The extracted schematic includes all the parasitic resistance and capacitance
elements.
Then library engineers perform a simulation of each cell including the parasitic elements to determine
the switching delays. The simulation models for the transistors are derived from measurements on special chips
included on a wafer called process control monitors (PCMs) or drop-ins. Library engineers then use the results
of the circuit simulation to generate detailed timing models for logic simulation.
All ASICs need to be production tested (programmable ASICs may be tested by the manufacturer before
they are customized, but they still need to be tested). Simple cells in small or medium-size blocks can be tested
using automated techniques, but large blocks such as RAM or multipliers need a planned strategy.
The cell schematic (a netlist description) describes each cell so that the cell designer can perform
simulation for complex cells. You may not need the detailed cell schematic for all cells, but you need enough
information to compare what you think is on the silicon (the schematic) with what is actually on the silicon (the
layout); this is a layout versus schematic (LVS) check.
If the ASIC designer uses schematic entry, each cell needs a cell icon together with connector and
naming information that can be used by design tools from different vendors. We shall cover ASIC design using
schematic entry in Chapter 9. One of the advantages of using logic synthesis rather than schematic design entry
is eliminating the problems with icons, connectors, and cell names. Logic synthesis also makes moving an
ASIC between different cell libraries, or retargeting, much easier.
In order to estimate the parasitic capacitance of wires before we actually complete any routing, we need
a statistical estimate of the capacitance for a net in a given size circuit block. This usually takes the form of a
look-up table known as a wire-load model. We also need a routing model for each cell. Large cells are too
complex for the physical design or layout tools to handle directly and we need a simpler representation (a phantom)
of the physical layout that still contains all the necessary information. The phantom may include information
that tells the automated routing tool where it can and cannot place wires over the cell, as well as the location and
types of the connections to the cell.
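As a rough illustration of the look-up-table idea behind a wire-load model, the sketch below estimates net capacitance from fanout before routing. The table values, the units, and the names `WIRE_LOAD_PF` and `estimate_net_capacitance` are hypothetical and are not taken from any real cell library; the linear extrapolation beyond the table is also an assumption made for this example.

```python
# Hypothetical wire-load table for one block size:
# key = fanout of the net, value = estimated net capacitance in pF.
# These numbers are made up purely for illustration.
WIRE_LOAD_PF = {1: 0.015, 2: 0.030, 3: 0.047, 4: 0.065}


def estimate_net_capacitance(fanout, table=WIRE_LOAD_PF):
    """Estimate the capacitance of a net before any routing exists.

    Assumes the table has contiguous integer fanout entries starting at 1.
    Beyond the last entry we extrapolate linearly using the last slope.
    """
    if fanout < 1:
        raise ValueError("fanout must be at least 1")
    if fanout in table:
        return table[fanout]
    largest = max(table)
    slope = table[largest] - table[largest - 1]
    return table[largest] + slope * (fanout - largest)


if __name__ == "__main__":
    for fo in (1, 3, 6):
        print(fo, round(estimate_net_capacitance(fo), 3))
```

A real flow would pick a different table for each block size, since larger blocks tend to have longer nets.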
1.4 CMOS Logic:
A CMOS transistor (or device) has four terminals: gate, source, drain, and a fourth (substrate or bulk) terminal that we ignore here. A CMOS transistor is a switch.
The switch must be conducting or on to allow current to flow between the source and drain terminals (using
open and closed for switches is confusing for the same reason we say a tap is on and not that it is closed). The
transistor source and drain terminals are equivalent as far as digital signals are concerned; we do not worry about
labeling an electrical switch with two terminals.
1.4.1 Data path Logic Cells:
Suppose we wish to build an n -bit adder (that adds two n -bit numbers) and to exploit the regularity of
this function in the layout. We can do so using a data path structure. The following two functions, SUM and
COUT, implement the sum and carry out for a full adder (FA) with two data inputs (A, B) and a carry in, CIN:
SUM = A ⊕ B ⊕ CIN = SUM (A, B, CIN) = PARITY (A, B, CIN)
COUT = A · B + A · CIN + B · CIN = MAJ (A, B, CIN)
The sum uses the parity function ('1' if there are an odd number of '1's in the inputs). The carry out,
COUT, uses the 2-of-3 majority function ('1' if the majority of the inputs are '1'). We can combine these two
functions in a single FA logic cell, ADD (A[ i ], B[ i ], CIN, S[ i ], COUT), shown in Figure 1.4.1, where
S[ i ] = SUM (A[ i ], B[ i ], CIN)
COUT = MAJ (A[ i ], B[ i ], CIN)
• Parity function ('1' for an odd number of '1's)
• Majority function ('1' if most of the inputs are '1')
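The parity and majority functions can be checked against ordinary arithmetic with a few lines of code. This is a minimal behavioural sketch, not a library cell model; the function names `parity`, `maj`, and `full_adder` are chosen here for illustration.

```python
from itertools import product


def parity(a, b, cin):
    """SUM output: '1' when an odd number of the inputs are '1'."""
    return a ^ b ^ cin


def maj(a, b, cin):
    """COUT output: '1' when at least two of the three inputs are '1'."""
    return (a & b) | (a & cin) | (b & cin)


def full_adder(a, b, cin):
    """Return (SUM, COUT) for one full-adder bit slice."""
    return parity(a, b, cin), maj(a, b, cin)


if __name__ == "__main__":
    print("A B CIN | SUM COUT")
    for a, b, cin in product((0, 1), repeat=3):
        s, cout = full_adder(a, b, cin)
        print(a, b, cin, " |", s, cout)
```

For every row of the truth table, SUM + 2·COUT equals the arithmetic sum A + B + CIN, which is exactly what a full adder must satisfy.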
Fig 1.4.1.1 (a) A data bus is shown by a heavy line (1.5 point) and a bus symbol. If the bus is n -bits wide then MSB = n – 1. (b) An
alternative symbol for an adder. (c) Control signals are shown as lightweight (0.5 point) lines.
Fig 1.4.1.1 shows typical datapath symbols for an adder (people rarely use the IEEE standards in ASIC datapath
libraries). We use heavy lines (they are 1.5 point wide) with a stroke to denote a data bus (that flows in the
horizontal direction in a datapath), and regular lines (0.5 point) to denote the control signals (that flow vertically
in a datapath). At the risk of adding confusion where there is none, this stroke to indicate a data bus has nothing
to do with mixed-logic conventions. For a bus, A[31:0] denotes a 32-bit bus with A[31] as the leftmost or
most-significant bit or MSB, and A[0] as the least-significant bit or LSB. Sometimes we shall use A[MSB] or
A[LSB] to refer to these bits. Notice that if we have an n-bit bus and LSB = 0, then MSB = n – 1. Also, for
example, A[4] is the fifth bit on the bus (from the LSB). We use a 'Σ' or 'ADD' inside the symbol to denote an
adder instead of '+', so we can attach '–' or '+/–' to the inputs for a subtractor or adder/subtractor.
Some schematic datapath symbols include only data signals and omit the control signals—but we must
not forget them. In Fig 1.4.1.1(c), for example, we may need to explicitly tie CIN[0] to VSS and use COUT[MSB]
and COUT[MSB – 1] to detect overflow.
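The overflow check mentioned above can be sketched as follows. This example assumes two's-complement operands and uses the common rule that signed overflow occurs when the carry out of the MSB stage differs from the carry out of the stage below it; the helper names are illustrative only.

```python
def add_with_carries(a, b, n, cin=0):
    """Add two n-bit patterns bit by bit.

    Returns (sum_bits, cout) where cout[i] is the carry out of stage i.
    CIN[0] is tied to '0' by default, as for a plain adder.
    """
    cout = []
    total = 0
    carry = cin
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        total |= (ai ^ bi ^ carry) << i
        carry = (ai & bi) | (ai & carry) | (bi & carry)
        cout.append(carry)
    return total, cout


def signed_overflow(a, b, n):
    """Two's-complement overflow flag: COUT[MSB] XOR COUT[MSB - 1] (n >= 2)."""
    _, cout = add_with_carries(a, b, n)
    return cout[n - 1] ^ cout[n - 2]


if __name__ == "__main__":
    # 4-bit examples: 7 + 1 overflows, (-1) + 1 does not.
    print(signed_overflow(0b0111, 0b0001, 4))
    print(signed_overflow(0b1111, 0b0001, 4))
```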
1.5 Adders:
We can view addition in terms of generate, G[i], and propagate, P[i], signals:
G[i] = A[i] · B[i]
P[i] = A[i] ⊕ B[i]
C[i] = G[i] + P[i] · C[i – 1]
S[i] = P[i] ⊕ C[i – 1]
where C[i] is the carry-out signal from stage i, equal to the carry in of stage (i + 1). Thus, C[i] = COUT[i] =
CIN[i + 1]. We need to be careful because C[0] might represent either the carry in or the carry out of the LSB
stage. For an adder we set the carry in to the first stage (stage zero), C[–1] or CIN[0], to '0'.
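A small behavioural sketch can make the generate/propagate view concrete. It assumes the usual definitions G[i] = A[i] · B[i] and P[i] = A[i] ⊕ B[i], with the carry rippling from stage to stage; the function name `rca_generate_propagate` is illustrative.

```python
def rca_generate_propagate(a, b, n, cin=0):
    """n-bit ripple-carry addition written with generate/propagate signals.

    c_prev plays the role of C[i - 1]; it starts as C[-1] = CIN[0].
    Returns (sum, carry_out_of_last_stage).
    """
    c_prev = cin
    total = 0
    for i in range(n):
        ai = (a >> i) & 1
        bi = (b >> i) & 1
        g = ai & bi              # stage generates a carry by itself
        p = ai ^ bi              # stage passes an incoming carry along
        total |= (p ^ c_prev) << i
        c_prev = g | (p & c_prev)
    return total, c_prev


if __name__ == "__main__":
    s, cout = rca_generate_propagate(0b1011, 0b0110, 4)
    print(bin(s), cout)
```

The loop body depends on `c_prev` from the previous iteration, which is why the delay of this structure grows in proportion to n.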
Consider a conventional ripple-carry adder (RCA). The delay of an n-bit RCA is proportional to n and is limited by the
propagation of the carry signal through all of the stages. We can reduce delay by using pairs of “go-faster”
bubbles to change AND and OR gates to fast two-input NAND gates as shown in Figure (a). Alternatively, we
can write the equations for the carry signal in two different ways:
(or)
Fig 1.5. The carry-save adder (CSA). (a) A CSA cell. (b) A 4-bit CSA. (c) Symbol for a CSA. (d) A four-input CSA. (e) The
datapath for a four-input, 4-bit adder using CSAs with a ripple-carry adder (RCA) as the final stage. (f) A pipelined adder. (g) The
datapath for the pipelined version showing the pipeline registers as well as the clock control lines that use m2.
Adders based on this principle are called carry-bypass adders (CBA). Large, custom adders employ Manchester-
carry chains to compute the carries and the bypass operation using TGs or just pass transistors. These types of
carry chains may be part of a predesigned ASIC adder cell, but are not used by ASIC designers.
1.5.3 Carry Skip Adder:
Instead of checking the propagate signals we can check the inputs. For example, we can compute
SKIP = (A[i – 1] ⊕ B[i – 1]) · (A[i] ⊕ B[i]), which is '1' only when both stages propagate, and then use a 2:1 MUX to select C[i]. Thus,
C[i] = SKIP · C[i – 2] + SKIP' · (G[i] + P[i] · C[i – 1])
where SKIP' is the complement of SKIP. This is a carry-skip adder. Carry-bypass and carry-skip adders may include redundant logic (since the carry is
computed in two different ways, we just take the first signal to arrive). We must be careful that the
redundant logic is not optimized away during logic synthesis.
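The redundancy in a carry-skip stage can be seen in a short sketch. It assumes a two-stage group (stages i – 1 and i) and assumes that the skip condition requires both stages to propagate; the function name and argument names are illustrative.

```python
from itertools import product


def carry_skip_pair(a_lo, b_lo, a_hi, b_hi, c_in):
    """Carry out of a two-stage group, given the carry into the group.

    c_in stands for C[i - 2]. Returns (muxed_carry, rippled_carry) so the
    two ways of computing C[i] can be compared.
    """
    g_lo, p_lo = a_lo & b_lo, a_lo ^ b_lo
    g_hi, p_hi = a_hi & b_hi, a_hi ^ b_hi
    c_lo = g_lo | (p_lo & c_in)           # C[i - 1], rippled
    c_ripple = g_hi | (p_hi & c_lo)       # C[i], rippled through both stages
    skip = p_lo & p_hi                    # both stages propagate
    c_mux = c_in if skip else c_ripple    # the 2:1 MUX
    return c_mux, c_ripple


if __name__ == "__main__":
    same = all(
        carry_skip_pair(*bits)[0] == carry_skip_pair(*bits)[1]
        for bits in product((0, 1), repeat=5)
    )
    print("skip path agrees with ripple path:", same)
```

Both outputs are always logically equal, which is why a synthesis tool may treat the skip path as redundant and remove it unless it is told not to.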
1.5.4 Carry Look Ahead Adder:
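If we expand the recursive carry equation so that each carry is written directly in terms of the generate and
propagate signals and the carry in, we can look ahead and compute all the carries in parallel instead of waiting
for them to ripple. The following equations represent the carry-lookahead adder for 4 bits (here C[i] denotes the
carry into stage i):
C[0] = Cin
C[1] = G[0] + P[0] · C[0]
C[2] = G[1] + P[1] · G[0] + P[1] · P[0] · C[0]
C[3] = G[2] + P[2] · G[1] + P[2] · P[1] · G[0] + P[2] · P[1] · P[0] · C[0]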
Fig 1.5.4 The Brent–Kung carry-lookahead adder (CLA). (a) Carry generation in a 4-bit CLA. (b) A cell to generate the lookahead
terms, C[0]–C[3]. (c) Cells L1, L2, and L3 are rearranged into a tree that has less delay. Cell L4 is added to calculate C[2] that is
lost in the translation. (d) and (e) Simplified representations of parts a and c. (f) The lookahead logic for an 8-bit adder. The
inputs, 0–7, are the propagate and carry terms formed from the inputs to the adder. (g) An 8-bit Brent–Kung CLA.
The outputs of the look ahead logic are the carry bits that (together with the inputs) form the sum. One
advantage of this adder is that delays from the inputs to the outputs are more nearly equal than in other adders.
This tends to reduce the number of unwanted and unnecessary switching events and thus reduces power
dissipation.
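The lookahead idea can be sketched for 4 bits as follows. This is a flat expansion of the carry recursion, not the Brent–Kung tree of the figure, and it assumes the convention that C[0] is the carry in and C[i] is the carry into stage i; the function name `cla4` is illustrative.

```python
def cla4(a, b, cin=0):
    """4-bit carry-lookahead addition with fully expanded carry equations.

    Every carry is a two-level function of G, P, and cin, so no carry
    waits for another carry to be computed first.
    Returns (4-bit sum, carry out).
    """
    g = [((a >> i) & 1) & ((b >> i) & 1) for i in range(4)]
    p = [((a >> i) & 1) ^ ((b >> i) & 1) for i in range(4)]
    c0 = cin
    c1 = g[0] | (p[0] & c0)
    c2 = g[1] | (p[1] & g[0]) | (p[1] & p[0] & c0)
    c3 = (g[2] | (p[2] & g[1]) | (p[2] & p[1] & g[0])
          | (p[2] & p[1] & p[0] & c0))
    c4 = (g[3] | (p[3] & g[2]) | (p[3] & p[2] & g[1])
          | (p[3] & p[2] & p[1] & g[0])
          | (p[3] & p[2] & p[1] & p[0] & c0))
    carries = [c0, c1, c2, c3]
    total = 0
    for i in range(4):
        total |= (p[i] ^ carries[i]) << i   # S[i] = P[i] XOR carry into stage i
    return total, c4


if __name__ == "__main__":
    print(cla4(0b1001, 0b0111))
```

The product terms grow wider with each bit, which is one reason practical designs arrange the lookahead logic as a tree rather than expanding it flat for many bits.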
1.5.5 Carry Select adder:
In a carry-select adder we duplicate two small adders (usually 4-bit or 8-bit adders—often CLAs) for the
cases CIN = '0' and CIN = '1' and then use a MUX to select the case that we need—wasteful, but fast. A carry-
select adder is often used as the fast adder in a datapath library because its layout is regular.
We can use the carry-select, carry-bypass, and carry-skip architectures to split a 12-bit adder, for
example, into three blocks. The delay of the adder is then partly dependent on the delays of the MUX between
each block. Suppose the delay due to 1-bit in an adder block (we shall call this a bit delay) is approximately
equal to the MUX delay. In this case it may be faster to make the blocks 3, 4, and 5 bits long instead of being
equal in size. Now the delays into the final MUX are equal—3 bit-delays plus 2 MUX delays for the carry
signal from bits 0–6 and 5 bit-delays for the carry from bits 7–11. Adjusting the block size reduces the delay of
large adders (more than 16 bits).
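A behavioural sketch of the carry-select idea, using the 3, 4, and 5-bit block split from the 12-bit example: each block is evaluated for both possible carry-in values and a MUX picks one. The small block adder is modelled with ordinary integer addition as a stand-in for a CLA, and all names are illustrative.

```python
def block_add(a, b, width, cin):
    """Stand-in for a small adder block: returns (sum bits, carry out)."""
    total = a + b + cin
    return total & ((1 << width) - 1), total >> width


def carry_select_add(a, b, block_widths=(3, 4, 5), cin=0):
    """Carry-select addition over blocks listed from the LSB end.

    Each block computes both results (CIN = 0 and CIN = 1); the carry
    from the previous block only drives the select input of a MUX.
    """
    result = 0
    shift = 0
    carry = cin
    for width in block_widths:
        mask = (1 << width) - 1
        a_blk = (a >> shift) & mask
        b_blk = (b >> shift) & mask
        sum0, cout0 = block_add(a_blk, b_blk, width, 0)   # case CIN = '0'
        sum1, cout1 = block_add(a_blk, b_blk, width, 1)   # case CIN = '1'
        blk_sum, carry = (sum1, cout1) if carry else (sum0, cout0)
        result |= blk_sum << shift
        shift += width
    return result, carry


if __name__ == "__main__":
    print(carry_select_add(0xABC, 0x123))
```

In hardware both cases of every block are computed at the same time, so the critical path is one block delay plus the chain of MUX selections.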
Fig 1.5.6 The simplest form of an n-bit conditional-sum adder that uses n single-bit conditional adders, H (each with
four outputs: two conditional sums, true carry, and complement carry), together with a tree of 2:1 MUXes (Qi_j). The conditional-
sum adder is usually the fastest of all the adders we have discussed.
1.6 Multipliers:
The Booth multiplication algorithm is a technique used in computer architecture to efficiently multiply
binary numbers. It was developed by Andrew Donald Booth in 1951 and has since become a fundamental
component of many processor designs.
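Advantages: Booth's algorithm recodes the multiplier so that a run of consecutive '1's costs only one subtraction
and one addition. For multipliers that contain long runs of '1's it therefore needs fewer add/subtract steps than
simple shift-and-add multiplication, and it handles signed (two's-complement) operands directly.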
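The recoding can be sketched in its simplest (radix-2) form. This is an illustrative software model, not a hardware description; the names `booth_digits` and `booth_multiply` are chosen here, and operands are assumed to be two's-complement integers that fit in n bits.

```python
def booth_digits(multiplier, n):
    """Radix-2 Booth recoding of an n-bit two's-complement multiplier.

    Digit i is y[i - 1] - y[i], with y[-1] = 0, so each digit is -1, 0, or +1.
    Digits are returned LSB first.
    """
    y = multiplier & ((1 << n) - 1)   # n-bit two's-complement pattern
    digits = []
    prev = 0
    for i in range(n):
        cur = (y >> i) & 1
        digits.append(prev - cur)
        prev = cur
    return digits


def booth_multiply(multiplicand, multiplier, n):
    """Multiply using one add or subtract per nonzero Booth digit."""
    lo, hi = -(1 << (n - 1)), (1 << (n - 1)) - 1
    if not lo <= multiplier <= hi:
        raise ValueError("multiplier does not fit in n bits")
    product = 0
    for i, d in enumerate(booth_digits(multiplier, n)):
        if d:                          # zero digits cost nothing
            product += d * (multiplicand << i)
    return product


if __name__ == "__main__":
    print(booth_digits(7, 4))          # run of three '1's
    print(booth_multiply(-3, 7, 4))
```

For the multiplier 0111 the recoded digits contain only two nonzero entries, so two add/subtract steps replace the three additions of plain shift-and-add.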
Figure below shows a symmetric 6-bit array multiplier (an n -bit multiplier multiplies two n -bit numbers; we
shall use n -bit by m -bit multiplier if the lengths are different). Adders a0–f0 may be eliminated, which then
eliminates adders a1–a6, leaving an asymmetric CSA array of 30 (5 × 6) adders (including one half adder). An n
-bit array multiplier has a delay proportional to n plus the delay of the CPA.
There are two items we can attack to improve the performance of a multiplier:
1. The number of partial products and
We can recode (or encode) any binary number, B, as a CSD vector, D, as follows (canonical means there
is only one CSD vector for any number):
D[i] = B[i] + C[i] – 2C[i + 1]
where C[i + 1] is the carry from the sum of B[i + 1] + B[i] + C[i] (we start with C[0] = 0).
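The recoding rule can be tried out with a short sketch. It assumes an unsigned n-bit input, takes bits beyond the MSB as '0', and produces n + 1 digits so that a final carry is not lost; the function name `to_csd` is illustrative.

```python
def to_csd(b, n):
    """Recode an unsigned n-bit number into CSD digits (LSB first).

    Implements D[i] = B[i] + C[i] - 2*C[i + 1], where C[i + 1] is the carry
    of B[i + 1] + B[i] + C[i] and C[0] = 0. Returns n + 1 digits, each
    -1, 0, or +1.
    """
    if not 0 <= b < (1 << n):
        raise ValueError("b must fit in n unsigned bits")

    def bit(i):
        return (b >> i) & 1

    digits = []
    carry = 0                                   # C[0] = 0
    for i in range(n + 1):
        carry_next = 1 if bit(i + 1) + bit(i) + carry >= 2 else 0
        digits.append(bit(i) + carry - 2 * carry_next)
        carry = carry_next
    return digits


if __name__ == "__main__":
    # 7 = 0111 recodes to +8 - 1
    print(to_csd(0b0111, 4))
```

Fewer nonzero digits mean fewer partial products to add, which is the reason for recoding the multiplier.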
Fig. 1.6.1 Tree-based multiplication. (a) The portion of the figure above that calculates the sum bit, P5, using a chain of
adders (cells a0–f5). (b) We can collapse this chain to a Wallace tree (cells 5.1–5.5). (c) The stages of multiplication
Fig. 1.7 Symbols for datapath elements. (a) An array or vector of flip-flops (a register). (b) A two-input NAND cell with databus
inputs. (c) A two-input NAND cell with a control input. (d) A buswide MUX. (e) An incrementer/decrementer. (f) An all-zeros
detector. (g) An all-ones detector. (h) An adder/subtracter.
1.8 I/O Cells:
Consider a three-state bidirectional output buffer (Tri-State ® is a registered trademark of National
Semiconductor). When the output enable (OE) signal is high, the circuit functions as a noninverting buffer
driving the value of DATAin onto the I/O pad. When OE is low, the output transistors or drivers, M1 and M2,
are disconnected. This allows multiple drivers to be connected on a bus. It is up to the designer to make sure
that a bus never has two drivers active at the same time, a problem known as contention.
To prevent the problem opposite to contention—a bus floating to an intermediate voltage when there are
no bus drivers—we can use a bus keeper or bus-hold cell (TI calls this Bus-Friendly logic). A bus keeper
normally acts like two weak (low drive-strength) cross-coupled inverters that act as a latch to retain the last
logic state on the bus, but the latch is weak enough that it may be driven easily to the opposite state. Even
though bus keepers act like latches, and will simulate like latches, they should not be used as latches, since their
drive strength is weak.
The three-state buffer allows us to employ the same pad for input and output (bidirectional I/O).
When we want to use the pad as an input, we set OE low and take the data from DATAin. Of course, it is not
necessary to have all these features on every pad: we can build output-only or input-only pads.
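The bus behaviour described in this section (one enabled driver, contention, and a floating bus held by a bus keeper) can be modelled with a small sketch. This is a purely logical model with illustrative names; it ignores drive strength and timing, and it flags contention whenever more than one driver is enabled.

```python
def resolve_bus(drivers, kept=None):
    """Resolve the value on a shared three-state bus.

    drivers: list of (oe, data) pairs, one per three-state buffer.
    kept:    last value held by a bus keeper, or None if there is no keeper.
    Returns (value, status) with status one of
    "driven", "contention", "held", or "floating".
    """
    active = [data for oe, data in drivers if oe]
    if len(active) > 1:
        return None, "contention"      # two drivers enabled at once
    if len(active) == 1:
        return active[0], "driven"
    if kept is not None:
        return kept, "held"            # keeper retains the last logic state
    return None, "floating"            # no driver and no keeper


if __name__ == "__main__":
    print(resolve_bus([(1, 1), (0, 0)]))
    print(resolve_bus([(1, 1), (1, 0)]))
    print(resolve_bus([(0, 1), (0, 0)], kept=1))
```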
1.9 Cell Compiler:
The process of hand crafting circuits and layout for a full-custom IC is a tedious, time-consuming, and error-
prone task.
There are two types of automated layout assembly tools, often known as silicon compilers.
1. The first type produces a specific kind of circuit, for example a RAM compiler or a multiplier compiler.
2. The second type of compiler is more flexible, usually providing a programming language
that assembles or tiles layout from an input command file, but this is full-custom IC design.
We can build a register file from latches or flip-flops, but, at 4.5–6.5 gates (18–26 transistors) per bit, this is an
expensive way to build memory. Dynamic RAM (DRAM) can use a cell with only one transistor, storing charge
on a capacitor that has to be periodically refreshed as the charge leaks away. ASIC RAM is invariably static
(SRAM), so we do not need to refresh the bits.
When we refer to RAM in an ASIC environment we almost always mean SRAM. Most ASIC RAMs use
a six-transistor cell (four transistors to form two cross-coupled inverters that form the storage loop, and two
more transistors to allow us to read from and write to the cell). RAM compilers are available that produce
single-port RAM (a single shared bus for read and write) as well as dual-port RAMs and multiport RAMs. In a
multi-port RAM the compiler may or may not handle the problem of address contention (attempts to read and
write to the same RAM address simultaneously).
RAM can be asynchronous (the read and write cycles are triggered by control and/or address transitions
asynchronous to a clock) or synchronous (using the system clock).
In addition to producing layout we also need a model compiler so that we can verify the circuit at the
behavioural level, and we need a netlist from a netlist compiler so that we can simulate the circuit and verify
that it works correctly at the structural level. Silicon compilers are thus complex pieces of software. We assume
that a silicon compiler will produce working silicon even if every configuration has not been tested. This is still
ASIC design, but now we are relying on the fact that the tool works correctly and therefore the compiled blocks
are correct by construction.