VLSI Physical Design
LinkedIn: https://www.linkedin.com/company/learnvlsi
Website: https://www.sites.google.com/view/learnvlsi
⚫ Recap on Design Flow
⚫ A quick refresh on synthesis
⚫ Introduction to Physical Design
⚫ Inputs and Outputs for Physical Design
⚫ Floor planning
⚫ Basics of Physical Cells
⚫ Power Planning
⚫ Placement of Memories (Macros)
⚫ Placement of Standard Logic Cells
⚫ Placement of IOs
⚫ Routing
⚫ Optimizations during backend flows
⚫ Layout (GDS2) creation
⚫ Chip finishing (DRC / LVS / Timing Sign-off)
⚫ Tools used for Physical Design
⚫ References
Reference: Book “The VLSI Handbook: Design Principles, Industry and Career Perspectives”, Udit Kumar, Aditya Gupta, Sumit Soman
SoC Design Flow
[Figure: SoC design flow; surviving step labels: report timing, write netlist]
What is a library
⚫ A library is a collection of cells used to implement a design
⚫ Cells in a library
Standard Cell Library
Logic Gates
– Combinational Gates – NAND, NOR, XOR, INV, BUF
– Sequential Gates – Flop, Latches, Clock Gating Cells
IO Cell Library
Power IOs
Digital IOs
Analog IOs
Special IOs
Macro Cell Library
Memory Cells
Hard Macro Cells
Physical Cell Library
Fillers, DECAP cells
TAP Cells, TIE Cells
Different Views of Library Cells
⚫ Different views of libraries
Timing views : Timing models
NLDM (Non-Linear Delay Model)
CCS (Composite Current Source)
ECSM (Effective Current Source Model)
Abstract views : Basic layout of cells
Layout views : Detailed layout of cells
Behavior views : Functional behavior description
⚫ Different Options of cells
Multiple VT cells
HVT, SVT, LVT
Multiple Drive Strengths
Low drive cells
Medium drive cells
High drive cells
Synthesis Optimizations
⚫ Timing based optimization
Maximize performance
⚫ Area based optimization
Minimize area
⚫ Power based optimization
Minimize Power
⚫ Different Timing Paths
Reg to Reg Path
In to Reg Path
Reg to Out Path
In to Out Path
[Figure: two flops between Input and Output sharing a Clock; the Launch Path and Capture Path branch from the Clock Net, and an In to Out path bypasses the flops]
Reg to Reg Path
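[Figure: Reg to Reg path. Clock-to-Q delay of the launch flop (tcq), combinational delay (tcombo) and setup time of the capture flop (tsu); the Launch Path and Capture Path branch from the Clock Net]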
Timing Requirement
⚫ Basic Timing Requirement
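[Figure: Data launched from one flop and captured by the next (Data Out), both driven by Clock; waveforms of the Launch Clock and Capture Clock mark the Min Delay and Max Delay of the data]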
In-Out, In-Reg and Reg-Out Path
[Figure: chain of flops driven by Clock, marking the In to Reg path (Input to first flop), the Reg to Out path (last flop to Output) and the In to Out path (Input directly to Output)]
Synthesis Optimizations - Power
⚫ Static Power
Leakage Power
The power consumed by the devices whenever the design is powered on, even when no signal is switching.
⚫ Dynamic Power
Internal Power
The power consumed inside the gates when the inputs are changing,
even if the outputs are not changing
Switching Power
The power consumed by the gates when the inputs are changing
and the outputs are also changing, charging and discharging the output load
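A rough way to put numbers on these power components is sketched below. It assumes the common first-order models: output switching power as activity x C x V^2 x f, and leakage as supply voltage times leakage current. The function names and all numeric values are illustrative assumptions, not figures from the slides.

```python
def switching_power_w(activity: float, cap_f: float, vdd_v: float, freq_hz: float) -> float:
    """First-order switching power: activity * C * V^2 * f.

    activity is assumed to be the fraction of clock cycles in which the
    switched capacitance is charged from the supply (a 0 -> 1 transition).
    """
    return activity * cap_f * vdd_v ** 2 * freq_hz


def leakage_power_w(vdd_v: float, leakage_current_a: float) -> float:
    """Static power drawn while the design is powered on but idle."""
    return vdd_v * leakage_current_a


# Hypothetical block: 1 nF of switched capacitance, 1.0 V supply, 1 GHz clock.
dynamic = switching_power_w(activity=0.1, cap_f=1e-9, vdd_v=1.0, freq_hz=1e9)
static = leakage_power_w(vdd_v=1.0, leakage_current_a=5e-3)
print(f"switching: {dynamic:.3f} W, leakage: {static:.3f} W")
```

The quadratic dependence on supply voltage is why lowering VDD is such an effective lever on dynamic power, while the choice of HVT or LVT cells mainly trades leakage against speed.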
PG Nets NA NA NA No
Clock Nets No No No No
Signal Nets Yes No No No
Introduction to Physical Design
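Truth Table
A B S Z
1 0 0 1
1 0 1 0
Logic Equation
Z = S'.A + S.B (S' is the complement of S)
[Figure: logic diagram of the 2:1 mux with inputs A and B, select S and output Z]
NAND Representation of 2:1 Mux
[Figure: the same mux built from NAND gates]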
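The equivalence between the mux equation and a NAND-only implementation can be checked exhaustively. The sketch below assumes the usual four-NAND construction (one NAND wired as an inverter for the select) and the truth-table convention that S = 0 selects A; the slide's own gate-level drawing did not survive, so the exact gate arrangement here is an assumption.

```python
from itertools import product


def nand(x: int, y: int) -> int:
    """2-input NAND on 0/1 values."""
    return 1 - (x & y)


def mux_equation(a: int, b: int, s: int) -> int:
    """Z = (NOT S).A + S.B, so S = 0 selects A and S = 1 selects B."""
    return ((1 - s) & a) | (s & b)


def mux_nand(a: int, b: int, s: int) -> int:
    """The same function using only NAND gates."""
    s_n = nand(s, s)                         # NAND with tied inputs acts as an inverter
    return nand(nand(a, s_n), nand(b, s))    # NAND of NANDs gives the OR of the two AND terms


# Compare both forms over all 8 input combinations.
for a, b, s in product((0, 1), repeat=3):
    assert mux_nand(a, b, s) == mux_equation(a, b, s)
```

Standard cell libraries favour NAND/NOR forms like this because they map directly onto compact CMOS structures.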
A Sequential Design
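[Figure: the 2:1 mux (A, B, S, Z) surrounded by flops: M1, M2 and MSel are registered on the input side, Z is registered to MOut, and all flops are clocked by MClk]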
CMOS Inverter
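[Figure: CMOS inverter with input A and output Z. One transistor with N source/drain regions sits in the P-well, the other with P source/drain regions sits in the N-well; each has Source, Gate and Drain labelled, between VDD and VSS]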
CMOS Inverter
[Figure: the same CMOS inverter redrawn step by step, labelling Source, Gate and Drain of the N and P devices, the P-well and N-well, input A, output Z, VDD and VSS]
CMOS Inverter Layout
[Figure: layout views of gates with inputs A, B and output Z, showing the Source, Gate and Drain regions placed between the VDD and VSS rails]
MUX Layout
[Figure: MUX layout with inputs A and B, Source and Drain regions, and VDD and VSS rails]
Need for Floorplan
[Figure: house floor plan analogy with Bed Room 1, Bed Room 2, Kitchen, Living Room, Entrance and Exit]
Number of Flops in a SoC
Floorplanning
Inputs: Library Abstract Views, Library Timing Views, Technology Information
Output: Floorplan DEF
Macros Yes Yes No No
PG Nets NA NA NA No
Clock Nets No No No No
Signal Nets Yes No No No
Why floorplanning is required
[Figure: house plan (Entrance, Hall, Dining Hall, Kitchen, Bath, Toilet, Bedroom 1, Bedroom 2, Exit) shown beside a chip floorplan (Logic Gates, Memories, IOs); label: 3 Sites]
Basics of Physical Cells
⚫ Physical cells are cells used during the physical
implementation of a circuit to protect the circuit or to meet a
physical need of the circuit.
⚫ A physical cell has no logic function
⚫ List of Physical Cells
Well Tap Cell
Endcap cell
Decap Cell
Tie Cell
Antenna Cell
WellTap Cell
⚫ Well Tap Cell
Well tap cells are used to prevent latch-up in CMOS circuits.
The supply is connected to the n-well of the well tap cell
Ground is connected to the p-well of the well tap cell
[Figure: well tap cell with its Supply and Ground connections]
Latchup
[Figure: cross-section of a CMOS inverter (input A, output Z, VDD, VSS) on a P-substrate with an N-well, showing the P+ and N+ bias contacts, the source and drain diffusions, and the parasitic resistances R-Well and R-sub]
Latchup (cont.)
[Figure: parasitic bipolar transistor pair (terminals E, B, C) between VDD and VSS, with V in applied at the gate]
[Figure: two flops (Data, Data Out) driven by Clock through the Clock Net]
Power Planning
⚫ Power planning is an essential step.
⚫ It is the flow step where the power network is created to supply
power to standard logic cells, IO cells and macros
⚫ Poor power planning leads to excessive voltage drop (IR drop) and electromigration.
[Figure: VDD distribution from the power PAD]
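To see why the power network matters, IR drop along a single rail can be estimated with Ohm's law. The sketch below assumes a rail fed from a pad at one end, with cells tapping current along it; every segment carries the current of all taps downstream of it. The function name and the resistance and current values are illustrative assumptions.

```python
def rail_voltages(vdd_v, segment_res_ohm, tap_currents_a):
    """Voltage seen at each tap of a rail fed from one end.

    segment_res_ohm[i] is the resistance of the rail segment feeding tap i;
    tap_currents_a[i] is the current drawn by the cells at tap i.
    """
    voltages = []
    v = vdd_v
    for i, r in enumerate(segment_res_ohm):
        through_segment = sum(tap_currents_a[i:])  # all downstream taps share this segment
        v -= r * through_segment                   # V = I * R drop across the segment
        voltages.append(v)
    return voltages


# Hypothetical rail: three 0.1 ohm segments, each tap drawing 10 mA from a 1.0 V pad.
taps = rail_voltages(1.0, [0.1, 0.1, 0.1], [0.01, 0.01, 0.01])
worst_drop_mv = (1.0 - min(taps)) * 1000
print(f"worst-case IR drop: {worst_drop_mv:.1f} mV")
```

The tap farthest from the pad sees the largest drop, which is the motivation for feeding the rail from rings and stripes on several sides instead of from one end.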
Electromigration
⚫ Electromigration (EM) is the gradual displacement of the metal atoms of a
conductor caused by the current flowing through the
conductor.
⚫ It can cause an open on a wire or a short with an adjacent wire.
⚫ Even without an open or a short, EM can change the RC
values of the wire
Reference : https://www.synopsys.com/glossary/what-is-electromigration.html
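EM risk is usually screened by comparing the current density in a wire with a limit from the technology rules. The sketch below assumes a simple rectangular wire cross-section and a single hypothetical limit; real limits depend on the layer, temperature and current type, and none of the numbers here come from the slides.

```python
def current_density_ma_per_um2(current_ma: float, width_um: float, thickness_um: float) -> float:
    """Average current density through a rectangular wire cross-section."""
    return current_ma / (width_um * thickness_um)


def min_width_um(current_ma: float, thickness_um: float, j_max_ma_per_um2: float) -> float:
    """Narrowest wire that keeps the density at or below the limit."""
    return current_ma / (thickness_um * j_max_ma_per_um2)


J_MAX = 10.0  # hypothetical EM limit in mA/um^2, for illustration only

j = current_density_ma_per_um2(current_ma=2.0, width_um=0.5, thickness_um=0.2)
if j > J_MAX:
    print("EM violation, widen the wire to at least",
          min_width_um(2.0, 0.2, J_MAX), "um")
```

Widening the wire or splitting the current over parallel stripes both reduce the density, which is one reason power stripes are drawn wide.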
Addressing IR Drop & Electromigration
⚫ How is the Power Distribution Network created?
PG Ring
PG Stripes
[Figure: VDD and VSS rings around the core with stripes across it]
Placement
⚫ Placement is the stage of the physical design flow where each
instance is given an exact location.
⚫ All the gates and IO cells are placed in the rows created
during floorplanning.
⚫ Cell placement is timing-aware and physically aware.
Placement
Special Route Creation
Macro Placement
⚫ Macros are big blocks that need to
be placed, just like standard cells (gates)
⚫ Macros such as memories can be
moved to the boundary area (edges).
Memories are huge and have many
input and output pins.
Placing them in the middle reduces the
resources available to route the logic gates
and increases congestion, as they have
many IO pins
[Figure: floorplan with macros along the edges: Memories, PLL, Clock Divider Circuit, Flash, Analog IPs, Mem]
Reference : https://www.eng.biu.ac.il/temanad/files/2018/12/Lecture-7-Placement.pdf
Placement Regions
⚫ Sometimes, due to design requirements or placement congestion, we
guide the tool to place certain logic in a certain region.
Reference : https://www.eng.biu.ac.il/temanad/files/2018/12/Lecture-7-Placement.pdf
Placement Blockage and Halo
⚫ Placement blockages and halos are created so that no cells
are placed in a given region.
Hard Blockage – No cells can be placed in the defined region
Soft Blockage – The region cannot be used during placement but can be
used during optimization.
Partial Blockage – A region where only a limited density of cells can be placed.
Halo – A region around a macro that holds no cells and is used only
for routing the macro signals.
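The blockage rules above can be written as a small legality check. This is only a sketch of the semantics described on the slide: the `Rect`, `Blockage`, `can_place` and `halo` names are hypothetical, a halo is modeled as a hard blockage grown around the macro, and a partial blockage is reduced to a single density threshold.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    x0: float
    y0: float
    x1: float
    y1: float

    def overlaps(self, other: "Rect") -> bool:
        """True when the two rectangles share some area (touching edges do not count)."""
        return (self.x0 < other.x1 and other.x0 < self.x1
                and self.y0 < other.y1 and other.y0 < self.y1)


@dataclass(frozen=True)
class Blockage:
    area: Rect
    kind: str                 # "hard", "soft" or "partial"
    max_density: float = 0.0  # only meaningful for "partial"


def can_place(cell: Rect, blockages, stage: str, region_density: float = 0.0) -> bool:
    """stage is "placement" or "optimization"; region_density is the current cell density."""
    for b in blockages:
        if not cell.overlaps(b.area):
            continue
        if b.kind == "hard":
            return False                      # never usable
        if b.kind == "soft" and stage == "placement":
            return False                      # usable only by optimization
        if b.kind == "partial" and region_density >= b.max_density:
            return False                      # region already at its density cap
    return True


def halo(macro: Rect, margin: float) -> Blockage:
    """Keep-out ring around a macro, modeled here as a hard blockage."""
    return Blockage(Rect(macro.x0 - margin, macro.y0 - margin,
                         macro.x1 + margin, macro.y1 + margin), "hard")
```

For example, a buffer inserted during optimization may land inside a soft blockage, while the same location is rejected during initial placement.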
IO Cell Placement
⚫ Types of IO Cells
Digital IO Cells, e.g. MFIOs
Analog IO Cells
Power IO Cells, e.g. supply IOs
Special IO Cells, e.g. LVDS
⚫ IO Row creation
Similar to standard cell rows, IO rows are created to place IO cells
[Figure: IO Row]
Output: Floorplan DEF
PG Nets NA NA NA Partial
Clock Nets No No No No
Signal Nets Yes No No No
Clock Tree Synthesis (CTS)
⚫ Where we are now:
The RTL is synthesized to a gate-level netlist.
The floorplan for the design is completed.
Every gate is placed on a site.
⚫ Why do we need CTS?
Clock nets are considered ideal (0 delay) during synthesis.
Physical nets for the clocks are created during CTS.
During CTS, we ensure that all the flops of a clock group receive
the real clock.
CTS is timing-aware and requires timing constraints to implement the
clock tree.
Clock Tree Synthesis (CTS)
⚫ Why can't we route the clock net like any other signal net?
Clock nets have a very high number of sinks (they drive many flops).
Clock nets run throughout the block.
They impact timing, power, area, etc.
Different Clock Parameters
⚫ Clock Skew
Difference in clock arrival time at two different flops
⚫ Clock Jitter
Variation in the clock period from one cycle to the next
[Figure: clock waveforms at flops Q1 and Q2. Skew is the offset between the clock arrival at Q1 and at Q2; jitter is shown on consecutive periods t1 and t2, with t1 = t2 for the ideal clock arrival at both flops; each effect results in a reduced timing path]
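Both parameters can be computed directly from measured times. The sketch below assumes skew is taken as the largest arrival difference across a group of flops and jitter as the spread of consecutive periods at one point; the function names and the picosecond values are illustrative assumptions.

```python
def clock_skew_ps(arrival_ps: dict) -> int:
    """Largest difference in clock arrival time between any two flops."""
    return max(arrival_ps.values()) - min(arrival_ps.values())


def period_jitter_ps(edge_times_ps: list) -> int:
    """Spread between the longest and shortest period seen at one clock pin."""
    periods = [later - earlier for earlier, later in zip(edge_times_ps, edge_times_ps[1:])]
    return max(periods) - min(periods)


# Hypothetical arrival times of the same clock edge at two flops.
skew = clock_skew_ps({"Q1": 120, "Q2": 135})

# Hypothetical rising-edge timestamps at one flop for a nominal 1000 ps clock.
jitter = period_jitter_ps([0, 1000, 2010, 3000])

print(f"skew = {skew} ps, jitter = {jitter} ps")
```

Skew compares two places at the same moment, while jitter compares two moments at the same place.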
Why do clock skew and jitter arise
⚫ Clock Generation
⚫ Clock Distribution Network
Cells in clock network
Variation in transistors in
clock network
Wire length
Coupling effects
Load effects
⚫ Environment Variation
Temperature
Supply Voltage
What do Clock Skew and Jitter cause
⚫ Clock skew and jitter reduce the timing margins
[Figure: Launch Clock and Capture Clock waveforms, first aligned and then with +ve skew on the capture clock]
Impact on Power
⚫ Clock networks are huge and are responsible for a large portion
of total chip power
Impact on Area
⚫ All the clock elements, such as clock generation cells, clock
path cells and clock nets, consume a large area because the clock cells
are spread across the chip.
⚫ Clock nets consume a large amount of routing resources
They require low RC for good transition and power
⚫ Clock nets need shielding to avoid noise on the clock network
Impact on Signal Integrity
⚫ Noise on the clock network can cause:
Additional clock edges, in the worst case
Deteriorated clock propagation due to coupling with neighbouring nets
Functional failures due to irregular clock edges
⚫ Slow clock transition
Susceptible to noise
Poor flop performance: degrades tcq, tsu, th
⚫ Fast clock transitions
Overdesign impacting area and power
Act as aggressors to other signals
⚫ Unbalanced drivers lead to increased skew
Building a Clock Tree
Requirement
⚫ Connect all the clock tree elements (sinks), to the respective
clock network so as to minimize
Clock Skew
Insertion Delay
Wirelength
Noise and Coupling effects
⚫ Challenge
Synchronize millions of separate clock elements within a time
scale of ~10 ps
across a spanning distance of 2-4 cm
Approaches to CTS
⚫ Clock Tree
⚫ Clock Mesh
Clock Trees
⚫ Routing the clock net like a signal net
Route the net to each sink and balance
RC
Consumes too much power
Large RC of the net causes signal integrity
issues
⚫ Standard Approach:
Try to build a balanced tree
Clock tree elements (sinks) are not
distributed evenly
[Figure: clock tree distributed from the PLL]
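One simple way to get a balanced tree is to split the sinks recursively by position, always to the same depth, so that every sink sits behind the same number of buffer levels. The sketch below illustrates only that idea; it is not how any particular CTS tool works, it ignores wire RC, and the function names, the (name, x, y) sink format and the fanout limit are assumptions.

```python
def build_balanced_tree(sinks, max_fanout=4):
    """Return a nested dict tree with all leaves at the same depth.

    sinks is a list of (name, x, y) tuples; each leaf drives at most
    max_fanout sinks.
    """
    if max_fanout < 2:
        raise ValueError("max_fanout must be at least 2")
    if not sinks:
        raise ValueError("need at least one sink")
    levels = 0
    while max_fanout * 2 ** levels < len(sinks):  # levels needed to respect the fanout
        levels += 1
    return _split(sinks, levels, axis=1)


def _centroid(sinks):
    n = len(sinks)
    return (sum(s[1] for s in sinks) / n, sum(s[2] for s in sinks) / n)


def _split(sinks, levels, axis):
    node = {"buffer_at": _centroid(sinks)}        # buffer placed at the group centre
    if levels == 0:
        node["sinks"] = [s[0] for s in sinks]
        return node
    ordered = sorted(sinks, key=lambda s: s[axis])
    mid = len(ordered) // 2                       # median cut, alternating x and y
    other = 2 if axis == 1 else 1
    node["children"] = [_split(ordered[:mid], levels - 1, other),
                        _split(ordered[mid:], levels - 1, other)]
    return node


def leaf_depths(node, depth=0):
    """Depth of every leaf buffer, used to confirm the tree is level-balanced."""
    if "sinks" in node:
        return [depth]
    return [d for child in node["children"] for d in leaf_depths(child, depth + 1)]


flops = [(f"ff{i}", i % 5, i // 5) for i in range(10)]  # hypothetical flop locations
tree = build_balanced_tree(flops, max_fanout=3)
print(sorted(set(leaf_depths(tree))))
```

Because the sinks are not evenly distributed in a real design, equal buffer depth alone does not give equal delay; the wire lengths and loads on each branch still have to be matched.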
Clock Concurrent Optimization
What is the main requirement?
Skew minimization and reducing insertion delay
or
Meeting timing (+ DRV constraints)
⚫ Hence:
Pre-route the clock nets during CTS
Use higher and thicker metals for clock routing
They offer low resistance
They offer less capacitance to the substrate
Apply shielding to clock nets
Consider adding DECAPs close to clock buffers
Post CTS Optimizations
⚫ Delay cell insertion
⚫ Sizing of cells in the clock tree
⚫ Buffer relocation
⚫ Useful skew
⚫ Gate relocation
⚫ Buffer resizing
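Useful skew is the least intuitive item in this list, so a small worked example may help. The sketch assumes three flops in a chain (F1 to F2 to F3), ideal clocks apart from the latencies shown, and a path delay that already includes tcq, tcombo and tsu; all names and numbers are hypothetical.

```python
def setup_slack_ps(period, path_delay, launch_latency, capture_latency):
    """Setup slack of one path; path_delay is taken as tcq + tcombo + tsu."""
    return period + capture_latency - launch_latency - path_delay


def all_slacks(period, paths, latency):
    """Setup slack of every (launch, capture) path for given clock latencies."""
    return {edge: setup_slack_ps(period, delay, latency[edge[0]], latency[edge[1]])
            for edge, delay in paths.items()}


PERIOD = 1000                                         # ps
PATHS = {("F1", "F2"): 1050, ("F2", "F3"): 800}       # ps, one slow path and one fast path

before = all_slacks(PERIOD, PATHS, {"F1": 0, "F2": 0, "F3": 0})

# Delay the clock of F2 by 60 ps: F1->F2 gains time, F2->F3 gives some up.
after = all_slacks(PERIOD, PATHS, {"F1": 0, "F2": 60, "F3": 0})

print(before)
print(after)
```

The slow path borrows margin from the fast path that follows it. The hold checks on the affected paths have to be re-verified after such a change, since a later capture clock makes hold harder to meet.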
PG Nets NA NA NA Partial
[Figure: routing tracks over a cell layout; surviving labels: A, Track 4, Source, Drain]
PG Nets NA NA NA Yes
Reference : Signoffsemi.com
Timing Sign-Off
⚫ Best Case – Worst Case Timing
Setup Checks :
Data Path with max delay
Clock Path with min delay
Hold Checks :
Data Path with min delay
Clock Path with max delay
[Figure labels: 1 Sec, Open, Closed]
[Figure: Reg to Reg path with tcq, tcombo and tsu; the Launch Path and Capture Path branch from the Clock Net]
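The best case / worst case pairing above can be expressed as two slack formulas. The sketch below assumes each delay is known as a (min, max) pair, that the data path delay covers tcq plus tcombo, and that the "clock path" in the checks refers to the capture clock; the `PathDelays` type and all values are illustrative.

```python
from dataclasses import dataclass


@dataclass
class PathDelays:
    launch_clk_min: float
    launch_clk_max: float
    data_min: float        # fastest tcq + tcombo
    data_max: float        # slowest tcq + tcombo
    capture_clk_min: float
    capture_clk_max: float


def setup_slack(p: PathDelays, period: float, tsu: float) -> float:
    """Latest data arrival against the earliest capture edge of the next cycle."""
    arrival = p.launch_clk_max + p.data_max
    required = period + p.capture_clk_min - tsu
    return required - arrival


def hold_slack(p: PathDelays, th: float) -> float:
    """Earliest data arrival against the latest capture edge of the same cycle."""
    arrival = p.launch_clk_min + p.data_min
    required = p.capture_clk_max + th
    return arrival - required


# Hypothetical delays in ps.
path = PathDelays(launch_clk_min=100, launch_clk_max=120,
                  data_min=300, data_max=700,
                  capture_clk_min=90, capture_clk_max=115)
print("setup slack:", setup_slack(path, period=1000, tsu=50))
print("hold slack:", hold_slack(path, th=30))
```

A negative setup slack can be fixed by slowing the clock, but a negative hold slack is independent of the clock period and must be fixed in the design.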
On Chip Variation
Process/Voltage/Temperature (PVT) variation
can affect different parts of the timing path in
opposite directions
Model the worst possible scenarios with derates
Setup Checks :
Data Path with max delay + derate factor
Clock Launch Path with max delay + derate factor
Clock Capture Path with min delay – derate factor
Hold Checks :
Data Path with min delay – derate factor
Clock Launch Path with min delay – derate factor
Clock Capture Path with max delay + derate factor
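The effect of derating on slack can be sketched as follows. This assumes multiplicative derate factors (a "late" factor above 1 and an "early" factor below 1) applied to a single nominal delay per path segment, which is one common way to apply derates; the slide's "+/- derate factor" wording may equally describe an additive margin. Names and numbers are hypothetical.

```python
def ocv_setup_slack(period, tsu, launch_clk, data, capture_clk, late=1.1, early=0.9):
    """Setup check: launch clock and data made late, capture clock made early."""
    arrival = (launch_clk + data) * late
    required = period + capture_clk * early - tsu
    return required - arrival


def ocv_hold_slack(th, launch_clk, data, capture_clk, late=1.1, early=0.9):
    """Hold check: launch clock and data made early, capture clock made late."""
    arrival = (launch_clk + data) * early
    required = capture_clk * late + th
    return arrival - required


# Hypothetical nominal delays in ps.
nominal = ocv_setup_slack(1000, 50, launch_clk=100, data=600, capture_clk=100,
                          late=1.0, early=1.0)
derated = ocv_setup_slack(1000, 50, launch_clk=100, data=600, capture_clk=100)
print(f"setup slack without derates: {nominal:.0f} ps, with derates: {derated:.0f} ps")
```

Derates always reduce the reported slack, which is the pessimism that the following slides set out to limit.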
Timing – Pessimism / Optimism
⚫ If you are too optimistic, your chip may not work
⚫ If you are too pessimistic, it is painful for timing closure
Time-to-market increases
Performance is hindered
Less efficient in performance, power and area
[Figure: two flops whose clocks share a common Clock Net before it splits into the Launch Path and the Capture Path]
Setup Checks :
Data Path with max delay + derate factor
Clock Launch Path with max delay + derate factor
Clock Capture Path with min delay – derate factor
Hold Checks :
Data Path with min delay – derate factor
Clock Launch Path with min delay – derate factor
Clock Capture Path with max delay + derate factor
⚫ Applying CRPR (Clock Reconvergence Pessimism Removal) limits the pessimism
of OCV
⚫ This removes the derating from the
clock path shared by both launch and
capture paths
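The pessimism that CRPR removes can be quantified with a short sketch. It assumes multiplicative late and early derates and splits each clock path into a common segment plus a private branch; the shared segment cannot be both late and early at once, so the difference is credited back to the slack. All names and values are hypothetical.

```python
def crpr_credit(common_clk, late, early):
    """Pessimism introduced by derating the shared clock segment both ways."""
    return common_clk * (late - early)


def setup_slack_ocv(period, tsu, common_clk, launch_branch, data, capture_branch,
                    late=1.1, early=0.9, apply_crpr=True):
    """Setup slack under OCV, optionally with the CRPR credit added back."""
    arrival = (common_clk + launch_branch + data) * late
    required = period + (common_clk + capture_branch) * early - tsu
    slack = required - arrival
    if apply_crpr:
        slack += crpr_credit(common_clk, late, early)
    return slack


# Hypothetical delays in ps: 80 ps of clock path is shared by launch and capture.
args = dict(period=1000, tsu=50, common_clk=80, launch_branch=20, data=600, capture_branch=20)
print("without CRPR:", round(setup_slack_ocv(**args, apply_crpr=False), 3))
print("with CRPR:   ", round(setup_slack_ocv(**args, apply_crpr=True), 3))
```

The longer the common clock segment, the larger the credit, so paths whose flops share most of the clock tree benefit the most.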
Advanced on-chip variation (AOCV)
Reference : https://vlsi-soc.blogspot.com/