
Lecture18 Pitfalls

The document discusses sources of variation in CMOS VLSI design, such as process, voltage, temperature, and aging effects, and outlines reliability issues such as soft errors from radiation and hard errors from electromigration, hot carriers, and dielectric breakdown. It also covers circuit design techniques that mitigate these variations and reliability issues, such as guard rings, error correction, and avoiding threshold voltage drops in circuits.

Uploaded by

vin ad

CMOS VLSI Design

Lecture 18: Variation and Reliability

Learning Objectives
At the end of this lecture, you should be able to:
• Describe sources and effects of on-chip variation due to process, voltage, temperature and aging.
• Outline the major sources of on-chip noise.
• Outline the differences between soft and hard errors.

2 © 2020 Arm Limited


Variation
• Process
• Threshold
• Channel length
• Interconnect dimensions
• Environment
• Voltage
• Temperature
• Aging/Wearout



Process Variation
• Threshold Voltage
• Depends on placement of dopants in channel
• Standard deviation inversely proportional to the square root of channel area

• Channel Length [Bernstein06]

• Systematic across-chip linewidth variation (ACLV)


• Random line edge roughness (LER)

• Interconnect
• Etching variations affect w, s, h

Courtesy Texas Instruments
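
The dependence of threshold-voltage spread on transistor size is commonly described by Pelgrom's model, sigma_Vt = A_Vt / sqrt(W * L). The sketch below illustrates that scaling; the function name, the matching coefficient `a_vt_mv_um`, and the transistor dimensions are illustrative assumptions, not values from the lecture.

```python
import math


def sigma_vt_mv(width_um: float, length_um: float, a_vt_mv_um: float = 3.0) -> float:
    """Standard deviation of Vt (in mV) from Pelgrom's model.

    a_vt_mv_um is an ASSUMED matching coefficient in mV*um;
    real values are process-specific.
    """
    if width_um <= 0 or length_um <= 0:
        raise ValueError("dimensions must be positive")
    return a_vt_mv_um / math.sqrt(width_um * length_um)


small = sigma_vt_mv(0.1, 0.05)
large = sigma_vt_mv(0.4, 0.2)  # 16x the channel area of `small`
print(f"small device: {small:.1f} mV, large device: {large:.1f} mV")
```

Under this model a 16x larger channel area reduces the spread by 4x, which is why matched analog devices are drawn large.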



Spatial Distribution
• Variations show spatial correlation
• Lot-to-lot (L2L)
• Wafer-to-wafer (W2W)
• Die-to-die (D2D)/inter-die
• Within-die (WID)/intradie
• Closer transistors match better

Courtesy M. Pelgrom



Environmental Variation
• Voltage
• VDD is usually designed with a ±10% tolerance
• Regulator error
• On-chip droop from switching activity
• Temperature
• Ambient temperature ranges
• On-die temperature elevated by chip power consumption

Courtesy IBM

[Harris01b]



Aging
• Transistors change over time as they wear out
• Hot carriers
• Negative bias temperature instability
• Time-dependent dielectric breakdown
• Causes threshold voltage changes
• More on this later…



Process Corners
• Model extremes of process variations in simulation
• Corners
• Typical (T)
• Fast (F)
• Slow (S)
• Factors
• nMOS speed
• pMOS speed
• Wire
• Voltage
• Temperature
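
As a rough illustration of how the corners and factors above combine, the sketch below enumerates every corner name by assigning one letter per factor. The letter ordering (nMOS, pMOS, wire, voltage, temperature) and the names `FACTORS`, `LEVELS`, and `all_corners` are assumptions made for this example.

```python
from itertools import product

# Assumed ordering of the letters in a corner name.
FACTORS = ("nMOS", "pMOS", "wire", "voltage", "temperature")
LEVELS = ("T", "F", "S")  # typical, fast, slow


def all_corners() -> list[str]:
    """Return every corner name, one letter per factor."""
    return ["".join(levels) for levels in product(LEVELS, repeat=len(FACTORS))]


corners = all_corners()
print(len(corners), corners[0], corners[-1])
```

In practice only a handful of these combinations are simulated, each chosen to stress a particular specification.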



Corner Checks
• Circuits are simulated in different corners to verify different performance and correctness specifications



Monte Carlo Simulation
• As process variation increases, the worst-case corners become too pessimistic for practical design
• Monte Carlo: repeated simulations with parameters randomly varied each time
• Look at scatter plot of results to predict yield
• E.g., impact of Vt variation
• ON-current
• leakage
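
A minimal Monte Carlo sketch of the Vt example above follows. It draws Vt from a normal distribution and evaluates normalized ON-current and leakage with two simplified device models (an alpha-power law and an exponential subthreshold characteristic). All parameter values, the pass/fail limits, and the function names are assumptions chosen for illustration; a real flow would run a circuit simulator in each trial.

```python
import random


def simulate_vt_variation(n_trials=10_000, vt_nominal=0.30, sigma_vt=0.03,
                          vdd=1.0, alpha=1.3, subthreshold_slope=0.100, seed=0):
    """Return a list of (i_on, i_off) pairs normalized to the nominal device."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_trials):
        vt = rng.gauss(vt_nominal, sigma_vt)
        # Assumed alpha-power law: I_on ~ (VDD - Vt)^alpha.
        i_on = (max(vdd - vt, 0.0) / (vdd - vt_nominal)) ** alpha
        # Assumed subthreshold model: 10x leakage per `subthreshold_slope` volts.
        i_off = 10 ** ((vt_nominal - vt) / subthreshold_slope)
        results.append((i_on, i_off))
    return results


def yield_fraction(results, min_i_on=0.9, max_i_off=5.0):
    """Fraction of trials that are both fast enough and not too leaky."""
    passing = sum(1 for i_on, i_off in results
                  if i_on >= min_i_on and i_off <= max_i_off)
    return passing / len(results)


samples = simulate_vt_variation()
print(f"predicted yield: {yield_fraction(samples):.3f}")
```

Plotting `i_on` against `i_off` for the samples gives the kind of scatter plot the slide refers to: low-Vt devices are fast but leaky, high-Vt devices are slow but low-leakage.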



Noise
• Sources
• Power supply noise/ground bounce
• Capacitive coupling
• Charge sharing
• Leakage
• Noise feedthrough
• Consequences
• Increased delay (for noise to settle out)
• Or incorrect computations



Reliability
• Hard Errors
• Oxide wearout
• Interconnect wearout
• Overvoltage failure
• Latchup
• Soft Errors
• Characterizing reliability
• Mean time between failures (MTBF)
– # of devices × hours of operation / number of failures
• Failures in time (FIT)
– # of failures / thousand hours / million devices
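
The two reliability metrics above can be worked through numerically. The sketch below implements the definitions as stated on the slide; the function names and the sample numbers are hypothetical.

```python
def mtbf_hours(num_devices: int, hours_of_operation: float, num_failures: int) -> float:
    """MTBF = (# of devices x hours of operation) / number of failures."""
    if num_failures <= 0:
        raise ValueError("need at least one failure to estimate MTBF")
    return num_devices * hours_of_operation / num_failures


def fit_rate(num_devices: int, hours_of_operation: float, num_failures: int) -> float:
    """FIT = failures per thousand hours per million devices,
    i.e. failures per 1e9 device-hours."""
    return num_failures * 1e9 / (num_devices * hours_of_operation)


# Hypothetical test: 1000 devices run for 1000 hours with 2 failures.
print(mtbf_hours(1000, 1000, 2))  # 500000.0 hours
print(fit_rate(1000, 1000, 2))    # 2000.0 FIT
```

The two metrics are reciprocal views of the same data: FIT = 1e9 / MTBF when MTBF is expressed in hours.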



Accelerated Lifetime Testing
• Expected reliability typically exceeds 10 years
• But products come to market in 1-2 years
• Accelerated lifetime testing required to predict adequate long-term reliability

[Arnaud08]



Hot Carriers
• Electric fields across channel impart high energies to some carriers
• These “hot” carriers may be blasted into the gate oxide where they become trapped
• Accumulation of charge in oxide causes shift in Vt over time
• Eventually, Vt shifts too far for devices to operate correctly
• Choose VDD to achieve reasonable product lifetime
• Worst problems for inverters and NORs with slow input risetime and long propagation delays



NBTI
• Negative bias temperature instability
• Electric field applied across oxide forms dangling bonds called traps at Si-SiO2 interface
• Accumulation of traps causes Vt shift
• Most pronounced for pMOS transistors with strong negative bias (Vg = 0; Vs = VDD) at high temperature



TDDB
• Time-dependent dielectric breakdown
• Gradual increase in gate leakage when an electric field is applied across an oxide
• aka stress-induced leakage current
• For 10-year life at 125 °C, keep Eox below ~0.7 V/nm
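
The field limit above translates into a maximum gate voltage for a given oxide thickness using the first-order approximation Eox = V / tox. The sketch below applies it; the oxide thickness in the example and the function name are assumptions, and the approximation ignores second-order effects.

```python
def max_gate_voltage(t_ox_nm: float, e_ox_limit_v_per_nm: float = 0.7) -> float:
    """Largest gate voltage (V) keeping Eox = V / tox under the limit.

    First-order estimate only; t_ox_nm is an assumed electrical thickness.
    """
    if t_ox_nm <= 0:
        raise ValueError("oxide thickness must be positive")
    return e_ox_limit_v_per_nm * t_ox_nm


print(f"{max_gate_voltage(1.5):.2f} V")  # hypothetical 1.5 nm oxide
```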



Electromigration
• “Electron wind” causes movement of metal atoms along wires
• Excessive electromigration leads to open circuits
• Most significant for unidirectional (DC) current
• Depends on current density Jdc (current/area)
• Exponential dependence on temperature

• Black’s Equation:

$$\mathrm{MTTF} \propto \frac{e^{E_a / kT}}{J_{dc}^{\,n}}$$

• Typical limits: Jdc < 1 – 2 mA / μm²

[Christiansen06]
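
Because Black's equation is a proportionality, it is most useful for comparing two operating conditions. The sketch below computes the ratio of lifetimes between a reference and a new condition. The activation energy and current exponent defaults are assumed illustrative values (both are technology-dependent), and the function name is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def mttf_ratio(j_ref: float, temp_ref_c: float, j_new: float, temp_new_c: float,
               activation_energy_ev: float = 0.7, n: float = 2.0) -> float:
    """Return MTTF(new) / MTTF(ref) under Black's equation.

    activation_energy_ev and n are ASSUMED values for illustration.
    Current densities may be in any unit as long as both use the same one.
    """
    t_ref_k = temp_ref_c + 273.15
    t_new_k = temp_new_c + 273.15
    current_term = (j_ref / j_new) ** n
    thermal_term = math.exp(
        activation_energy_ev / BOLTZMANN_EV_PER_K * (1.0 / t_new_k - 1.0 / t_ref_k)
    )
    return current_term * thermal_term


# Doubling the current density at constant temperature (with n = 2).
print(mttf_ratio(1.0, 100.0, 2.0, 100.0))
# Raising the temperature from 100 C to 125 C at constant current density.
print(f"{mttf_ratio(1.0, 100.0, 1.0, 125.0):.3f}")
```

The same ratio, inverted, is the acceleration factor exploited in accelerated lifetime testing: stressing at elevated temperature and current shortens the time to failure by a predictable factor.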





Self-Heating
• Current through wire resistance generates heat
• Oxide surrounding wires is a thermal insulator
• Heat tends to build up in wires
• Hotter wires are more resistive, slower
• Self-heating limits AC current densities for reliability

• Typical limits: Jrms < 15 mA / μm²

$$I_{rms} = \sqrt{\frac{\int_0^T I(t)^2 \, dt}{T}}$$
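
The RMS definition can be evaluated numerically from a sampled current waveform. The sketch below approximates the integral with a rectangle rule over uniformly spaced samples; the function name and the test waveforms are assumptions made for illustration.

```python
import math


def rms_current(samples: list[float], dt: float) -> float:
    """Approximate sqrt((1/T) * integral of I(t)^2 dt) over one period.

    `samples` are uniformly spaced current values covering one period,
    so T = len(samples) * dt.
    """
    if not samples or dt <= 0:
        raise ValueError("need at least one sample and a positive time step")
    period = len(samples) * dt
    integral = sum(i * i for i in samples) * dt
    return math.sqrt(integral / period)


n = 1000
sine = [math.sin(2 * math.pi * k / n) for k in range(n)]     # 1 A peak
square = [1.0 if k < n // 2 else -1.0 for k in range(n)]     # +/- 1 A
print(f"sine: {rms_current(sine, 1e-12):.4f} A, square: {rms_current(square, 1e-12):.4f} A")
```

Both waveforms have zero average (DC) current, so they contribute little to electromigration, yet their RMS values are nonzero and still count toward the self-heating limit.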



Overvoltage Failure
• High voltages can blow out tiny transistors
• Electrostatic discharge (ESD)
• kilovolts from static electricity when the package pins are handled
• Oxide breakdown
• In a 65 nm process, Vg ≈ 3 V causes arcing through thin gate oxides
• Punchthrough
• High Vds causes depletion region between source and drain to touch, leading to high current flow and destructive overheating



Latchup
• Latchup: positive feedback leading to VDD – GND short
• Major problem for 1970’s CMOS processes before it was well understood
• Avoid by minimizing resistance of body to GND/VDD
• Use plenty of substrate and well taps



Guard Rings
• Latchup risk greatest when diffusion-to-substrate diodes could become forward-biased
• Surround sensitive region with guard ring to collect injected charge



Soft Errors
• In 1970’s, DRAMs were observed to randomly flip bits
• Ultimately linked to alpha particles and cosmic ray neutrons
• Collisions with atoms create electron-hole pairs in a substrate
• These carriers are collected on p-n junctions, disturbing the voltage

[Baumann05]



Radiation Hardening
• Radiation hardening reduces soft errors
• Increase node capacitance to minimize impact of collected charge
• Or use redundancy
• E.g., dual-interlocked cell

• Error-correcting codes
• Correct for soft errors that do occur
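
As one concrete instance of an error-correcting code, the sketch below implements a Hamming(7,4) code, which corrects any single flipped bit in a 7-bit codeword. The lecture does not name a specific code, so the choice of Hamming(7,4) and the function names are assumptions for illustration; memories typically use wider codes built on the same idea.

```python
def hamming74_encode(data: list[int]) -> list[int]:
    """Encode 4 data bits into 7 bits laid out as [p1, p2, d1, p3, d2, d3, d4]."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # covers positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # covers positions 2, 3, 6, 7
    p3 = d2 ^ d3 ^ d4  # covers positions 4, 5, 6, 7
    return [p1, p2, d1, p3, d2, d3, d4]


def hamming74_decode(code: list[int]) -> tuple[list[int], int]:
    """Return (corrected data bits, syndrome).

    A nonzero syndrome is the 1-based position of the bit that was flipped.
    """
    c = list(code)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s3
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the erroneous bit back
    return [c[2], c[4], c[5], c[6]], syndrome


word = [1, 0, 1, 1]
stored = hamming74_encode(word)
stored[4] ^= 1  # simulate a soft error in one stored bit
recovered, position = hamming74_decode(stored)
print(recovered == word, position)
```

A single upset is corrected transparently; two simultaneous upsets in the same word would defeat this code, which is why bits of one word are often physically interleaved with bits of other words.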



Circuit Pitfalls
• Detective puzzle
• Given circuit and symptom, diagnose cause and recommend solution
• All these pitfalls have caused failures in real chips



Bad Circuit 1
• Circuit: 2:1 multiplexer
• Symptom
– Mux works when selected D is 0 but not 1.
– Or fails at low VDD.
– Or fails in SFSF corner.

[Figure: 2:1 multiplexer schematic; labels S, D0, D1, X, Y]

• Principle: Threshold drop
– X never rises above VDD-Vt
– Vt is raised by the body effect
– The threshold drop is most serious as Vt becomes a greater fraction of VDD.
• Solution: Use transmission gates, not pass transistors
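
The threshold drop can be estimated numerically. Since Vt itself depends on the source voltage through the body effect, the sketch below solves X = VDD - Vt by fixed-point iteration using the standard body-effect expression Vt = Vt0 + gamma * (sqrt(phi_s + Vsb) - sqrt(phi_s)), with Vsb equal to the voltage at X. The parameter values `vt0`, `gamma`, and `phi_s` and the function name are assumptions chosen for illustration.

```python
import math


def pass_transistor_high_level(vdd: float, vt0: float = 0.3, gamma: float = 0.4,
                               phi_s: float = 0.88, iterations: int = 50):
    """Return (v_x, vt): the degraded high level at X and the effective Vt.

    vt0, gamma, and phi_s are ASSUMED device parameters.
    """
    vt = vt0
    for _ in range(iterations):
        v_x = max(vdd - vt, 0.0)  # source of the nMOS sits at X
        vt = vt0 + gamma * (math.sqrt(phi_s + v_x) - math.sqrt(phi_s))
    return max(vdd - vt, 0.0), vt


for vdd in (1.2, 1.0, 0.7):
    v_x, vt = pass_transistor_high_level(vdd)
    print(f"VDD={vdd:.1f} V  X={v_x:.3f} V  Vt={vt:.3f} V  lost={(vdd - v_x) / vdd:.0%}")
```

With these assumed parameters the fraction of the swing lost grows as VDD shrinks, matching the symptom that the mux fails at low VDD.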



Bad Circuit 2
• Circuit: Latch
• Symptom
– Load a 0 into Q
– Set φ = 0
– Eventually, Q spontaneously flips to 1

[Figure: latch schematic; labels φ, D, X, Q]

• Principle: Leakage
– X is a dynamic node holding value as charge on the node
– Eventually subthreshold leakage may disturb charge
• Solution: Staticize node with feedback
– Or periodically refresh node (requires fast clock, not practical for processes with big leakage)

[Figure: staticized latch with feedback; labels φ, D, X, Q]
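
How long a dynamic node can hold its value follows from t = C * dV / I_leak, treating the leakage current as constant. The sketch below uses that first-order estimate to find the hold time and the minimum refresh rate; the capacitance, tolerable droop, and leakage values are hypothetical, as are the function names.

```python
def hold_time_s(node_capacitance_f: float, allowed_droop_v: float,
                leakage_current_a: float) -> float:
    """Time for a constant leakage current to move the node by the allowed droop."""
    if leakage_current_a <= 0:
        raise ValueError("leakage current must be positive")
    return node_capacitance_f * allowed_droop_v / leakage_current_a


def min_refresh_hz(node_capacitance_f: float, allowed_droop_v: float,
                   leakage_current_a: float) -> float:
    """Slowest refresh rate that still restores the node in time."""
    return 1.0 / hold_time_s(node_capacitance_f, allowed_droop_v, leakage_current_a)


# Hypothetical node: 1 fF, 0.3 V tolerable droop, 1 nA leakage.
t_hold = hold_time_s(1e-15, 0.3, 1e-9)
print(f"hold time: {t_hold * 1e9:.0f} ns, refresh above {min_refresh_hz(1e-15, 0.3, 1e-9) / 1e6:.2f} MHz")
```

The required refresh rate scales directly with leakage, which is why refreshing becomes impractical in leaky processes and a staticizing feedback path is preferred.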



Bad Circuit 3
• Circuit: Domino AND gate
• Symptom
– Precharge gate (Y=0)
– Then, evaluate
– Eventually, Y spontaneously flips to 1

[Figure: domino AND gate schematic; labels φ, X, Y, inputs 1 and 0]

• Principle: Leakage
– X is a dynamic node holding value as charge on the node
– Eventually subthreshold leakage may disturb charge
• Solution: Keeper

[Figure: domino AND gate with keeper; labels φ, X, Y, inputs 1 and 0]



Bad Circuit 4
• Circuit: Pseudo-nMOS OR
• Symptom
– When only one input is true, Y = 0.
– Perhaps only happens in SF corner.

[Figure: pseudo-nMOS OR gate schematic; labels A, B, X, Y]

• Principle: Ratio Failure
– nMOS and pMOS fight each other.
– If the pMOS is too strong, nMOS cannot pull X low enough.
• Solution: Check that ratio is satisfied in all corners
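
A ratio check across corners can be sketched with a simple resistive-divider model, in which the always-on pMOS and a single conducting nMOS are treated as resistors so that the low level at X is VDD * Rn / (Rp + Rn). The corner resistance multipliers, the typical resistances, the input-low threshold, and the function names below are all assumptions for illustration; a real check would use circuit simulation.

```python
# Assumed resistance multipliers: fast devices are less resistive.
CORNER_SCALE = {"T": 1.0, "F": 0.8, "S": 1.25}


def pseudo_nmos_low_level(vdd: float, r_pullup: float, r_pulldown: float) -> float:
    """Voltage at X when one nMOS fights the always-on pMOS load."""
    return vdd * r_pulldown / (r_pullup + r_pulldown)


def failing_corners(vdd: float, r_p_typ: float, r_n_typ: float,
                    v_il_max: float) -> list[str]:
    """Return corner names (nMOS letter, then pMOS letter) where X is not low enough."""
    failures = []
    for n_corner in "TFS":
        for p_corner in "TFS":
            v_low = pseudo_nmos_low_level(
                vdd,
                r_p_typ * CORNER_SCALE[p_corner],
                r_n_typ * CORNER_SCALE[n_corner],
            )
            if v_low > v_il_max:
                failures.append(n_corner + p_corner)
    return failures


# Hypothetical gate: pMOS 4x more resistive than nMOS, X must fall below 0.25 V.
print(failing_corners(vdd=1.0, r_p_typ=4.0, r_n_typ=1.0, v_il_max=0.25))
```

With these assumed numbers the gate passes at the typical corner and fails only with slow nMOS and fast pMOS, the combination that most weakens the pulldown relative to the load.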



Bad Circuit 5
• Circuit: Latch
• Symptom
– Q stuck at 1.
– May only happen for certain latches where input is driven by a small gate located far away.

[Figure: latch schematic; labels φ, D, X, Q, weak feedback inverter]

• Principle: Ratio Failure (again)
– Series resistance of D driver, wire resistance, and tgate must be much less than that of the weak feedback inverter.
• Solutions: Check relative strengths
– Avoid unbuffered diffusion inputs where driver is unknown

[Figure: latch with stronger driver; labels φ, D, Q, weak, stronger]



Bad Circuit 6
• Circuit: Domino AND gate
• Symptom
– Precharge gate while A = B = 0; so Z = 0
– Set φ = 1
– A rises
– Z is observed to sometimes rise

[Figure: domino AND gate schematic; labels φ, A, B, X, Y, Z]

• Principle: Charge Sharing
– If X was low, it shares charge with Y
• Solutions: Limit charge sharing

$$V_X = V_Y = \frac{C_Y}{C_X + C_Y} V_{DD}$$

– Safe if CY >> CX
– Or precharge node X too

[Figure: domino AND gate with capacitance CY on node Y and CX on node X]
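
The charge-sharing result can be checked with a few numbers. The sketch below evaluates the final shared voltage, VDD * CY / (CX + CY), for a precharged node Y connected to a discharged node X; the capacitance values and the function name are hypothetical.

```python
def shared_voltage(vdd: float, c_y: float, c_x: float) -> float:
    """Final voltage on X and Y after Y (precharged to VDD) shares charge
    with X (initially at 0 V)."""
    if c_x < 0 or c_y <= 0:
        raise ValueError("capacitances must be non-negative, with c_y > 0")
    return vdd * c_y / (c_x + c_y)


vdd = 1.0
for c_y_ff, c_x_ff in ((1.0, 1.0), (4.0, 1.0), (20.0, 1.0)):
    v = shared_voltage(vdd, c_y_ff, c_x_ff)
    print(f"CY={c_y_ff:4.1f} fF  CX={c_x_ff:.1f} fF  ->  Y droops to {v:.3f} V")
```

With equal capacitances Y falls to half of VDD, which is easily enough to flip the output inverter; the droop becomes small only when CY is much larger than CX.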



Bad Circuit 7
• Circuit: Dynamic gate + latch
• Symptom
– Precharge gate while transmission gate latch is opaque
– Evaluate
– When latch becomes transparent, X falls

[Figure: dynamic gate driving a transmission gate latch; labels φ, X, Y, input 0]

• Principle: Charge Sharing
– If Y was low, it shares charge with X
• Solution: Buffer dynamic nodes before driving transmission gate



Bad Circuit 8
• Circuit: Latch
• Symptom
– Q changes while latch is opaque
– Especially if D comes from a faraway driver

[Figure: transmission gate latch; labels D, Q, GND, VDD, weak]

• Principle: Diffusion Input Noise Sensitivity
– If D < -Vt, transmission gate turns on
– Most likely because of power supply noise or coupling on D
• Solution: Buffer D locally

[Figure: latch with local buffer on D; labels D, Q, 0, VDD, weak]



Summary
• Static CMOS gates are very robust
• Will settle to correct value if you wait long enough
• Other circuits suffer from a variety of pitfalls
• Tradeoff between performance & robustness
• Essential to check circuits for pitfalls
• For large chips, you need an automatic checker.
• Design rules aren’t worth the paper they are printed on unless you back them up with a tool.
