Lecture 13,14
CS G553
Lecture – 13,14
Reconfigurable Computing Device: Circuit Design of FPGA Fabrics, Architecture of FPGA Fabrics
Circuit Design of FPGAs
Topics
LEs vs. Logic Gates
Multiplexers as Logic Elements
[Figure: multiplexer-based logic elements configured as (AB)', A^B, and a D latch; signal labels A, B, D, CLK, CLR, Q]
Using Antifuses
Static CMOS Gate vs. LUT
Number of transistors:
o An n-input static NAND/NOR gate has 2n transistors.
o A 4-input LUT has 128 transistors in its SRAM and 96 in its multiplexer.
Delay:
o A 4-input NAND gate has a delay of 9τ.
o SRAM decoding has a delay of 21τ.
Power:
o Static gate’s power depends on activity.
o SRAM always burns power.
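The comparison above can be tallied in a few lines. This is only a sketch that restates the slide's figures: the function and constant names are hypothetical, and the 2n rule is evaluated for a 4-input gate so that both sides have the same number of inputs.

```python
def static_gate_transistors(n_inputs: int) -> int:
    """Transistor count of an n-input static CMOS NAND/NOR gate (2n rule)."""
    return 2 * n_inputs


# Figures quoted on the slide for a 4-input LUT (not derived here).
LUT4_SRAM_TRANSISTORS = 128
LUT4_MUX_TRANSISTORS = 96

# Delay figures quoted on the slide, in the slide's own delay unit.
NAND4_DELAY_UNITS = 9
SRAM_DECODE_DELAY_UNITS = 21


def lut4_transistors() -> int:
    """Total transistor count of the 4-input LUT: SRAM plus multiplexer."""
    return LUT4_SRAM_TRANSISTORS + LUT4_MUX_TRANSISTORS


if __name__ == "__main__":
    gate = static_gate_transistors(4)
    lut = lut4_transistors()
    print(f"4-input static gate: {gate} transistors")
    print(f"4-input LUT:         {lut} transistors")
    print(f"area ratio (LUT / gate):  {lut / gate:.1f}")
    print(f"delay ratio (LUT / gate): "
          f"{SRAM_DECODE_DELAY_UNITS / NAND4_DELAY_UNITS:.2f}")
```

The ratios show the cost side of the trade: the LUT pays in transistors and delay for being able to implement any 4-input function rather than one fixed gate.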
Lookup Table Circuitry
Demultiplexer or multiplexer?
[Figure: two LUT organizations, each driven by address (adrs) inputs]
Traditional RAM/ROM
[Figure: traditional RAM/ROM array; labels: adrs, Bit line]
Lookup Memory
Multiplexer Styles
static gates
pass transistors
This choice depends on the size of the lookup table
Multiplexer Design
Static Gate Four-Input Mux
Tree-based four-input mux
Delay is proportional to the square of the path length.
For a b-input tree the path length is lg b, so delay grows as (lg b)^2.
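The tree organization can be sketched functionally. The model below is an assumption-laden illustration, not a circuit description: `mux2` and `tree_mux` are hypothetical names, the data inputs stand in for LUT memory bits, and the select bits are taken least significant first. It returns the selected value together with the number of 2:1 stages the signal passes through, which is lg b for b data inputs.

```python
def mux2(sel: int, a: int, b: int) -> int:
    """Two-input multiplexer: returns a when sel is 0, b when sel is 1."""
    return b if sel else a


def tree_mux(data, select):
    """Select one entry of `data` through a binary tree of 2:1 muxes.

    data   -- sequence of 2**k values (for example, LUT memory bits)
    select -- k select bits, least significant bit first
    Returns (selected value, number of mux levels on the path).
    """
    if len(data) != 2 ** len(select):
        raise ValueError("len(data) must equal 2 ** len(select)")
    level = list(data)
    depth = 0
    for sel in select:
        # Each level halves the number of candidate signals.
        level = [mux2(sel, level[i], level[i + 1])
                 for i in range(0, len(level), 2)]
        depth += 1
    return level[0], depth


if __name__ == "__main__":
    xor_table = [0, 1, 1, 0]  # entry at index a + 2*b holds a XOR b
    for b in (0, 1):
        for a in (0, 1):
            value, depth = tree_mux(xor_table, (a, b))
            print(f"a={a} b={b} -> {value} (path length {depth})")
```

A four-input mux therefore has a path of two stages and a sixteen-input mux a path of four; the delay estimate on the slide applies to that path length.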
Delay Through an RC Transmission Line
$$\frac{1}{r}\,\frac{d^{2}V}{dx^{2}} = c\,\frac{dV}{dt}$$
Delay Through an RC Transmission Line
Elmore Delay
o Elmore defined the delay through a linear network as the first
moment of the impulse response of the network
$$E = \int_{0}^{\infty} t\,V_{out}(t)\,dt$$
Delay Through an RC Transmission Line
Elmore Delay
o Elmore delay can be computed by taking the sum of RC products,
where each resistance is multiplied by the sum of all the
downstream capacitors
$$E = \sum_{i=1}^{n} r\,(n-i)\,c = \frac{1}{2}\,rc\,n(n-1)$$
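The sum-of-RC-products rule can be checked numerically. The sketch below is illustrative only: the function names are hypothetical, and it assumes the indexing implied by the (n − i) term, namely that the capacitor belonging to section i is not counted as downstream of that section's resistor.

```python
def elmore_delay(resistances, capacitances):
    """Elmore delay of an RC chain as a sum of RC products.

    Each resistance is multiplied by the sum of all capacitances
    downstream of it. Section i's own capacitor is treated as upstream
    of its resistor, which reproduces the (n - i) term for a uniform line.
    """
    if len(resistances) != len(capacitances):
        raise ValueError("need one capacitance per resistance")
    total = 0.0
    for i, r in enumerate(resistances):
        downstream_c = sum(capacitances[i + 1:])
        total += r * downstream_c
    return total


def elmore_uniform_closed_form(n: int, r: float, c: float) -> float:
    """Closed form (1/2) * r * c * n * (n - 1) for n identical sections."""
    return 0.5 * r * c * n * (n - 1)


if __name__ == "__main__":
    r, c = 1.0, 1.0
    for n in (2, 4, 8, 16):
        summed = elmore_delay([r] * n, [c] * n)
        closed = elmore_uniform_closed_form(n, r, c)
        print(f"n={n:2d}  sum={summed:6.1f}  closed form={closed:6.1f}")
```

Doubling n roughly quadruples the result, which is the quadratic growth in delay that the pass-transistor slides refer to.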
Pass-Transistor-Based Four-Input Mux
Transmission Gates based MUX
LE Output Drivers
Avoiding Programming Hazards
[Figure: LE output driver circuit; labels: From LE, config, progb, Routing channel]
Interconnect circuits
Styles of Programmable Interconnection
Researchers have concluded that most of the area in an SRAM-based FPGA is consumed by the routing switches, so the programmable interconnect must be designed carefully to make the best use of the available area.
Pass Transistor Programmable Interconnect
Small area.
Resistive switch.
Delay grows as the square of the number of switches.
Three-state Programmable Interconnection
Larger area.
Regenerative driver.
Switch Area * Wire Delay vs. Pass Transistor Width (Betz & Rose)
© 1999 IEEE
Switch Area * Wire Delay vs. Buffer Size (Betz & Rose)
© 1999 IEEE
Wire Delay vs. Switch Sizes (Chandra and Schmit)
© 2002 IEEE
Clock Drivers
Clock Nets
H tree
Regular layout structure.
o Recursive.
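The recursive structure can be sketched as follows. Everything here is an illustrative assumption rather than a layout recipe from the lecture: `h_tree` is a hypothetical function, each H has equal half-width and half-height, and the size halves at every level. The function also tracks the wire length from the root to each leaf; equal root-to-leaf length is the property usually cited as the reason for using an H tree in clock distribution.

```python
def h_tree(cx, cy, half, depth, dist=0.0):
    """Recursively build an H tree centred at (cx, cy).

    half  -- half-width and half-height of the H at this level
    depth -- number of recursion levels still to draw
    dist  -- wire length accumulated from the root so far
    Returns (segments, leaves): segments are ((x0, y0), (x1, y1)) pairs,
    leaves are ((x, y), wire_length_from_root) pairs.
    """
    if depth == 0:
        return [], [((cx, cy), dist)]
    # Horizontal bar of the H.
    segments = [((cx - half, cy), (cx + half, cy))]
    leaves = []
    for sx in (-1, 1):
        x = cx + sx * half
        # Vertical bar on this side of the H.
        segments.append(((x, cy - half), (x, cy + half)))
        for sy in (-1, 1):
            # Each tip of the H is the centre of a half-size H.
            sub_segments, sub_leaves = h_tree(
                x, cy + sy * half, half / 2, depth - 1, dist + 2 * half)
            segments.extend(sub_segments)
            leaves.extend(sub_leaves)
    return segments, leaves


if __name__ == "__main__":
    segments, leaves = h_tree(0.0, 0.0, 4.0, depth=2)
    lengths = {length for _, length in leaves}
    print(f"{len(segments)} segments, {len(leaves)} leaves")
    print(f"distinct root-to-leaf wire lengths: {sorted(lengths)}")
```

Each extra level multiplies the number of leaves by four while keeping every leaf at the same wire distance from the root.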
Architecture of FPGA Fabrics
Topics
Architecture of FPGA
o Logic elements
o Interconnect
o Pins
Architectural Issues
How many logic elements in the FPGA?
LE structure:
o What functions?
o How many inputs?
o Dedicated logic?
What types of interconnect?
o How much of each type?
How long should interconnect segments be?
How should we vary interconnect?
o Uniform or non-uniform over chip?
FPGA Architecture Evaluation Methodology
[Figure: evaluation flow. Inputs: FPGA fabric architecture, Logic benchmarks -> Place + route -> Area and performance evaluation -> metrics]
Evaluation Metrics
Structural
o Size of the logic element
o Size of interconnect
Mapping-related
o Logic utilization
o Interconnect utilization
o Delay
Logic Element Parameters
Styles of FPGA Interconnect
Local
Intermediate
Global
o clock
o signal
Interconnect Paths
[Figure: LEs connected through routing channels and switches (SW)]
Pinout
Rent’s Rule
FPGAs and Pins
The End
Questions?