
Lecture 13,14

Reconfigurable Computing

CS G553

Dr. A. Amalin Prince


BITS - Pilani K K Birla Goa Campus
Department of Electrical and Electronics Engineering

Lecture – 13,14
Reconfigurable Computing Device: Circuit Design of FPGA Fabrics, Architecture of FPGA Fabrics

Circuit Design of FPGAs

Topics

 Circuit design for FPGAs:


o Logic elements.
o Interconnect.

LEs vs. Logic Gates

 An LE is more complex than a standard CMOS gate:
o CMOS gate: one fixed function.
o LE: can be configured to implement a number of functions.
 Antifuse-based FPGAs program their logic elements by connecting various signals, either constants or variables, to the inputs of the logic elements.
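
Programming a logic element by tying its inputs to constants or variables can be sketched in a few lines of Python. This is only an illustrative model, not any vendor's actual cell: the names `mux2`, `and_gate`, `or_gate`, `xor_gate`, and `nand_gate` are hypothetical, and the complement of B is assumed to be available as a signal.

```python
def mux2(sel, d0, d1):
    """2:1 multiplexer: passes d0 when sel is 0 and d1 when sel is 1."""
    return d1 if sel else d0

# Each "gate" is the same multiplexer with different signals wired to it.
def and_gate(a, b):
    return mux2(a, 0, b)        # A = 0 selects constant 0, A = 1 selects B

def or_gate(a, b):
    return mux2(a, b, 1)        # A = 0 selects B, A = 1 selects constant 1

def xor_gate(a, b):
    return mux2(a, b, 1 - b)    # assumes the complement of B is available

def nand_gate(a, b):
    return mux2(a, 1, 1 - b)    # A = 0 selects constant 1, A = 1 selects B'

if __name__ == "__main__":
    for a in (0, 1):
        for b in (0, 1):
            print(a, b, and_gate(a, b), or_gate(a, b),
                  xor_gate(a, b), nand_gate(a, b))
```

The multiplexer itself never changes; only the wiring of its select and data inputs does, which is what an antifuse fixes at programming time.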

Multiplexers as Logic Elements

[Figure: multiplexers configured as logic elements. Labels: inputs A, B and constants 0, 1; outputs (AB)' and A^B; D latch with CLK, CLR, and Q.]
Using Antifuses

Static CMOS Gate vs. LUT

 Number of transistors:
o NAND/NOR gate has 2n transistors.
o 4-input LUT has 128 transistors in SRAM, 96 in multiplexer.
 Delay:
o 4-input NAND gate has 9τ delay.
o SRAM decoding has 21τ delay.
 Power:
o Static gate’s power depends on activity.
o SRAM always burns power.

Because the logic element is so complex, its design requires careful attention to circuit characteristics.
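
The transistor counts quoted on this slide can be reproduced with a small calculation. This is a sketch under stated assumptions: 8 transistors per SRAM cell is inferred from 128 transistors for 16 cells, the 96-transistor multiplexer figure is taken directly from the slide and applies only to the 4-input case, and the function names are hypothetical.

```python
def static_gate_transistors(n_inputs):
    """Static CMOS NAND/NOR gate: 2n transistors."""
    return 2 * n_inputs

def lut_transistors(n_inputs, cell_transistors=8, mux_transistors=96):
    """Return (SRAM, multiplexer, total) transistor counts for a LUT.

    cell_transistors=8 is inferred from the slide (128 / 16 cells).
    mux_transistors=96 is the slide's figure for a 4-input LUT only.
    """
    sram = (2 ** n_inputs) * cell_transistors
    return sram, mux_transistors, sram + mux_transistors

if __name__ == "__main__":
    gate = static_gate_transistors(4)
    sram, mux, total = lut_transistors(4)
    print("4-input static gate:", gate)
    print("4-input LUT:", sram, "+", mux, "=", total)
    print("LUT / gate ratio:", total / gate)
```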

Lookup Table Circuitry

 Demultiplexer or multiplexer?
[Figure: two alternative LUT circuits, each driven by the address inputs (adrs).]

Traditional RAM/ROM

 Cell drives long bit line:

[Figure: memory cell, selected by adrs, driving a long bit line.]

Lookup Memory

 Multiplexer presents smaller load to memory cells.


o Allows smaller memory cells.
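
A lookup table read through a multiplexer can be modeled behaviorally as below. This is a minimal sketch, not a circuit description: `make_lut` is a hypothetical helper, the stored bits stand in for the memory cells, and the first input is assumed to be the least significant address bit.

```python
def make_lut(truth_table):
    """Build a LUT from its stored bits; truth_table[i] is the bit at address i."""
    n = len(truth_table).bit_length() - 1
    if len(truth_table) != 1 << n:
        raise ValueError("truth table length must be a power of two")

    def lut(*inputs):
        if len(inputs) != n:
            raise ValueError("expected %d inputs" % n)
        level = list(truth_table)
        # Multiplexer tree: each input bit halves the set of candidate cells.
        # inputs[0] is the least significant address bit.
        for bit in inputs:
            level = [level[i + 1] if bit else level[i]
                     for i in range(0, len(level), 2)]
        return level[0]

    return lut

if __name__ == "__main__":
    # Configure a 4-input LUT as a parity (XOR) function: 16 stored bits.
    parity_bits = [bin(i).count("1") % 2 for i in range(16)]
    parity = make_lut(parity_bits)
    print(parity(1, 0, 0, 0), parity(1, 1, 0, 0), parity(1, 1, 1, 0))
```

Changing the function only means changing the 16 stored bits; the multiplexer tree stays the same.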

Multiplexer Styles

Should that multiplexer be built from static gates or pass transistors?

[Figure: static-gate multiplexer vs. pass-transistor multiplexer.]

This choice depends on the size of the lookup table.
Multiplexer Design

 A pass-transistor multiplexer uses fewer transistors than one built from fully complementary gates.
 A static-gate implementation gives better noise immunity.

Static Gate Four-Input Mux

 Delay through an n-input NAND is (n+2)/3.
 For a b-input multiplexer, the first-level gates have lg b + 1 inputs, so their delay is (lg b + 3)/3.
 The second-level gate has b inputs, so its delay is (b+2)/3.
 Delay grows as b lg b.
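
The two delay terms above can be tabulated for a few multiplexer sizes. This is a sketch only: the function names are hypothetical, and multiplying the two per-level terms is an assumption made here because the product's leading term, b lg b / 9, matches the stated growth.

```python
from math import log2

def nand_delay(n_inputs):
    """Delay term for an n-input NAND: (n + 2) / 3."""
    return (n_inputs + 2) / 3

def static_mux_delay_terms(b):
    """Per-level terms for a b-input static-gate multiplexer."""
    first = nand_delay(log2(b) + 1)   # (lg b + 3) / 3
    second = nand_delay(b)            # (b + 2) / 3
    return first, second

if __name__ == "__main__":
    for b in (4, 16, 64):
        first, second = static_mux_delay_terms(b)
        # Assumption: the per-level terms combine as a product.
        print(b, round(first, 3), round(second, 3),
              round(first * second, 3), b * log2(b))
```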

Tree-based four-input mux

 Delay is proportional to the square of the path length.
 The path through a b-input tree is lg b switches long, so delay grows as (lg b)^2.

Delay Through an RC Transmission Line

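 An RC transmission line models a wire as infinitesimal RC sections.
o Each section represents a differential resistance and capacitance.
o The transmission line's voltage response is modeled by

$$\frac{1}{r}\frac{\partial^2 V}{\partial x^2} = c\,\frac{\partial V}{\partial t}$$

where r and c are the resistance and capacitance per unit length and x is the position along the wire.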

Delay Through an RC Transmission Line

 Elmore Delay
o Elmore defined the delay through a linear network as the first moment of the impulse response of the network:

$$\delta_E = \int_0^{\infty} t\,V_{out}(t)\,dt$$

Delay Through an RC Transmission Line

 Elmore Delay
o Elmore delay can be computed by taking the sum of RC products, where each resistance is multiplied by the sum of all the downstream capacitors. For n identical sections, each with resistance r and capacitance c:

$$\delta_E = \sum_{i=1}^{n} r\,(n-i)\,c = \frac{1}{2}\,rc\,n(n-1)$$
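
The sum of RC products can be checked numerically against the closed form. This is a minimal sketch for a uniform line, following the slide's indexing in which section i sees (n - i) downstream capacitors; the function names are hypothetical and the r, c values are unitless placeholders.

```python
def elmore_delay_uniform(n, r, c):
    """Sum over sections i = 1..n of r * (n - i) * c."""
    return sum(r * (n - i) * c for i in range(1, n + 1))

def elmore_closed_form(n, r, c):
    """Closed form of the same sum: (1/2) * r * c * n * (n - 1)."""
    return 0.5 * r * c * n * (n - 1)

if __name__ == "__main__":
    r, c = 1.0, 1.0   # placeholder per-section values
    for n in (2, 4, 8, 16):
        print(n, elmore_delay_uniform(n, r, c), elmore_closed_form(n, r, c))
```

Doubling n roughly quadruples the result, which is the quadratic growth that penalizes long chains of pass transistors.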

Pass-Transistor-Based Four-Input Mux

 Must include decode logic in total delay.

Transmission-Gate-Based Mux

LE Output Drivers

 Must drive load:


o Wire
o Destination LE
 Different types of wiring present different loads.

Avoiding Programming Hazards

 Want to disable connections to the routing channel before programming.

[Figure: connection from the LE to the routing channel, controlled by the config and progb signals.]

Interconnect circuits

 Why so many types of interconnect?
o Provide a choice of delay alternatives.
o An FPGA has:
• Short wires
• General-purpose wires
• Global interconnects
• A specialized clock distribution network
o A short wire has a delay roughly equal to that of a gate.
 Sources of delay:
o Wires.
o Programming points.
 LEs are slower than gates; the interconnect gives some compensation.

Styles of Programmable Interconnection Point
Researchers concluded that most of the area in an SRAM-based FPGA is consumed by the routing switches, so we must carefully design the programmable interconnect to make the best use of the available area.

[Figure: two styles of interconnection point: pass transistor and three-state.]

Pass Transistor Programmable Interconnect Point
 Small area.
 Resistive switch.
 Delay grows as the square of the number of switches.

Three-state Programmable Interconnection Point
 Larger area.
 Regenerative driver.

Switch Area * Wire Delay vs. Pass Transistor Width (Betz & Rose)

© 1999 IEEE

Switch Area * Wire Delay vs. Buffer Size (Betz & Rose)

© 1999 IEEE

Wire Delay vs. Switch Sizes (Chandra and Schmit)

 Delay vs. switch size for various driver sizes.
 U-shaped curve:
o Resistance initially decreases.
o Increased capacitance eventually dominates.

© 2002 IEEE
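
The U-shaped trend can be reproduced with a toy lumped-RC model in which switch resistance falls as 1/width while switch capacitance rises with width. This is an illustrative sketch only: the function name, the model, and all parameter values are assumptions and are not taken from Chandra and Schmit's data.

```python
def wire_delay(width, r_driver=1.0, r_switch=4.0, c_wire=4.0, c_switch=1.0):
    """Toy lumped-RC delay: (driver + switch resistance) * (wire + switch capacitance).

    All parameter values are unitless placeholders chosen for illustration.
    """
    resistance = r_driver + r_switch / width    # wider switch, lower resistance
    capacitance = c_wire + c_switch * width     # wider switch, more capacitance
    return resistance * capacitance

if __name__ == "__main__":
    widths = [1, 2, 4, 8, 16]
    delays = [wire_delay(w) for w in widths]
    best = widths[delays.index(min(delays))]
    for w, d in zip(widths, delays):
        print(w, d)
    print("minimum-delay width in this sweep:", best)
```

With these placeholder values the delay falls until the width reaches 4 and rises again beyond it.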
Clock Drivers

Clock driver tree:

Clock Nets

 Must drive all LEs.


 Design parameters:
o number of fanouts
o load per fanout
o wiring tree capacitance
 Determine optimal buffer sizes.

H tree

 Regular layout structure.
o Recursive.
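
The recursive structure of an H tree can be sketched as follows. This is a geometric illustration only, assuming an idealized tree in which each level halves the arm length; `h_tree_leaves` is a hypothetical name, and buffers and wire parasitics are ignored.

```python
def h_tree_leaves(x, y, half, depth, dist=0.0):
    """Return [((x, y), wire_length_from_root)] for every leaf of an H tree.

    Each level draws one H centered at (x, y): a horizontal arm of length
    `half` to each side, then a vertical arm of length `half` up and down.
    """
    if depth == 0:
        return [((x, y), dist)]
    leaves = []
    for dx in (-half, half):
        for dy in (-half, half):
            leaves += h_tree_leaves(x + dx, y + dy, half / 2,
                                    depth - 1, dist + 2 * half)
    return leaves

if __name__ == "__main__":
    leaves = h_tree_leaves(0.0, 0.0, 4.0, 2)
    print(len(leaves), "leaves")
    print("distinct wire lengths:", sorted({d for _, d in leaves}))
```

Every leaf ends up at the same wire length from the root, which is why the structure suits clock distribution.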

Architecture of FPGA Fabrics

Topics

 Architecture of FPGA
o Logic elements
o Interconnect
o Pins

Architectural Issues
 How many logic elements in the FPGA?
 LE structure:
o What functions?
o How many inputs?
o Dedicated logic?
 What types of interconnect?
o How much of each type?
 How long should interconnect segments be?
 How should we vary interconnect?
o Uniform or non-uniform over chip?

FPGA Architecture Evaluation Methodology

[Figure: evaluation flow. An FPGA fabric architecture and a set of logic benchmarks feed place and route; area and performance evaluation of the result yields the metrics.]
Evaluation Metrics

 Structural
o Size of the logic element
o Size of interconnect
 Mapping-related
o Logic utilization
o Interconnect utilization
o Delay

Logic Element Parameters

 How many inputs?
o Too few inputs: more overhead per LE.
o Too many inputs: wasted capacity when mapping logic to LEs.
 Depends on circuit design of LE and characteristics of logic.
 Typical choice: 4 inputs.

Styles of FPGA Interconnect

 Local
 Intermediate
 Global
o clock
o signal

Interconnect Paths

[Figure: interconnect paths between LEs through routing channels and switches (SW).]

Pinout

 How many pins?
o Limited by technology.
o Too much logic, not enough pins means we can't get signals off-chip.
o Too many pins means logic/pins won't be efficiently utilized.

Rent’s Rule

 Developed by E. F. Rent (IBM) in 1960.


o Experimentally derived from sample designs.
 Number of pins vs. number of components is a line on a log-log plot:
o Rent's Rule: $N_p = K_p N_g^b$, where $N_p$ is the number of pins and $N_g$ is the number of components.
 Parameters may vary based on technology:
o Rent measured b = 0.6, K_p = 2.5.
o Modern microprocessor has b = 0.455, K_p = 0.82.
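
Rent's Rule is easy to evaluate for the two parameter sets quoted above. This is a minimal sketch: `rent_pins` is a hypothetical name and the component counts in the loop are arbitrary examples, not measurements.

```python
import math

def rent_pins(n_components, k_p, b):
    """Rent's Rule: N_p = K_p * N_g ** b."""
    return k_p * n_components ** b

if __name__ == "__main__":
    for n in (100, 10_000, 1_000_000):   # arbitrary example sizes
        print(n,
              round(rent_pins(n, 2.5, 0.6)),      # Rent's parameters
              round(rent_pins(n, 0.82, 0.455)))   # microprocessor parameters
    # On a log-log plot the slope of the line is the exponent b.
    slope = math.log10(rent_pins(1000, 2.5, 0.6) / rent_pins(100, 2.5, 0.6))
    print("slope per decade:", round(slope, 3))
```

Because b is less than 1, the predicted pin count grows more slowly than the component count.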

FPGAs and Pins

 Chip capacity is growing somewhat faster than package pinout.
 Harder to use logic in a multi-FPGA design.
o Must try to fit a large function with a small interface into the FPGA.

The End

 Questions?

 Thank you for your attention
