FPGA Lec04 Unfolding

Uploaded by

Farhan Mashuk
Copyright: © All Rights Reserved

Advanced Topics in Communications Electronics

FPGA Information Processing Systems

Lecture 4 Unfolding and Retiming


Chapter 4, the book by Keshab Parhi

Lecture 4 1
Example of Unfolding

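[Figure: the recursion y(n) = a·y(n-9) + x(n), drawn with one adder, one multiplier by a, and a 9D feedback delay, next to its 2-unfolded version with inputs x(2k), x(2k+1), outputs y(2k), y(2k+1), and feedback delays 5D and 4D.]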
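
A minimal sketch of this example in Python, assuming the figure shows the recursion y(n) = a·y(n-9) + x(n) with zero initial state. The function names are illustrative only. It checks that the 2-unfolded form, with 5D and 4D feedback, produces the same output sequence as the original.

```python
def direct(x, a):
    """Original system: y(n) = a*y(n-9) + x(n), with y(n) = 0 for n < 0."""
    y = []
    for n, xn in enumerate(x):
        prev = y[n - 9] if n >= 9 else 0.0
        y.append(a * prev + xn)
    return y


def unfolded_by_2(x, a):
    """2-unfolded system: two samples per iteration k (len(x) must be even)."""
    y_even, y_odd = [], []  # hold y(2k) and y(2k+1)
    for k in range(len(x) // 2):
        # 5D on the odd stream: y(2(k-5)+1) = y(2k-9)
        fb_even = y_odd[k - 5] if k >= 5 else 0.0
        # 4D on the even stream: y(2(k-4)) = y(2k-8)
        fb_odd = y_even[k - 4] if k >= 4 else 0.0
        y_even.append(a * fb_even + x[2 * k])
        y_odd.append(a * fb_odd + x[2 * k + 1])
    # Interleave the two streams back into y(0), y(1), y(2), ...
    return [v for pair in zip(y_even, y_odd) for v in pair]
```

Each iteration of the unfolded loop consumes two inputs and produces two outputs, which is the sense in which unfolding exposes parallelism.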

Lecture 4 2
J-Slow from J-Unfolding

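If the input to a delay element is x(kJ+m), its output is x(kJ+m-J).
Each delay in a J-unfolded DFG is therefore J-slow: it spans J sample periods.

If J = 2 and the input to D is y(2k+1), the output is y(2k-1).
For 5D, the output is y(2k-9).

[Figure: the 2-unfolded DFG from the previous slide, with inputs x(2k), x(2k+1), outputs y(2k), y(2k+1), multipliers by a, and feedback delays 5D and 4D.]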
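
The index arithmetic on this slide can be checked with a one-line helper. This is only an illustrative sketch; `delayed_index` is a hypothetical name, and `w` is the number of delay elements in series.

```python
def delayed_index(J, k, m, w=1):
    """Sample index at the output of w delays in a J-unfolded DFG
    when the input is sample number k*J + m."""
    return (k - w) * J + m


k = 10
one_delay = delayed_index(2, k, 1)        # y(2k+1) through D  -> y(2k-1)
five_delays = delayed_index(2, k, 1, 5)   # y(2k+1) through 5D -> y(2k-9)
```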

Lecture 4 3
Unfolding Algorithm for Factor J

Lecture 4 4
Example of Unfolding

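[Figure: a DFG with nodes A, B, C, D and one 9D edge, next to its 2-unfolded version with nodes A0, B0, C0, D0, A1, B1, C1, D1, in which the 9D edge becomes one 5D edge and one 4D edge.]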

Lecture 4 5
Unfolding without Loops
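$u_i \to v_{(i+w)\%J}$ with $\lfloor (i+w)/J \rfloor$ delays

[Figure: an edge u → v with 37D, input x(n) and output y(n-37), and its 4-unfolded version. Inputs x(4k), x(4k+1), x(4k+2), x(4k+3) enter u0, u1, u2, u3; nodes v0, v1, v2, v3 produce y(4k-37), y(4k-36), y(4k-35), y(4k-34). The four unfolded edges carry 10D, 9D, 9D, 9D.]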

For circuits without loops, unfolding is equivalent to parallel processing.
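
The edge rule on this slide (copy i of u connects to copy (i+w)%J of v through ⌊(i+w)/J⌋ delays) can be sketched in a few lines of Python. The function name and the tuple layout are illustrative choices, not part of the lecture.

```python
def unfold_edge(w, J):
    """Unfold one edge u -> v carrying w delays by a factor J.

    Returns a list of (i, j, delays) meaning u_i -> v_j with that many delays.
    """
    return [(i, (i + w) % J, (i + w) // J) for i in range(J)]


# The 37D edge unfolded by 4: three 9D edges and one 10D edge.
edges_37 = unfold_edge(37, 4)

# The 9D edge of the earlier example unfolded by 2: one 4D and one 5D edge.
edges_9 = unfold_edge(9, 2)
```

For the 37D case the 10D edge runs from u3 to v0, which matches x(4k+3) delayed by 10 iterations giving sample 4k-37.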


Lecture 4 6
Another Example
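[Figure: a DFG with nodes u, v, s and edge delays D, 5D, 6D, next to its 3-unfolded version with nodes u0, u1, u2, v0, v1, v2, s0, s1, s2, whose edges carry delays of D and 2D.]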
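
A small check of this example, assuming the original edges carry D, 5D and 6D and the unfolding factor is 3 (as the node labels suggest). It lists the delay on each unfolded copy and confirms that the total number of delays per edge is unchanged. The helper name is hypothetical.

```python
def unfolded_delays(w, J):
    """Delays on the J copies of an edge that originally carried w delays."""
    return [(i + w) // J for i in range(J)]


J = 3
per_edge = {w: unfolded_delays(w, J) for w in (1, 5, 6)}
# Unfolding redistributes delays among the copies but does not add or remove any.
totals_preserved = all(sum(d) == w for w, d in per_edge.items())
```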

Lecture 4 7
Properties of Unfolding

Lecture 4 8
Critical Path

Lecture 4 9
Retiming

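[Figure: a circuit with multiplier nodes shown before and after retiming; the delay elements D are relocated.]

The number of delay elements (registers) along each complete path or loop is preserved.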
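
A hedged sketch of why loop delay counts survive retiming. It uses the standard retiming relation w_r(u → v) = w(u → v) + r(v) - r(u), which does not appear on the slide, and a made-up three-node loop; the node names and retiming values are assumptions for illustration.

```python
def retime(edges, r):
    """Apply retiming values r to a DFG.

    edges: {(u, v): delays}; r: {node: integer retiming value}.
    Each edge u -> v gets w + r[v] - r[u] delays.
    """
    out = {}
    for (u, v), w in edges.items():
        w_r = w + r[v] - r[u]
        if w_r < 0:
            raise ValueError(f"edge {u}->{v} would get {w_r} delays")
        out[(u, v)] = w_r
    return out


# Hypothetical loop A -> B -> C -> A with both delays on the last edge.
edges = {("A", "B"): 0, ("B", "C"): 0, ("C", "A"): 2}
retimed = retime(edges, {"A": 0, "B": 1, "C": 1})
```

Around any loop the r terms cancel, so the sum of delays is the same before and after.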

Lecture 4 10
Effect of Retiming

[Figure: a DFG with nodes p, q, r, s, t, u and computation times in parentheses, shown before and after retiming.]

Critical path (clock period) before retiming: 6
Critical path (clock period) after retiming: 4

Lecture 4 11
Clock Skew and Useful Skew
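[Figure: three flip-flops in a chain with clock arrival times t1, t2, t3. The logic delay between the first and second flip-flop is 9, and between the second and third it is 5.]

Zero skew: t1 = t2 = t3, clock period = 9

[Figure: the same circuit with a skewed clock at the middle flip-flop.]

Useful skew: t1 = t3, t2 = t3 + 2, clock period = 7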
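
The two clock periods on this slide follow from requiring t_i + delay ≤ t_j + T on every flip-flop-to-flip-flop path. A minimal sketch, ignoring setup and hold times (an assumption) and using a hypothetical function name:

```python
def min_period(t, paths):
    """Smallest T such that t[i] + d <= t[j] + T for every path (i, j, d)."""
    return max(t[i] + d - t[j] for i, j, d in paths)


# Flip-flops 0, 1, 2 with logic delays 9 and 5 between them.
paths = [(0, 1, 9), (1, 2, 5)]

zero_skew = min_period([0, 0, 0], paths)    # t1 = t2 = t3
useful_skew = min_period([0, 2, 0], paths)  # t1 = t3, t2 = t3 + 2
```

Delaying the middle clock by 2 lends time to the slow stage and takes it from the fast stage, so both stages end up needing a period of 7.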


Lecture 4 12
Clock Skew Optimization
T: clock period
$t_i$: clock signal arrival time at flip-flop i
$D_{ij}$: combinational circuit delay between flip-flops i and j

Minimize T
s.t. $t_i + D_{ij,\max} \le t_j + T - t_{\text{setup}}$
     $t_i + D_{ij,\min} \ge t_j + t_{\text{hold}}$
for all flip-flop pairs (i, j) with combinational logic in between
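
One way to solve this program without an LP solver is sketched here. It is a swapped-in technique, not the lecture's method: for a fixed T every constraint is a difference constraint, so feasibility can be tested with Bellman-Ford, and T is then found by bisection. The function names, the `tied` argument (pairs of flip-flops forced to share an arrival time, as in t1 = t3 on the previous slide), and the tolerances are all assumptions.

```python
def feasible_arrivals(T, n, paths, tied=(), t_setup=0.0, t_hold=0.0):
    """Return arrival times satisfying all constraints for period T, or None.

    paths: list of (i, j, d_min, d_max). Each constraint (a, b, c) below
    encodes t[a] - t[b] <= c and is relaxed Bellman-Ford style.
    """
    cons = []
    for i, j, d_min, d_max in paths:
        cons.append((i, j, T - t_setup - d_max))  # setup constraint
        cons.append((j, i, d_min - t_hold))       # hold constraint
    for a, b in tied:
        cons += [(a, b, 0.0), (b, a, 0.0)]        # t[a] == t[b]
    t = [0.0] * n
    for _ in range(n):
        changed = False
        for a, b, c in cons:
            if t[b] + c < t[a] - 1e-9:
                t[a] = t[b] + c
                changed = True
        if not changed:
            return t
    return None  # still relaxing after n passes: negative cycle, infeasible


def min_clock_period(n, paths, tied=(), t_setup=0.0, t_hold=0.0, iters=60):
    """Bisection on T between 0 and the zero-skew period."""
    lo = 0.0
    hi = max(d_max for _, _, _, d_max in paths) + t_setup
    if feasible_arrivals(hi, n, paths, tied, t_setup, t_hold) is None:
        raise ValueError("zero-skew upper bound is infeasible (hold violation?)")
    for _ in range(iters):
        mid = (lo + hi) / 2
        if feasible_arrivals(mid, n, paths, tied, t_setup, t_hold) is None:
            lo = mid
        else:
            hi = mid
    return hi, feasible_arrivals(hi, n, paths, tied, t_setup, t_hold)


# Three flip-flops, delays 9 and 5 (min = max), first and last clocks tied.
T_opt, arrivals = min_clock_period(3, [(0, 1, 9, 9), (1, 2, 5, 5)], tied=[(0, 2)])
```

On the three-flip-flop example this converges to a period of about 7 with the middle clock about 2 later than the other two. Without the tie between the first and last flip-flop the chain has no cycle and the period is not bounded below by these constraints alone.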

Lecture 4 13
Sample Period Reduction
[Figure: the DFG with nodes p, q, r, s, t, u and computation times in parentheses, next to its 2-unfolded version with nodes p0, q0, r0, s0, t0, u0 and p1, q1, r1, s1, t1, u1.]
Lecture 4 14
Fractional Iteration Bound
[Figure: a loop of four nodes s, t, u, v, each with computation time (1), next to its 3-unfolded version with nodes s0, s1, s2, t0, t1, t2, u0, u1, u2, v0, v1, v2.]

Lecture 4 15
Overlapping Scheduling
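Iteration bound: 3.5

[Figure: a DFG loop with nodes A (1), B (2), C (4), its 2-unfolded version with nodes A0, B0, C0, A1, B1, C1, and the corresponding precedence graphs. The processor schedules over time follow.]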
P1: B C B0 C0 A1
P2: A A0 B1 C1
Time
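
The iteration bound of 3.5 can be reproduced with exact fractions. This sketch uses the standard definition (maximum over loops of loop computation time divided by loop delay count), which the slide does not spell out, and assumes the single loop A, B, C contains two delays, as the figure residue suggests.

```python
from fractions import Fraction


def iteration_bound(loops):
    """loops: list of (node_times, delay_count). Returns max of t_loop / w_loop."""
    return max(Fraction(sum(times), delays) for times, delays in loops)


# Loop A (1) -> B (2) -> C (4) with an assumed total of 2 delays.
bound = iteration_bound([([1, 2, 4], 2)])

# Unfolding by the denominator makes the per-iteration bound an integer:
# J * bound time units for J samples.
J = bound.denominator
unfolded_bound = J * bound
```

With J = 2 the unfolded DFG needs 7 time units per iteration and processes 2 samples in that time, which is how a fractional bound can be reached with integer schedules.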
Lecture 4 16
Word-Level Parallel Processing
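[Figure: a 3-tap filter with input x(n), output y(n), multipliers by a, b, c, two adders, and delays 2D and 4D, with nodes labeled X, A, B, C, D, E. Next to it is the 3-unfolded version with inputs x(3k), x(3k+1), x(3k+2), outputs y(3k), y(3k+1), y(3k+2), and node copies X0 to X2, A0 to A2, B0 to B2, C0 to C2, D0 to D2, E0 to E2, whose edges carry delays of D and 2D.]
Lecture 4 17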
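
A sketch of word-level (3-parallel) processing for a filter of this shape. The exact tap positions cannot be read from the figure residue, so the code assumes y(n) = a·x(n) + b·x(n-4) + c·x(n-6), which is one reading of the 2D and 4D delays. The function names are illustrative.

```python
def fir_serial(x, a, b, c):
    """One output per step: y(n) = a*x(n) + b*x(n-4) + c*x(n-6) (assumed taps)."""
    def at(n):
        return x[n] if n >= 0 else 0
    return [a * at(n) + b * at(n - 4) + c * at(n - 6) for n in range(len(x))]


def fir_3_parallel(x, a, b, c):
    """Three outputs per iteration k; len(x) must be a multiple of 3."""
    x0, x1, x2 = x[0::3], x[1::3], x[2::3]  # x(3k), x(3k+1), x(3k+2)

    def at(stream, k):
        return stream[k] if k >= 0 else 0

    y = []
    for k in range(len(x) // 3):
        # x(3k-4) = x(3(k-2)+2),  x(3k-6) = x(3(k-2))
        y0 = a * x0[k] + b * at(x2, k - 2) + c * at(x0, k - 2)
        # x(3k-3) = x(3(k-1)),    x(3k-5) = x(3(k-2)+1)
        y1 = a * x1[k] + b * at(x0, k - 1) + c * at(x1, k - 2)
        # x(3k-2) = x(3(k-1)+1),  x(3k-4) = x(3(k-2)+2)
        y2 = a * x2[k] + b * at(x1, k - 1) + c * at(x2, k - 2)
        y += [y0, y1, y2]
    return y
```

Because the filter has no feedback loop, the three outputs of an iteration do not depend on each other and can be computed at the same time.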
Parallelism Levels

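[Figure: three ways to carry the 6-bit words a5...a0 and b5...b0.]
Bit-parallel: a5, a4, a3, a2, a1, a0 (and b5 to b0) on six wires each, all in one cycle.
Digit-serial, digit size 2: two wires per word, one carrying a4, a2, a0 and the other a5, a3, a1 (likewise for b).
Bit-serial: one wire per word carrying a5, a4, a3, a2, a1, a0, one bit per cycle (likewise for b).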
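
The three levels differ only in how many bits of a word are processed per clock cycle. A small illustrative helper (hypothetical name, least significant bit first by assumption) makes the grouping explicit:

```python
def schedule(bits, digit_size):
    """Group a word's bits (LSB first) into the digits sent in each clock cycle."""
    return [bits[i:i + digit_size] for i in range(0, len(bits), digit_size)]


word = ["a0", "a1", "a2", "a3", "a4", "a5"]
bit_serial = schedule(word, 1)     # six cycles, one bit each
digit_serial = schedule(word, 2)   # three cycles, two bits each
bit_parallel = schedule(word, 6)   # one cycle, all six bits
```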

Lecture 4 18
Bit-Serial Adder

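[Figure: a bit-serial adder. The operand bits a3 a2 a1 a0 and b3 b2 b1 b0 enter serially and the sum bits s3 s2 s1 s0 leave serially. The carry is fed back through a delay D. A switch at the carry input selects the constant 0 at time 4p+0 and the delayed carry at times 4p+1, 2, 3.]

p is the word index
0, 1, 2, 3 are the bit indices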
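
A behavioural sketch of this adder, assuming bits arrive least significant bit first and words are packed back to back. The variable `carry` plays the role of the delay D, and the test `n % word_len == 0` plays the role of the switch at 4p+0. The function name is illustrative.

```python
def bit_serial_add(a_bits, b_bits, word_len=4):
    """Add two LSB-first bit streams one bit per clock; returns the sum stream."""
    carry = 0  # contents of the carry delay D
    out = []
    for n, (a, b) in enumerate(zip(a_bits, b_bits)):
        # Switch: at time 4p+0 the carry input is 0, otherwise the stored carry.
        cin = 0 if n % word_len == 0 else carry
        total = a + b + cin
        out.append(total & 1)
        carry = total >> 1
    return out


# Two consecutive 4-bit words: 5 + 6, then 15 + 1 (which overflows to 0).
a_stream = [1, 0, 1, 0, 1, 1, 1, 1]
b_stream = [0, 1, 1, 0, 1, 0, 0, 0]
s_stream = bit_serial_add(a_stream, b_stream)
```

Clearing the carry at the first bit of each word keeps the overflow of one word from leaking into the next.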

Lecture 4 19
Unfolding Switches
[Figure: an edge u → v containing a switch that closes at time instances Wp+q.]
p is the word index
q is the bit index

Lecture 4 20
Example of Unfolding Switch

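[Figure: an edge u → v with a switch at time instances 12p+1, 7, 9, 11, and its 3-unfolded version.]

12p + 1 = 3(4p + 0) + 1
12p + 7 = 3(4p + 2) + 1
12p + 9 = 3(4p + 3) + 0
12p + 11 = 3(4p + 3) + 2

Unfolded switches:
u0 → v0 at 4p+3
u1 → v1 at 4p+0, 2
u2 → v2 at 4p+3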
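
The decomposition above writes each switching instance Wp+q as J((W/J)p + ⌊q/J⌋) + (q mod J): the remainder picks the unfolded copy and the quotient gives the new instance. A minimal sketch, assuming J divides W and using a hypothetical function name:

```python
def unfold_switch(W, J, instances):
    """Unfold a switch at Wp+q (q in instances) by J.

    Returns {copy index: [q', ...]}, meaning that copy switches at (W/J)p + q'.
    """
    if W % J:
        raise ValueError("this sketch assumes J divides W")
    out = {i: [] for i in range(J)}
    for q in instances:
        out[q % J].append(q // J)
    return out


unfolded = unfold_switch(12, 3, [1, 7, 9, 11])
```

For the slide's example this gives copy 0 at 4p+3, copy 1 at 4p+0 and 4p+2, and copy 2 at 4p+3.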

Lecture 4 21
Dummy Node
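[Figure: a DFG with nodes A, B, C in which the branch from A, which carries a 2D delay, is switched at 6p+1, 5 and the branch from B is switched at 6p+0, 2, 3, 4. The same DFG is redrawn with a dummy node D inserted. Below, the 3-unfolded DFGs with nodes A0 to A2, B0 to B2, C0 to C2, D0 to D2: copy 0 takes B0 at 2p+0, 1; copy 1 takes A2 at 2p+0 and B1 at 2p+1; copy 2 takes A0 at 2p+1 and B2 at 2p+0.]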
Lecture 4 22
2-Unfolding Bit-Serial Adder

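[Figure: the bit-serial adder (inputs a3 a2 a1 a0 and b3 b2 b1 b0, output s3 s2 s1 s0, carry delay D, carry switch selecting 0 at 4p+0 and the fed-back carry at 4p+1, 2, 3) redrawn as a DFG with labeled nodes A, B, S, D, X, Z. Below it is the 2-unfolded DFG with nodes A0, B0, S0, D0, X0, Z0 and A1, B1, S1, D1, X1, Z1, one delay D, and switches at 2p+0 and 2p+1.]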
Lecture 4 23
Unfolding Bit-Serial Adder
(J=2)
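[Figure: the 4-bit bit-serial adder (carry switch: 0 at 4p+0, fed-back carry at 4p+1, 2, 3) and its 2-unfolded version. Two adders operate in each cycle: one receives a0, b0 and then a2, b2 and produces s0, s2; the other receives a1, b1 and then a3, b3 and produces s1, s3. The carry passes from the first adder to the second, and the carry out of the second goes through a delay D. Switches at 2p+0 and 2p+1 select between the constant 0 and the delayed carry.]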

Lecture 4 24
Unfolding Bit-Serial Adder
(J=4)
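[Figure: the 4-bit bit-serial adder (carry switch: 0 at 4p+0, fed-back carry at 4p+1, 2, 3) and its 4-unfolded version. Four adders in a chain receive a0, b0 through a3, b3 in the same cycle, with carry input 0 and a carry out, and produce s0, s1, s2, s3.]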
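
The J = 2 and J = 4 cases can be modelled with one function that processes J bits per clock. This is a behavioural sketch under stated assumptions (bits least significant first, word length a multiple of J); the function name is illustrative. The inner loop stands for the J adders rippling within one cycle, and `carry_reg` stands for the remaining carry delay.

```python
def unfolded_add(a_bits, b_bits, J, word_len=4):
    """Add two LSB-first bit streams J bits per clock cycle."""
    if word_len % J:
        raise ValueError("word_len must be a multiple of J")
    carry_reg = 0  # carry delay between clock cycles
    out = []
    for cycle in range(len(a_bits) // J):
        # Switch: the first cycle of each word starts with carry 0.
        carry = 0 if (cycle * J) % word_len == 0 else carry_reg
        for i in range(J):  # J adders chained within the same cycle
            n = cycle * J + i
            total = a_bits[n] + b_bits[n] + carry
            out.append(total & 1)
            carry = total >> 1
        carry_reg = carry
    return out


a_stream = [1, 0, 1, 0, 1, 1, 1, 1]   # words 5 and 15, LSB first
b_stream = [0, 1, 1, 0, 1, 0, 0, 0]   # words 6 and 1, LSB first
results = {J: unfolded_add(a_stream, b_stream, J) for J in (1, 2, 4)}
```

With J = 1 this reduces to the bit-serial adder, with J = 2 to a digit-serial adder of digit size 2, and with J = 4 to a bit-parallel ripple-carry adder whose stored carry is never used.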
Lecture 4 25
