
Unfolding Techniques in DSP

The document discusses unfolding as a technique for parallel processing and sample period reduction in data flow graphs (DFGs) of digital signal processing programs. It defines unfolding as duplicating nodes in a DFG to allow parallel execution. The key points are: 1) Unfolding a DFG by a factor of J creates J copies of each node, with edges connected so that precedence is preserved. 2) Unfolding can reduce the sample period to the iteration bound. It is used when a node computation time exceeds the iteration bound or when the iteration bound is not an integer. 3) Unfolding preserves the number of delays in the original DFG and scales the iteration bound by J, so the iteration bound per sample is unchanged.

Uploaded by

sushant sahoo

Chapter 5: Unfolding

Keshab K. Parhi
• Unfolding ≡ Parallel Processing
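[Figure: a two-node loop A → B → A with two delays (2D) and its 2-unfolded version.]
Original DFG: 2 nodes & 2 edges; computation times A (1) and B (1); the loop carries 2D.
T∞ = (1+1)/2 = 1 u.t.
Iterations: A_0→B_0 ⇒ A_2→B_2 ⇒ A_4→B_4 ⇒ … and A_1→B_1 ⇒ A_3→B_3 ⇒ A_5→B_5 ⇒ …
2-unfolded DFG: 4 nodes & 4 edges; loop A_0 → B_0 → A_0 with one delay (D) handles iterations 0, 2, 4, …; loop A_1 → B_1 → A_1 with one delay (D) handles iterations 1, 3, 5, …
T’∞ = 2 u.t. for each loop of the unfolded DFG; two samples are processed per iteration, so T∞ = 2/2 = 1 u.t. per sample.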

• In a J-unfolded system each delay is J-slow: if the input to a delay element
is the signal x(kJ + m), the output is x((k−1)J + m) = x(kJ + m − J).

• Algorithm for unfolding:
➢ For each node U in the original DFG, draw the J nodes U_0, U_1, U_2, …, U_(J−1).
➢ For each edge U → V with w delays in the original DFG, draw the J edges U_i → V_((i+w)%J) with ⌊(i+w)/J⌋ delays for i = 0, 1, …, J−1.
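Example (J = 4): an edge U → V with w = 37 delays (37D).
⌊(i+w)/4⌋ = 9 for i = 0, 1, 2 and = 10 for i = 3, so the unfolded edges are:
U_0 → V_1 with 9D
U_1 → V_2 with 9D
U_2 → V_3 with 9D
U_3 → V_0 with 10D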

➢ For w < J, unfolding an edge with w delays in the original DFG produces J − w edges with no delays and w edges with 1 delay in the J-unfolded DFG.
➢ Unfolding preserves the precedence constraints of a DSP program.
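The edge rule above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `unfold` and the `(source, destination, delays)` tuple layout are assumptions made for this example, and the delay count (i+w)/J is read as floor division.

```python
def unfold(edges, J):
    """Unfold a DFG given as (source, destination, delays) edges by a factor J.

    Hypothetical helper for illustration. Each unfolded node is a
    (name, copy_index) pair; each unfolded edge is (source, destination, delays).
    """
    unfolded = []
    for u, v, w in edges:
        for i in range(J):
            # Edge U_i -> V_((i + w) % J) carries floor((i + w) / J) delays.
            unfolded.append(((u, i), (v, (i + w) % J), (i + w) // J))
    return unfolded


# The 37-delay edge from the slide, unfolded by J = 4.
example = unfold([("U", "V", 37)], 4)
for (src, i), (dst, j), d in example:
    print(f"{src}{i} -> {dst}{j} with {d}D")
```

For w < J, the same function yields J − w edges with no delays and w edges with one delay, as stated above.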
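[Figure: a DFG with nodes U, V, and T (edge delay labels in the figure include D, 2D, 5D, and 6D) and its 3-unfolded DFG with nodes U_0, U_1, U_2, V_0, V_1, V_2, T_0, T_1, T_2.]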
• Properties of unfolding:
➢ Unfolding preserves the number of delays in a DFG. This can be stated as follows:
⌊w/J⌋ + ⌊(w+1)/J⌋ + … + ⌊(w + J − 1)/J⌋ = w
➢ J-unfolding of a loop l with w_l delays in the original DFG leads to gcd(w_l, J) loops in the unfolded DFG, and each of these gcd(w_l, J) loops contains w_l/gcd(w_l, J) delays and J/gcd(w_l, J) copies of each node that appears in l.
➢ Unfolding a DFG with iteration bound T∞ results in a J-unfolded DFG with iteration bound JT∞.
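These properties can be checked numerically. The sketch below reads each term (w+i)/J as floor division and assumes a loop is described only by its total computation time t_l and delay count w_l ≥ 1; the function names are hypothetical and chosen for this example.

```python
from fractions import Fraction
from math import gcd


def unfolded_edge_delays(w, J):
    """Delays on the J unfolded copies of an edge that has w delays."""
    return [(i + w) // J for i in range(J)]


def unfolded_loop_structure(w_l, J):
    """Return (number of loops, delays per loop, copies of each node per loop)."""
    g = gcd(w_l, J)
    return g, w_l // g, J // g


def unfolded_loop_bound(t_l, w_l, J):
    """Loop bound of each unfolded loop: (copies * t_l) / (delays per loop)."""
    _, delays, copies = unfolded_loop_structure(w_l, J)
    return Fraction(t_l * copies, delays)


# Delay preservation for the 37-delay edge unfolded by 4.
print(sum(unfolded_edge_delays(37, 4)))
# The two-node loop with 2 delays, unfolded by 2.
print(unfolded_loop_structure(2, 2))
# Its loop bound grows from 1 to J * 1 = 2.
print(unfolded_loop_bound(2, 2, 2))
```

Because every unfolded loop has bound J·t_l/w_l, the maximum over all loops (the iteration bound) is scaled by J as well.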

• Applications of Unfolding
➢ Sample Period Reduction
➢ Parallel Processing
• Sample Period Reduction
➢ Case 1: A node in the DFG has computation time greater than T∞.
➢ Case 2: The iteration bound is not an integer.
➢ Case 3: The longest node computation time is larger than the iteration bound T∞, and T∞ is not an integer.

• Case 1:
➢ The original DFG cannot have sample period equal to the iteration bound because a node computation time is more than the iteration bound.
➢ If the computation time of a node U, t_u, is greater than the iteration bound T∞, then ⌈t_u/T∞⌉-unfolding should be used.
➢ In the example, t_u = 4 and T∞ = 3, so ⌈4/3⌉-unfolding, i.e., 2-unfolding, is used.

• Case 2:
➢ The original DFG cannot have sample period equal to the iteration bound because the iteration bound is not an integer.
➢ If a critical loop bound is of the form t_l/w_l, where t_l and w_l are mutually co-prime, then w_l-unfolding should be used.
➢ In the example, t_l = 60 and w_l = 45, so t_l/w_l is reduced to 4/3 and 3-unfolding is used.
• Case 3: In this case the minimum unfolding factor that allows the iteration period to equal the iteration bound is the minimum value of J such that JT∞ is an integer and is greater than or equal to the longest node computation time.
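The three cases can be folded into one small calculation. The sketch below is an illustration under stated assumptions: the function name `min_unfolding_factor` is hypothetical, the iteration bound is supplied as the critical loop's t_l and w_l, and the condition used is that J·T∞ is an integer at least as large as the longest node computation time, which agrees with the ⌈t_u/T∞⌉ rule of Case 1.

```python
from fractions import Fraction


def min_unfolding_factor(t_l, w_l, t_max):
    """Smallest J such that J * (t_l / w_l) is an integer >= t_max.

    t_l, w_l: computation time and delay count of the critical loop.
    t_max:    longest node computation time (integer).
    """
    T_inf = Fraction(t_l, w_l)          # reduced automatically, e.g. 60/45 -> 4/3
    n, d = T_inf.numerator, T_inf.denominator
    # J * n / d is an integer only when J is a multiple of d (n and d are co-prime),
    # so write J = d * k; then J * T_inf = k * n, and we need k * n >= t_max.
    k = max(1, -(-t_max // n))          # integer ceiling of t_max / n
    return d * k


# Case 1 from the slides: t_u = 4, T_inf = 3.
print(min_unfolding_factor(3, 1, 4))
# Case 2 from the slides: t_l = 60, w_l = 45 (longest node time of 4 is assumed here).
print(min_unfolding_factor(60, 45, 4))
# Case 3 with assumed values: T_inf = 4/3 and a longest node time of 6.
print(min_unfolding_factor(4, 3, 6))
```

Writing J as a multiple of the reduced denominator keeps the search to a single ceiling computation instead of a loop over candidate values of J.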
• Parallel Processing:
➢ Word-Level Parallel Processing
➢ Bit-Level Parallel Processing
❖ Bit-serial processing
❖ Bit-parallel processing
❖ Digit-serial processing

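• Bit-Level Parallel Processing
[Figure: a 4-bit word a3 a2 a1 a0 mapped to b3 b2 b1 b0 in three styles. Bit-parallel: a0, a1, a2, a3 on four wires at once. Bit-serial: a3 a2 a1 a0 on one wire, one bit per cycle. Digit-serial (digit-size = 2): a2 a0 on one wire and a3 a1 on a second wire.]
[Figure: bit-serial adder with inputs a3 a2 a1 a0 and b3 b2 b1 b0 and output s3 s2 s1 s0; the carry is fed back through a delay D and a switch that selects 0 at time 4l+0 and the delayed carry at times 4l+1,2,3.]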
• The following assumptions are made when unfolding an edge U → V that contains a switch:
➢ The wordlength W is a multiple of the unfolding factor J, i.e., W = W’J.
➢ All edges into and out of the switch have no delays.
• With the above two assumptions an edge U → V can be unfolded as follows:
➢ Write the switching instance as
Wl + u = J(W’l + ⌊u/J⌋) + (u%J)
➢ Draw an edge with no delays in the unfolded graph from the node U_(u%J) to the node V_(u%J), which is switched at time instance W’l + ⌊u/J⌋.

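Example: unfolding by 3
[Figure: an edge U → V switched at 12l + 1, 7, 9, 11 and its 3-unfolded version, in which U_0 → V_0 is switched at 4l + 3, U_1 → V_1 is switched at 4l + 0, 2, and U_2 → V_2 is switched at 4l + 3.]

To unfold the DFG by J = 3, the switching instances are written as follows: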


12l + 1 = 3(4l + 0) + 1
12l + 7 = 3(4l + 2) + 1
12l + 9 = 3(4l + 3) + 0
12l + 11 = 3(4l + 3) + 2
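The decomposition of switching instances shown above can be reproduced with integer division and remainder. This is a minimal sketch, assuming the switch is described by its period W and a list of offsets u; the name `unfold_switch` and the returned layout are hypothetical.

```python
def unfold_switch(W, offsets, J):
    """Unfold a switch active at times W*l + u (u in offsets) by a factor J.

    Assumes W is a multiple of J and that the switched edge has no delays.
    Returns (W', mapping), where mapping[k] lists the offsets v such that the
    edge U_k -> V_k is switched at time W'*l + v.
    """
    if W % J != 0:
        raise ValueError("W must be a multiple of J")
    W_prime = W // J
    mapping = {}
    for u in offsets:
        # W*l + u = J*(W'*l + u // J) + (u % J)
        mapping.setdefault(u % J, []).append(u // J)
    return W_prime, {k: sorted(v) for k, v in mapping.items()}


# The example above: switching instances 12l + 1, 7, 9, 11 unfolded by J = 3.
W_prime, mapping = unfold_switch(12, [1, 7, 9, 11], 3)
for k in sorted(mapping):
    print(f"U{k} -> V{k} switched at {W_prime}l + {mapping[k]}")
```

Grouping by u % J collects all instances that land on the same unfolded edge, which is why U_1 → V_1 receives two switching instances in this example.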

• Unfolding a DFG containing an edge having a switch and a
positive number of delays is done by introducing a dummy
node.
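[Figure: a DFG with nodes A, B, and C in which the edge from A to C carries 2D and is switched at 6l + 1, 5, and the edge from B to C is switched at 6l + 0, 2, 3, 4. A dummy node D is inserted after the 2D delay so that the delays sit on the edge A → D and the switched edge D → C has no delays.]
[Figure: the 3-unfolded DFG with nodes A_0, A_1, A_2, B_0, B_1, B_2, C_0, C_1, C_2, D_0, D_1, D_2; the unfolded switches operate at time instances 2l + 0 and 2l + 1.]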
• If the word-length, W, is not a multiple of the unfolding factor, J, then expand the switching instances with periodicity lcm(W, J).
• Example: Consider W = 4, J = 3. Then lcm(4, 3) = 12. For this case, 4l = 12l + {0, 4, 8}, 4l+1 = 12l + {1, 5, 9}, 4l+2 = 12l + {2, 6, 10}, 4l+3 = 12l + {3, 7, 11}. The new switching period, 12, is a multiple of J = 3, so the unfolding procedure for switches applies.
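The expansion step can be written out as follows. This is an illustrative sketch; the function name `expand_switching_instances` is an assumption, and the lcm is computed from `math.gcd` so that the code does not depend on a recent Python version.

```python
from math import gcd


def expand_switching_instances(W, offsets, J):
    """Rewrite switching instances W*l + u with the period lcm(W, J).

    Returns (P, expanded), where P = lcm(W, J) and expanded lists every
    offset u' such that P*l + u' covers the same time instances.
    """
    P = W * J // gcd(W, J)
    # Each original offset u repeats every W cycles, i.e. P // W times per new period.
    expanded = sorted(u + W * k for u in offsets for k in range(P // W))
    return P, expanded


# The example above: W = 4, J = 3.
for u in range(4):
    P, expanded = expand_switching_instances(4, [u], 3)
    print(f"4l + {u} = {P}l + {expanded}")
```

After this rewrite the period P is a multiple of J, so each expanded offset can be split into a quotient and a remainder modulo J in the same way as for a switch whose word-length is already a multiple of J.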

