0020.matrix Multiplication Systolic
[Figure: M connected to a single PE, versus M connected to a chain of three PEs]
EECC756 - Shaaban
#1 lec # 1 Spring 2003 3-11-2003
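Systolic Array Example:
3x3 Systolic Array Matrix Multiplication
• Processors arranged in a 2-D grid
• Each processor accumulates one element of the product

Alignments in time
Rows of A enter from the left and columns of B enter from the top. Each row or column is delayed one step relative to the one before it. PE(i,j) is the processor in row i, column j.

Columns of B (b0,0 enters first):
              b2,2
        b2,1  b1,2
  b2,0  b1,1  b0,2
  b1,0  b0,1
  b0,0

T=2
  PE(0,0): a0,0*b0,0 + a0,1*b1,0
  PE(0,1): a0,0*b0,1
  PE(1,0): a1,0*b0,0
  Waiting to enter, rows of A: a0,2; a1,1 a1,2; a2,0 a2,1 a2,2
  Waiting to enter, columns of B: b2,0; b1,1 b2,1; b0,2 b1,2 b2,2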
Example source: http://www.cs.hmc.edu/courses/2001/spring/cs156/
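The snapshots follow a regular pattern that can be sketched in code. Reading the partial sums off the slides, PE(i,j) appears to add the product a(i,k)*b(k,j) at step T = i + j + k + 1. That rule is inferred from the snapshots, not stated in the source, and the function names below are hypothetical.

```python
def accumulated_terms(i, j, t, n=3):
    """Return the products summed in PE(i, j) after step t (first step is T=1)."""
    # Assumed schedule, inferred from the slide snapshots:
    # PE(i, j) multiplies a[i][k] by b[k][j] at step T = i + j + k + 1.
    last_k = min(n - 1, t - 1 - i - j)
    return [f"a{i},{k}*b{k},{j}" for k in range(last_k + 1)]


def snapshot(t, n=3):
    """Map every PE that has started working by step t to its partial sum."""
    state = {}
    for i in range(n):
        for j in range(n):
            terms = accumulated_terms(i, j, t, n)
            if terms:
                state[(i, j)] = " + ".join(terms)
    return state


if __name__ == "__main__":
    # Print the array contents at T=2, one PE per line.
    for (i, j), partial_sum in snapshot(2).items():
        print(f"PE({i},{j}): {partial_sum}")
```

Calling snapshot(t) for t = 2 through 7 gives one dictionary per time step, which can be compared against the partial sums listed for each T.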
T=3
  PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
  PE(0,1): a0,0*b0,1 + a0,1*b1,1
  PE(0,2): a0,0*b0,2
  PE(1,0): a1,0*b0,0 + a1,1*b1,0
  PE(1,1): a1,0*b0,1
  PE(2,0): a2,0*b0,0
  Waiting to enter, rows of A: a1,2; a2,1 a2,2
  Waiting to enter, columns of B: b2,1; b1,2 b2,2
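The timing behind these snapshots can be written as a short worked equation. It assumes the schedule implied by the partial sums (inferred, not stated on the slides), with rows and columns indexed from 0; the symbols T_mac and T_done are introduced here for illustration.

```latex
\begin{aligned}
c_{i,j} &= \sum_{k=0}^{2} a_{i,k}\, b_{k,j} \\
T_{\text{mac}}(i,j,k) &= i + j + k + 1 \\
T_{\text{done}}(i,j) &= T_{\text{mac}}(i,j,2) = i + j + 3 \\
T_{\text{done}}(0,0) &= 3, \qquad T_{\text{done}}(2,2) = 7
\end{aligned}
```

This agrees with the walkthrough: the top-left sum is complete at T=3 and the bottom-right sum at T=7.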
T=4
  PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
  PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
  PE(0,2): a0,0*b0,2 + a0,1*b1,2
  PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
  PE(1,1): a1,0*b0,1 + a1,1*b1,1
  PE(1,2): a1,0*b0,2
  PE(2,0): a2,0*b0,0 + a2,1*b1,0
  PE(2,1): a2,0*b0,1
  Waiting to enter, rows of A: a2,2
  Waiting to enter, columns of B: b2,2
T=5
  PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
  PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
  PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
  PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
  PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
  PE(1,2): a1,0*b0,2 + a1,1*b1,2
  PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
  PE(2,1): a2,0*b0,1 + a2,1*b1,1
  PE(2,2): a2,0*b0,2
T=6
  PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
  PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
  PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
  PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
  PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
  PE(1,2): a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2
  PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
  PE(2,1): a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1
  PE(2,2): a2,0*b0,2 + a2,1*b1,2
T=7
  PE(0,0): a0,0*b0,0 + a0,1*b1,0 + a0,2*b2,0
  PE(0,1): a0,0*b0,1 + a0,1*b1,1 + a0,2*b2,1
  PE(0,2): a0,0*b0,2 + a0,1*b1,2 + a0,2*b2,2
  PE(1,0): a1,0*b0,0 + a1,1*b1,0 + a1,2*b2,0
  PE(1,1): a1,0*b0,1 + a1,1*b1,1 + a1,2*b2,1
  PE(1,2): a1,0*b0,2 + a1,1*b1,2 + a1,2*b2,2
  PE(2,0): a2,0*b0,0 + a2,1*b1,0 + a2,2*b2,0
  PE(2,1): a2,0*b0,1 + a2,1*b1,1 + a2,2*b2,1
  PE(2,2): a2,0*b0,2 + a2,1*b1,2 + a2,2*b2,2
Done
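A cycle-by-cycle simulation is one way to check the walkthrough. The sketch below assumes that A operands move one PE to the right and B operands one PE down on every step, with row i of A and column j of B delayed by i and j steps, as the snapshots suggest. The names systolic_matmul, a_reg and b_reg are illustrative and not from the source; the 3n - 2 step count is the general form of the 7 steps seen here for n = 3.

```python
def systolic_matmul(a, b):
    """Multiply two n x n matrices on a simulated n x n systolic array.

    Returns the product and the number of time steps used.
    """
    n = len(a)
    acc = [[0] * n for _ in range(n)]       # one accumulator per PE
    a_reg = [[None] * n for _ in range(n)]  # A operand held by each PE
    b_reg = [[None] * n for _ in range(n)]  # B operand held by each PE
    steps = 3 * n - 2
    for t in range(1, steps + 1):
        # Shift: A operands move one PE right, B operands one PE down.
        for i in range(n):
            for j in range(n - 1, 0, -1):
                a_reg[i][j] = a_reg[i][j - 1]
        for j in range(n):
            for i in range(n - 1, 0, -1):
                b_reg[i][j] = b_reg[i - 1][j]
        # Inject skewed inputs at the left and top edges (None = no data yet).
        for i in range(n):
            k = t - 1 - i
            a_reg[i][0] = a[i][k] if 0 <= k < n else None
        for j in range(n):
            k = t - 1 - j
            b_reg[0][j] = b[k][j] if 0 <= k < n else None
        # Multiply-accumulate in every PE that holds both operands.
        for i in range(n):
            for j in range(n):
                if a_reg[i][j] is not None and b_reg[i][j] is not None:
                    acc[i][j] += a_reg[i][j] * b_reg[i][j]
    return acc, steps


def reference_matmul(a, b):
    """Plain triple-loop product, used only to check the simulation."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


if __name__ == "__main__":
    A = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    B = [[9, 8, 7], [6, 5, 4], [3, 2, 1]]
    product, steps = systolic_matmul(A, B)
    assert product == reference_matmul(A, B)
    print(steps, product)
```

Using None rather than 0 for an empty register keeps a genuine zero operand distinct from "no data yet".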