Assignment 2
$x \cdot y = \sum_{i=0} x_i y_i$
If we allow a vector to pair with itself, then there are N(N+1)/2 such pairs in total for N vectors.
Assuming N is equal to 5 for example, we then have the following 15 vector pairs:
(0, 0) (0, 1) (0, 2) (0, 3) (0, 4)
(1, 1) (1, 2) (1, 3) (1, 4)
(2, 2) (2, 3) (2, 4)
(3, 3) (3, 4)
(4, 4)
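The pair count can be checked with a short sequential sketch that enumerates every pair (i, j) with i <= j in row order. The function names `pair_count` and `enumerate_pairs` are hypothetical and chosen only for illustration. The sketch shows the enumeration order, not how pairs should be distributed among processes or threads.

```c
#include <assert.h>
#include <stddef.h>

/* Number of pairs (i, j) with 0 <= i <= j < n, self-pairs included. */
size_t pair_count(size_t n)
{
    return n * (n + 1) / 2;
}

/*
 * Writes every pair (i, j) with i <= j into out_i[k], out_j[k] in row order:
 * (0,0), (0,1), ..., (0,n-1), (1,1), ..., (n-1,n-1).
 * The caller must provide room for pair_count(n) entries in each array.
 * Returns the number of pairs written.
 */
size_t enumerate_pairs(size_t n, size_t *out_i, size_t *out_j)
{
    assert(n == 0 || (out_i != NULL && out_j != NULL));
    size_t k = 0;
    for (size_t i = 0; i < n; i++) {
        for (size_t j = i; j < n; j++) {
            out_i[k] = i;
            out_j[k] = j;
            k++;
        }
    }
    return k;
}
```

Note that row i contributes N - i pairs, so the rows are of unequal length. This matters when the pairs are divided among workers.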
In the above, (i, j) denotes the pair of sequences i and j. The dot product computation for
pair (i, j) produces a single value as the output. Let $a_i$ and $a_j$ be two sequences,
namely the $i$-th and $j$-th row vectors in the 2D matrix; then $c_{ij} = a_i \cdot a_j$.
In your parallel algorithm design, you must consider how to (1) balance the workload, (2)
minimize communication overhead, and (3) use the loop unrolling technique with the unrolling
factor equal to 4 to improve the performance.
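As an illustration of point (3) only, the following is a minimal sequential sketch of a dot product unrolled by a factor of 4. The function name `dot_unrolled4`, the element type `double`, and the length parameter `m` are assumptions made for this example. It does not address workload balance, communication, or the MPI and OpenMP parts of the design.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Dot product of x and y (each of length m), unrolled by a factor of 4.
 * Four independent partial sums are kept so that the multiply-adds in one
 * iteration do not depend on each other; a cleanup loop handles the
 * remaining m % 4 elements.
 */
double dot_unrolled4(const double *x, const double *y, size_t m)
{
    assert(m == 0 || (x != NULL && y != NULL));

    double s0 = 0.0, s1 = 0.0, s2 = 0.0, s3 = 0.0;
    size_t limit = m - (m % 4);   /* largest multiple of 4 not above m */
    size_t i = 0;

    for (; i < limit; i += 4) {
        s0 += x[i]     * y[i];
        s1 += x[i + 1] * y[i + 1];
        s2 += x[i + 2] * y[i + 2];
        s3 += x[i + 3] * y[i + 3];
    }

    double sum = (s0 + s1) + (s2 + s3);

    /* Remainder loop for lengths that are not a multiple of 4. */
    for (; i < m; i++) {
        sum += x[i] * y[i];
    }
    return sum;
}
```

Because the partial sums change the order of floating-point additions, the result may differ in the last bits from a plain left-to-right loop. This is worth keeping in mind when checking the accuracy of results.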
Your implementation must be in C and use MPI and OpenMP as described earlier in this
assignment description.
You must write a report. The report must be concise and clear (3-6 A4 pages) and contain the
following sections:
Your assignment will be marked on the efficiency of your algorithm, program logic and
readability, accuracy of results, and quality of your report.
Submission Requirements
1. Your submission must be made by 11:59pm on Friday, 26 May, 2023 (Sydney time).
2. Create a tar or zip file that contains your report, makefile and source files (e.g., .c and .h
files). DO NOT INCLUDE ANY OBJECT OR BINARY FILES.
3. Submit only one .tar or .zip file.
Failure to follow these submission requirements may lead to loss of marks.