ESE-507 Advanced Digital System Design and Generation

Fall 2023

Project Overview:
Matrix-Sparse Vector Mult. Hardware
Issued: 9/25/23, Due: 12/8/23, 11:59 PM

This project specification is contained in six documents. This document contains the overview of
the project. (You should start here.) Please see the accompanying PDFs for the detailed
specification and tasks for each of the five parts (“Part 1” through “Part 5”) of the project.

1. Introduction
In this project, you will design, implement, simulate, and synthesize a hardware system for
performing matrix-vector multiplication, where the matrix is dense, and the vector is sparse. This
is called “matrix-sparse vector multiplication,” which we will abbreviate as MSpVM. You will
turn in:
- your documented and commented code
- clearly labeled synthesis reports
- a report answering all questions and including requested information

I will run additional simulations on the code you turn in, so it is very important to:
1. Make sure your designs simulate correctly using QuestaSim on the lab computers or the
CAD servers.
2. Carefully organize your code as specified in this document.
3. Make sure the names and behavior of all signals match this specification exactly.
4. Carefully label and document your code.

Your project will be evaluated on correctness and efficiency of your design, the quality of your
report, and your answers to questions in the report.

You may work alone or with one partner on this project. You may not share code with others
(except your partner). This means you may not allow others to see your code, nor may you
read others’ code (for this or related projects). All code will be run through an automatic
code comparison tool. Plagiarism will result in a score of zero on the assignment for all
involved parties. If you have questions as to what is acceptable, please come to office hours or
send Prof. Milder email to ask for clarification.

If you have general questions about the project, please post them on Piazza.

File Organization
This project is broken into five parts. To make it possible for me to grade and understand your
work, please carefully organize your files. Use a separate sub-directory for each part (called
part1/ part2/ part3/ part4/ and part5/). Then be sure to name your files and modules as
specified in the description below. Make sure all your files are stored inside of a private work
directory like the ese507work directory you made in the HW2 Tool Tutorial.

Page 1 © 2023 Peter Milder

ESE 507 Project Overview

Point Breakdown
1. Part 1: Multiply-Accumulate Unit [15 points]
2. Part 2: Output FIFO [15 points]
3. Part 3: Input Memory Module [20 points]
4. Part 4: Matrix-Sparse Vector Multiplier (MSpVM) [25 points]
5. Part 5: Throughput Optimization [20 points]
6. Quality of report, code, comments, and organization [5 points]

Getting Started
As you can see, this project is large and complex. This document provides a high-level overview
and some important background information. Then, the accompanying documents give the
specification and tasks for each of the five parts of the project. Begin by carefully reading this
overview, and then you should spend some time looking through Parts 1–5. Then when you are
ready to start working, see the Part 1 document.

2. Partner
You may work alone or in a team of two students for this project. If you choose to work with a
partner, it is important that both partners contribute fully to all phases of the project. Your report
will require you to describe each partner’s contribution to the project. Unequal contributions may
be reflected in scoring.

If you are choosing to work with a partner, by Monday 10/2 at 11:59pm you must:
• Send an email to [email protected] with the subject “ESE 507 Project Partner
Signup”
• Send the email from your @stonybrook.edu email address
• In the body of the email, write both your name and your partner’s name
• CC your partner on the email (using your partner’s @stonybrook.edu email address)

After this, you are committed to work with this partner on this project for the entire semester.

3. Background
3.1 Matrix-Vector Multiplication
We first begin by reviewing matrix-vector multiplication. As an example, let W represent a square
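3×3 matrix, and let x represent a (column) vector of length 3. The product y = W x is defined as:

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix}
w_{0,0} & w_{0,1} & w_{0,2} \\
w_{1,0} & w_{1,1} & w_{1,2} \\
w_{2,0} & w_{2,1} & w_{2,2}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ x_2 \end{bmatrix}
=
\begin{bmatrix}
w_{0,0} x_0 + w_{0,1} x_1 + w_{0,2} x_2 \\
w_{1,0} x_0 + w_{1,1} x_1 + w_{1,2} x_2 \\
w_{2,0} x_0 + w_{2,1} x_1 + w_{2,2} x_2
\end{bmatrix}
\]

So, this system takes in 12 values (the 3×3 matrix W and the 3×1 column vector x) and produces 3
values (the 3×1 column vector y).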


We can also use array notation and represent this operation as computing (for m = 0,1,2):
\[
y[m] = \sum_{n=0}^{2} W[m][n] \cdot x[n]
\]

So, each output value y[m] is computed by multiplying and adding the appropriate values of the
matrix W and input vector x.

3.2 Matrix-Vector Multiplication with Generalized Dimensions

In this project, you will consider matrix-vector multiplications parameterized by the matrix and
vector dimensions. Specifically, let W be a matrix with M rows and N columns, let x represent a
column vector of length N, and let y represent a column vector of length M. Then we can represent
this operation as:
\[
\begin{bmatrix} y_0 \\ y_1 \\ \vdots \\ y_{M-1} \end{bmatrix}
=
\begin{bmatrix}
W_{0,0} & W_{0,1} & \cdots & W_{0,N-1} \\
W_{1,0} & W_{1,1} & \cdots & W_{1,N-1} \\
\vdots & \vdots & \ddots & \vdots \\
W_{M-1,0} & W_{M-1,1} & \cdots & W_{M-1,N-1}
\end{bmatrix}
\begin{bmatrix} x_0 \\ x_1 \\ \vdots \\ x_{N-1} \end{bmatrix}
\]

Or, in array notation:

\[
y[m] = \sum_{n=0}^{N-1} W[m][n] \cdot x[n], \quad \text{for } m = 0, \ldots, M-1
\]

Or, in pseudocode:

    for m = 0 ... M-1:
        y[m] = 0
        for n = 0 ... N-1:
            y[m] += W[m][n] * x[n]

Computing each of the M values in y requires performing N multiplications and summing up their
results. In total, this requires MN multiplications and M(N–1) additions.
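As a software cross-check of the operation counts above, here is a minimal Python sketch of the dense matrix-vector product. It is only a reference model, not part of the required hardware. The function name `dense_matvec` and the example values are illustrative assumptions. The accumulator is initialized with the first product of each row, which is why each row needs N-1 additions rather than N.

```python
def dense_matvec(W, x):
    """Compute y = W x and count the multiplications and additions used."""
    M, N = len(W), len(x)
    y, mults, adds = [], 0, 0
    for m in range(M):
        acc = W[m][0] * x[0]          # first product needs no addition
        mults += 1
        for n in range(1, N):
            acc += W[m][n] * x[n]     # one multiply and one add per remaining column
            mults += 1
            adds += 1
        y.append(acc)
    return y, mults, adds


# Hypothetical 3x3 example (values chosen only for illustration)
W = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
x = [1, 0, 2]
y, mults, adds = dense_matvec(W, x)
print(y, mults, adds)   # [7, 16, 25] 9 6
```

With M = 3 and N = 3 this reports MN = 9 multiplications and M(N-1) = 6 additions.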

In this project, M and N will be parameters of your hardware system. That is, you will design a
system in SystemVerilog that has parameters M and N which can be changed in the code.

3.3 Matrix-Sparse Vector Multiplication (MSpVM)

We say a matrix or vector is sparse if many of its elements are equal to 0. If a
matrix/vector is not sparse, we call it dense. Sparse data occurs in a wide range of
applications in science and engineering such as solving partial differential equations, circuit
simulation, graph theory, and machine learning. Often, these applications operate on very large,
very sparse matrices. In this project, you will study a simpler (but useful) variation of this
problem: multiplication of a relatively small dense matrix with a sparse vector. Specifically, this
problem is inspired by recent research on sparsity in transformer networks (e.g., GPT), which are
commonly used in natural language applications such as chatbots.


We will use D to denote the number of non-zero entries in vector x. (Necessarily, 1 ≤ D ≤ N.) Here
is an example of MSpVM of a dense 3x4 matrix with a sparse vector that has two non-zero entries.
(That is, M=3, N=4, and D=2.)

\[
\begin{bmatrix} y_0 \\ y_1 \\ y_2 \end{bmatrix}
=
\begin{bmatrix} 1 & 2 & 3 & 4 \\ 5 & 6 & 7 & 8 \\ 9 & 10 & 11 & 12 \end{bmatrix}
\begin{bmatrix} 4 \\ 0 \\ 0 \\ 3 \end{bmatrix}
=
\begin{bmatrix} 1 \cdot 4 + 4 \cdot 3 \\ 5 \cdot 4 + 8 \cdot 3 \\ 9 \cdot 4 + 12 \cdot 3 \end{bmatrix}
=
\begin{bmatrix} 16 \\ 44 \\ 72 \end{bmatrix}
\]

Notice how we can skip computations related to the entries of x that are equal to 0 since they cannot
contribute to the result.

When working with sparsity, we use a sparse encoding to represent sparse data. This will allow us
to compactly represent the non-zero parts of the vector without storing all of the 0s, and it
will allow us to build hardware that only performs arithmetic on the non-zero portions of the data.

To do so, we will use a simple format based on Compressed Sparse Column (CSC) encoding (see
footnote 1) to represent our sparse input vector x. In this encoding, only the non-zero entries of
the vector are stored, but alongside each value, we must store the row it belongs to. For example,
we would store the value of x in the example above as val = [4, 3] and row = [0, 3]. So, this
tells us that the value 4 is in row 0, the value 3 is in row 3, and all other values are 0.

As another example, if N = 10, and x is represented by val = [1, 2] and row = [5, 9], then this
corresponds to a column vector with values [0, 0, 0, 0, 0, 1, 0, 0, 0, 2] (and D=2).
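The encoding can be sketched in a few lines of Python. This is a software illustration only; the helper names `encode_sparse` and `decode_sparse` are assumptions made for this example and do not appear in the project specification.

```python
def encode_sparse(x):
    """Return (val, row): the non-zero values of x and the row index of each."""
    val = [v for v in x if v != 0]
    row = [i for i, v in enumerate(x) if v != 0]
    return val, row


def decode_sparse(val, row, N):
    """Rebuild the length-N dense vector from its (val, row) encoding."""
    x = [0] * N
    for v, r in zip(val, row):
        x[r] = v
    return x


print(encode_sparse([4, 0, 0, 3]))          # ([4, 3], [0, 3])
print(decode_sparse([1, 2], [5, 9], 10))    # [0, 0, 0, 0, 0, 1, 0, 0, 0, 2]
```

The number of non-zero entries D is simply the length of `val` (and of `row`).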

Compressing our sparse vector in this way can make it smaller (if D is small), but it has
another benefit: it allows hardware or software to perform computations while skipping the 0
entries. In pseudocode (where the matrix has M rows and N columns, and the vector, which has D
non-zero entries, is encoded in the sparse format described above):

    for m = 0 ... M-1:
        y[m] = 0
        for d = 0 ... D-1:
            n = row[d]
            y[m] += W[m][n] * val[d]

To make sure you understand this pseudocode, it is useful to work out the 3x4 example given above
(where val = [4, 3] and row = [0, 3]).
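One way to check your hand calculation is to run the pseudocode as software. The following is a minimal Python transcription, offered as a reference model rather than as the hardware design; the function name `mspvm` is an assumption for this sketch.

```python
def mspvm(W, val, row):
    """Multiply dense matrix W by a sparse vector given as (val, row)."""
    y = []
    for m in range(len(W)):
        acc = 0
        for d in range(len(val)):
            n = row[d]                 # column of W selected by the d-th non-zero entry
            acc += W[m][n] * val[d]
        y.append(acc)
    return y


W = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
print(mspvm(W, val=[4, 3], row=[0, 3]))   # [16, 44, 72]
```

For m = 0, the loop reads W[0][0] = 1 and W[0][3] = 4, giving 1·4 + 4·3 = 16; rows 1 and 2 follow the same pattern and give 44 and 72.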

Recall from above that a dense matrix-vector multiplication requires MN multiplications and
M(N–1) additions. Now, we can see that with a sparse vector with D non-zero entries, MSpVM
requires only MD multiplications and M(D–1) additions. If D is much smaller than N, this is a large
reduction in the number of computations to perform. (E.g., if N = 1000 and D = 10, you have
eliminated 99% of the computation.)
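The 99% figure can be verified directly from the operation counts. The fraction of multiplications and of additions eliminated is, respectively:

```latex
\[
\frac{MN - MD}{MN} = 1 - \frac{D}{N} = 1 - \frac{10}{1000} = 0.99
\]
\[
\frac{M(N-1) - M(D-1)}{M(N-1)} = \frac{N - D}{N - 1} = \frac{990}{999} \approx 0.991
\]
```

Both fractions are independent of M, so the savings depend only on how sparse the vector is.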

Your goal in this project is to build a hardware system that computes the product of a dense M×N
matrix with a sparse vector. Your SystemVerilog code will be parameterized (using SystemVerilog
parameters) to allow the values of M and N to be easily changed in the code. The value of D will
vary based on the vector you give your system as input. The following section introduces the
specified structure of the design, its parameters, and the protocols it uses for input and output. Your
system will take in a stream of values that represent a matrix and a sparse vector, compute the
MSpVM, and output the result vector. Then your system will take in new inputs and repeat the
process.

Footnote 1: When applied on matrices (instead of vectors), the CSC encoding is slightly more
complex than this; in this project we simplify the representation because only our vector is sparse.

4. High-Level Project Overview

The goal of your project is to build a hardware system for MSpVM—that is, multiplications of
dense matrices with sparse vectors. Figure 1 illustrates the top-level module and port specifications
of the system. On the left are five signals whose names start with INPUT_T. These signals form an
AXI-Stream interface that your system will use to receive input data. On the right there are three
signals whose names start with OUTPUT_T. These form another AXI-Stream interface your system
will use to transmit output data. A specification of the AXI-Stream protocol for the inputs and
outputs is provided in Section 5 below. There are also clk and reset signals. Assume reset is
asserted high and synchronously applied.

[Figure 1: block diagram of the top-level "matrix-sparse vector multiplier" module.
Inputs: clk, reset.
AXIS input interface for W and x: INPUT_TVALID, INPUT_TREADY, INPUT_TDATA, INPUT_TUSER, INPUT_TLAST.
AXIS output interface for y: OUTPUT_TVALID, OUTPUT_TREADY, OUTPUT_TDATA.]

Figure 1. Top-level Design and Port Specifications.

Figure 2 illustrates a high-level block diagram of the system you will construct. Each of the
components will be specified and described in more detail in the following sections.

[Figure 2: block diagram. Blocks, left to right: AXI Stream Input Data, Input Memories (Part 3),
MAC Unit (Part 1), Output FIFO (Part 2), AXI Stream Output Data, with a Control block (Part 4)
below them. Signals labeled in the figure: matrix and vector values, output values, vector row
encodings, mem. rd. addrs, control and status.]

Figure 2. High-Level Block Diagram

• Your system will take as input a stream of data in AXI-Stream format that represents a
matrix and a sparse input vector. (The sparse vector is encoded as described in Section 3
above.) Your system will perform an MSpVM of these and produce the result as output. The
system’s output values will be provided in AXI-Stream format. (The AXI-Stream protocol
and its use are described below in Section 5.) After completing an MSpVM, your system
will accept a new set of inputs to compute. (In other words, your system will keep
computing matrix-sparse vector products as long as new inputs are provided.)

• A multiply-accumulate (MAC) unit will be used to perform the individual multiplications
and additions needed for the matrix-vector product. A MAC operation computes:

f += a*b

Take note of how this is the fundamental operation used in the matrix-vector product
pseudocode described above. The MAC unit is Part 1 of the project, and it is described in
the Project Part 1 document.

• As the MAC unit computes values of the output vector, it places them in the Output FIFO
module, which is Part 2 of the project. The Output FIFO module will buffer the values and
output them from your system in AXI Stream format. This module is described in the
Project Part 2 document.

• Your system will also require input memories to store the matrix and vector values while
the system performs the computation. These are stored in the Input Memory module, which
is Part 3 of the project. This module will include a memory for the matrix, a memory for
the vector, and necessary control logic. You can read more about this in the Project Part 3
document.

• The goal of Part 4 of the project will be to integrate the three components from Parts 1–3
and design accompanying control logic that will allow the components to work together to
perform the full matrix-sparse-vector product. The control logic will be responsible for
coordinating the operation of the input memories, MAC unit, and Output FIFO. You
can read more about Part 4 in the Project Part 4 document.

• Lastly, the goal of Part 5 of the project will be to optimize the speed of the system. Please
see the Project Part 5 document.
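To see how the pieces listed above fit together, the following Python sketch models the dataflow in software: a MAC that computes f += a*b, a control loop that selects matrix entries using the row encodings, and a queue standing in for the Output FIFO. This is only a behavioral illustration under stated assumptions. The class `Mac`, its `clear` method, and the function `run_mspvm` are hypothetical names, and the real modules, their ports, and their timing are defined in the Part 1 through Part 4 documents.

```python
from collections import deque


class Mac:
    """Behavioral multiply-accumulate unit: f += a*b."""

    def __init__(self):
        self.f = 0

    def clear(self):
        # Assumed way to start a new accumulation for the next output value
        self.f = 0

    def step(self, a, b):
        self.f += a * b
        return self.f


def run_mspvm(W, val, row):
    """Software stand-in for the control logic driving the memories, MAC, and FIFO."""
    mac = Mac()
    fifo = deque()                          # stands in for the Output FIFO
    for m in range(len(W)):                 # one output value per matrix row
        mac.clear()
        for d in range(len(val)):
            mac.step(W[m][row[d]], val[d])  # matrix read address chosen by row[d]
        fifo.append(mac.f)                  # completed y[m] is buffered for output
    return list(fifo)


W = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12]]
print(run_mspvm(W, val=[4, 3], row=[0, 3]))   # [16, 44, 72]
```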


Parameters
Rather than building hardware for a specific matrix size, you will design a parameterized system
to allow flexibility in the matrix/vector dimensions and in the number of bits used for input and
output values. This means that it will use the following SystemVerilog parameters:

• M: the number of rows of the matrix and rows of the output vector. Your system must
support M ≥ 2.
• N: the number of columns of the matrix and rows of the input vector. Your system must
support N ≥ 2.
• INW: the input bit width (the number of bits used per value in the input matrix and input
vector). Your system must support 2 ≤ INW < 32 bits.
• OUTW: the output bit width (the number of bits used per value in the output vector). Your
system must support 4 ≤ OUTW ≤ 64.
    - OUTW must also be large enough to prevent overflow. Please see the explanation in the
      Project Part 1 document.

There is no defined upper bound on M and N, but as they get larger, the simulation and
synthesis time will grow.

5. AXI-Stream Input/Output Protocol

Your system will use a slightly simplified version of ARM’s AMBA AXI4-Stream protocol. (We
will refer to our simplified version in this document as AXI-Stream or AXIS for short.)

AXI-Stream is shown in its simplest form in Figure 3. It is a synchronous protocol (meaning both
sides share a common clock) that allows a transmitter module (see footnote 2) to transfer data to a
receiver when both sides “agree.”

[Figure 3: a Transmitter Module and a Receiver Module connected by the TVALID, TREADY, and TDATA
signals, with a shared clk.]

Figure 3. Simplified AXI-Stream protocol signals

The transmitter asserts the TVALID signal when it has placed valid data on the TDATA signal. The
receiver asserts the TREADY signal when it is capable of consuming that data. On any positive
clock edge, data is transferred if (and only if) both the TVALID and the TREADY signals are asserted.
(No data will ever be transferred unless both are asserted.) TVALID and TREADY are 1-bit signals,
while TDATA is multiple bits.

Footnote 2: In earlier versions of the AXI standard, the transmitter was called a “master,” and the
receiver was called a “slave.” This terminology was changed to “transmitter” and “receiver” in ARM’s
2021 standard, although you will still see the older terms used in some places and CAD tools.


Note that both the transmitter and receiver modules must share a common clock. We will call this
collection of signals (TVALID, TREADY, and TDATA) an “AXI-Stream interface.” Figure 4 and Table
1 illustrate this functionality and timing. In this example d[0], d[1], etc., represent the data words
transmitted.

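[Figure 4: timing waveform showing clock, TVALID, TREADY, and TDATA over clock cycles 1 through 9.
TDATA starts undefined (x) and then carries d[0], d[1], d[2], and d[3]. The per-cycle values of
TVALID and TREADY are listed in Table 1.]

Figure 4. AXI-Stream data transfer timing example.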

cycle #  TVALID  TREADY  Explanation
-------  ------  ------  ------------------------------------------------------------------
1        0       0       Neither valid nor ready; nothing is transferred.
2        1       0       Transmitter puts data on the TDATA signal and asserts TVALID.
                         However, the receiver module hasn’t asserted TREADY, so no data
                         is transferred.
3        1       0       Receiver module is still not ready (TREADY==0), so no data is
                         transferred.
4        1       1       Receiver module is now ready (TREADY==1), so it receives data
                         d[0].
5        1       1       Since d[0] was transferred on the previous clock edge, the
                         transmitter now changes the data to the next word, d[1]. This
                         word is transferred immediately (since TVALID and TREADY are
                         still asserted).
6        1       1       This has the same logic as cycle 5. Data word d[2] is
                         transferred.
7        0       1       Now, the transmitter has de-asserted TVALID. The receiver does
                         not read anything (regardless of what the transmitter has placed
                         onto TDATA).
8        1       0       The transmitter has asserted TVALID but the receiver has
                         de-asserted TREADY. Nothing is transferred here.
9        1       1       Both TVALID and TREADY are asserted, so the receiver reads d[3]
                         from TDATA.

Table 1. AXI-Stream data transfer timing example.
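The handshake rule in Table 1 can be replayed with a small cycle-by-cycle Python model. This sketch is only an illustration of the rule "a word is transferred on a clock edge if and only if TVALID and TREADY are both 1"; the function name `simulate` and the list-based representation of the signals are assumptions, not part of the specification.

```python
def simulate(tvalid, tready, words):
    """Return (cycle, word) for every clock edge where a transfer occurs."""
    transfers = []
    next_word = 0                       # index of the word currently on TDATA
    for cycle, (v, r) in enumerate(zip(tvalid, tready), start=1):
        if v and r:                     # handshake: both asserted on this edge
            transfers.append((cycle, words[next_word]))
            next_word += 1              # transmitter moves on to the next word
    return transfers


# TVALID and TREADY values for cycles 1..9, copied from Table 1
tvalid = [0, 1, 1, 1, 1, 1, 0, 1, 1]
tready = [0, 0, 0, 1, 1, 1, 1, 0, 1]
print(simulate(tvalid, tready, ["d[0]", "d[1]", "d[2]", "d[3]"]))
# [(4, 'd[0]'), (5, 'd[1]'), (6, 'd[2]'), (9, 'd[3]')]
```

The four transfers land on cycles 4, 5, 6, and 9, matching the explanations in the table.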

The interaction of the TVALID and TREADY signals is called a handshake. Think of asserting
TVALID as the transmitter holding out a hand; think of asserting TREADY as the receiver holding
out a hand. If both sides hold out their hand, then they shake hands and agree that a data transfer is
complete.

In our simplified AXI-Stream protocol, the transmitter is not permitted to wait until TREADY is
asserted before asserting TVALID, and the receiver is not permitted to wait until TVALID is asserted
before asserting TREADY (see footnote 3). In other words, both the transmitter and the receiver need
to decide independently whether to assert their signal. (Then at the positive clock edge, they each
check to see if the other side has asserted theirs.)

Additional AXI-Stream Signals

In addition to TDATA, TVALID, and TREADY, the AXI-Stream protocol includes several other
control signals. In this project, you will use two of them: TUSER and TLAST, shown in Figure 5.

[Figure 5: a Transmitter Module and a Receiver Module connected by the TVALID, TREADY, TDATA,
TUSER, and TLAST signals, with a shared clk.]

Figure 5. AXI-Stream signals including TUSER and TLAST.

• TUSER is a multi-bit signal that transmits “sideband data” from the transmitter to the
receiver. Think of this as extra information that we transmit alongside of TDATA. This
signal is controlled in exactly the same way as TDATA: Anytime TREADY and TVALID are
1 on a positive clock edge, then the information on TUSER is also transmitted.

• TLAST is a 1-bit signal that the transmitter can use to indicate that the currently transmitted
data is the end of a transfer. Like TDATA and TUSER, this signal will be ignored except
when TREADY and TVALID are asserted on a positive clock edge.
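The two bullets above can be illustrated with a short Python sketch of a receiver that samples TDATA, TUSER, and TLAST only on a handshake and uses TLAST to close out a transfer. The function name `collect_transfers`, the tuple layout, and the example values are assumptions made for this illustration; how this project actually uses TUSER and TLAST for sparse vectors is defined in the Project Part 3 document.

```python
def collect_transfers(edges):
    """Group handshaken (tdata, tuser) pairs into transfers ending at TLAST.

    edges: one tuple (tvalid, tready, tdata, tuser, tlast) per positive clock edge.
    """
    transfers, current = [], []
    for tvalid, tready, tdata, tuser, tlast in edges:
        if tvalid and tready:               # all other signals are ignored otherwise
            current.append((tdata, tuser))
            if tlast:                       # this word ends the current transfer
                transfers.append(current)
                current = []
    return transfers


edges = [
    (1, 1, 10, 0, 0),   # handshake: word 10 accepted
    (1, 0, 11, 1, 1),   # receiver not ready: TLAST is ignored
    (1, 1, 11, 1, 1),   # handshake with TLAST: first transfer ends
    (0, 1, 99, 7, 1),   # transmitter not valid: everything is ignored
    (1, 1, 12, 2, 1),   # a one-word transfer
]
print(collect_transfers(edges))   # [[(10, 0), (11, 1)], [(12, 2)]]
```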

Output Stream
Your system will utilize TDATA, TREADY, and TVALID for its output. (For output data, you will not
use TUSER or TLAST.) Your Output FIFO module (Part 2) will serve as the transmitter. When you
simulate, the testbench will be the receiver for this output data. (In a real system, the receiver would
be whatever component your system connects to.) So, your Output FIFO will have outputs TDATA
and TVALID and input TREADY. The testbench will have inputs TDATA and TVALID and output
TREADY.

Footnote 3: One small difference between the full AXI-Stream protocol and our simplified version is
that in the complete AXI-Stream protocol, the receiver may choose to wait until TVALID is asserted
before asserting TREADY, although we will not allow it in our project.

Another difference between our simplified AXI-Stream and the full protocol is that in the full protocol,
once the transmitter asserts TVALID, it must keep it asserted until the handshake occurs; we will allow
it to be de-asserted at any time.

Input Stream
Your system’s input interface will use all five signals (including TUSER and TLAST). Your Input
Memory module (Part 3) will serve as the receiver. When you simulate, the testbench will be the
transmitter for this input data. For details about how the input data are provided in this stream, and
how TUSER and TLAST are used to transmit sparse vectors, please see the Project Part 3 document.

6. Code and Report Submission

5 points will be awarded based on the quality of your code, comments, and report.

1. Code
For your code and synthesis reports, you will turn in a single .zip, .tar, or .tgz file to
Brightspace. Do not use a different archive format (e.g., .rar). Seriously, please do not
use any archive format except .zip, .tar, or .tgz or you will lose points.

This compressed file should hold all of the code and synthesis reports from your project,
organized into part1/ through part5/ directories. I will be testing your designs using
my testbenches, so it is very important that your code sticks to the specification closely. I
will test your designs using the ECE grad lab computers so make sure everything runs
correctly there.

Do not turn in things like QuestaSim “work” directories or gate-level Verilog produced by
synthesis. Please only submit your actual code.

2. Synthesis Reports
Include the DesignCompiler synthesis report (in plaintext format) for each design you
synthesized. These should be included in the .zip, .tar, or .tgz archive file mentioned above.
Make sure these reports are clearly labeled. Please include them in the appropriate part1/
part2/ part3/ part4/ or part5/ directory.

3. Report
Please organize your report neatly. Use headings to separate it into Part 1, Part 2, Part 3,
Part 4, and Part 5. Each part of the project will have a numbered list of questions you should
answer. In your report, please use the same numbering to make it easier to find your
answers. (In other words, number your answers to match the questions in this assignment.)

In addition to the code submission, your report should be submitted (as a PDF file only)
alongside your .zip, .tar, or .tgz archive. (Please include the PDF report separately from
the archive.) If you worked with a partner, make sure you answered the questions in
each Part where you are asked to explain each partner’s contribution to the project.
(If you worked alone, you can skip this.)

4. Electronic Hand-in Process

To hand in your code, go to Brightspace → Assignments → Project. There you can upload
your .zip, .tar, or .tgz file and your PDF report. Only one partner should hand in for
the group, but make sure both partners’ names are clear in your code and report.

To create a .tgz file in Linux, first assemble a hand-in directory with copies of all of your
code, etc. For this example, let’s assume that directory is called handin/. Now, assuming
you are one directory above handin/, type the following:

tar cvzf myhandin.tgz handin/

This will create a gzipped-tar file (.tgz) that contains the entire handin/ directory
(including all of its contents).

You can test that it worked properly by copying the .tgz file you created to another
directory, and typing:
tar xvzf myhandin.tgz

This will extract the file into the directory you are currently in. If you have any problems
with this or anything else, please post them on Piazza.

Please, only use .zip, .tar, or .tgz files for your archive, and use PDF for your report. If you use
other formats, I will be unable to open your work on the lab computers, and you will lose points.
Your code archive should only contain your code and your synthesis reports with clearly labeled
names. Please do not submit the testbenches that were provided to you or other things like
QuestaSim “work” directories or gate-level Verilog produced by synthesis.

Primordia
From Everand
Primordia
John Fultz
No ratings yet
PAYBACK: Opportunity. Greed. Betrayal.
From Everand
PAYBACK: Opportunity. Greed. Betrayal.
John M. Capozzi
No ratings yet
Blackbeard Legacy #2 Volume 2
From Everand
Blackbeard Legacy #2 Volume 2
Eric Arvin
No ratings yet


Page 1 © 2023 Peter Milder


y[m] = Σ_n W[m][n] · x[n]

3.2 Matrix-Vector Multiplication with Generalized Dimensions

Or, in array notation:

y[m] = Σ_{n=0}^{N-1} W[m][n] · x[n], for m = 0, ..., M-1

for m = 0 ... M-1:
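As an illustration only, the generalized matrix-vector product can be sketched in Python with one explicit loop per index. The function name matvec and the use of nested lists for W are assumptions made for this sketch; they are not part of the project specification, and the sketch models the arithmetic only, not the hardware.

```python
def matvec(W, x):
    """Dense matrix-vector product y = W x using explicit loops.

    W is an M x N matrix stored as a list of M rows (hypothetical layout),
    x is a list of N values. Returns a list of M outputs.
    """
    M = len(W)
    N = len(x)
    y = [0] * M
    for m in range(M):          # one output element per matrix row
        acc = 0                 # accumulator for row m
        for n in range(N):      # sum over the columns
            acc += W[m][n] * x[n]
        y[m] = acc
    return y


W = [[1, 2, 3],
     [4, 5, 6]]
x = [1, 0, -1]
print(matvec(W, x))  # [-2, -2]
```

Each output element takes N multiplications and N accumulations, so the full product takes M·N multiply-accumulate steps.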

3.3 Matrix-Sparse Vector Multiplication (MSpVM)


for m = 0 ... M-1:
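The benefit of a sparse input vector can be sketched as follows: only the nonzero entries of x contribute to each sum, so the inner loop can skip the zeros. This sketch assumes the sparse vector is supplied as a list of (index, value) pairs for its nonzero entries. That representation, and the name mspvm, are hypothetical choices made for illustration and may differ from the format the project parts specify.

```python
def mspvm(W, nonzeros):
    """Matrix-sparse vector product.

    W is a dense M x N matrix stored as a list of rows.
    nonzeros is a list of (n, value) pairs giving the nonzero entries
    of the sparse vector x (assumed format, for illustration only).
    """
    M = len(W)
    y = [0] * M
    for m in range(M):
        acc = 0
        for n, value in nonzeros:   # visit only the nonzero entries of x
            acc += W[m][n] * value
        y[m] = acc
    return y


W = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
nonzeros = [(1, 10), (3, -1)]       # represents x = [0, 10, 0, -1]
print(mspvm(W, nonzeros))           # [16, 52]
```

With K nonzero entries, this takes M·K multiply-accumulate steps instead of M·N, and it reads only the matrix columns whose indices appear in the sparse vector.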


4. High-Level Project Overview

Figure 1. Top-level Design and Port Specifications. (Figure not reproduced; visible labels: clk, OUTPUT_TVALID, AXIS output.)

Figure 2. High-Level Block Diagram. (Figure not reproduced; visible labels: matrix and output, mem. rd. addrs, Control, control and status.)


• A multiply-accumulate (MAC) unit will be used to perform the individual multiplications


5. AXI-Stream Input/Output Protocol

Transmitter TREADY Receiver


Figure 4. AXI-Stream data transfer timing example. (Waveform not reproduced; the TDATA trace shows an undefined value x followed by d[0], d[1], d[2], d[3].)

Table 1. AXI-Stream data transfer timing example. (Table body not reproduced; columns: cycle #, TVALID, TREADY, Explanation.)
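The AXI-Stream handshake rule is that a data word is transferred on a clock edge only when TVALID and TREADY are both asserted. A cycle-by-cycle software model can make this concrete. The sketch below is a behavioral illustration, not hardware code; the function name axis_transfers and the example traces are made up for this sketch and do not reproduce the values in Figure 4 or Table 1.

```python
def axis_transfers(tvalid, tready, tdata):
    """Return the data words transferred over an AXI-Stream link.

    Each argument is a list with one entry per clock cycle (hypothetical
    trace format). A word is transferred in a cycle only when both TVALID
    and TREADY are asserted in that cycle.
    """
    received = []
    for valid, ready, data in zip(tvalid, tready, tdata):
        if valid and ready:         # handshake: both sides agree this cycle
            received.append(data)
    return received


# Example trace: the transmitter holds TDATA stable while waiting for TREADY.
tvalid = [0,    1,    1,    1,    0,    1,    1]
tready = [1,    1,    0,    1,    1,    0,    1]
tdata  = [None, "d0", "d1", "d1", None, "d2", "d2"]
print(axis_transfers(tvalid, tready, tdata))  # ['d0', 'd1', 'd2']
```

In cycles 2 and 5 of this trace the transmitter asserts TVALID while the receiver is not ready, so no transfer happens and the transmitter keeps the same data on TDATA until the handshake completes.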


Additional AXI-Stream Signals

Transmitter TDATA Receiver


6. Code and Report Submission

4. Electronic Hand-in Process


tar cvzf myhandin.tgz handin/

