Exam 2 Practice Solutions

Question 1: Single-cycle CPU implementation (40 points)

On the last page of the exam is a single-cycle datapath for a machine very different from the one we saw in lecture. It supports the following (complex) instructions:

lw_add  rd, (rs), rt      # rd = Memory[R[rs]] + R[rt];
addi_st (rs), rs, imm     # Memory[R[rs]] = R[rs] + imm;
sll_add rd, rs, rt, imm   # rd = (R[rs] << imm) + R[rt];

All instructions use the same format (shown below), but not all instructions use all of the fields.

Field   op      rs      rt      rd      imm
Bits    31-26   25-21   20-16   15-11   10-0
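
To make the semantics concrete, here is a minimal C sketch of what each of the three instructions above computes. The R[] array, the Memory[] array, and the function names are hypothetical stand-ins for the datapath state, and memory is modeled as directly indexed by the register value, exactly as in the pseudo-code above.

    #include <stdint.h>

    static uint32_t R[32];            /* register file (stand-in)               */
    static uint32_t Memory[1 << 16];  /* memory, indexed as in the pseudo-code  */

    /* lw_add rd, (rs), rt   :  rd = Memory[R[rs]] + R[rt]  */
    void lw_add(int rd, int rs, int rt) {
        R[rd] = Memory[R[rs]] + R[rt];
    }

    /* addi_st (rs), rs, imm :  Memory[R[rs]] = R[rs] + imm */
    void addi_st(int rs, int32_t imm) {
        Memory[R[rs]] = R[rs] + imm;
    }

    /* sll_add rd, rs, rt, imm : rd = (R[rs] << imm) + R[rt] */
    void sll_add(int rd, int rs, int rt, int imm) {
        R[rd] = (R[rs] << imm) + R[rt];
    }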

Part (a)
For each of the above instructions, specify how the control signals should be set for correct operation. Use X for
don’t care. ALUOp can be ADD, SUB, SLL, PASS_A, or PASS_B (e.g., PASS_A means pass through the top
operand without change). Full points will only be awarded for the fastest implementation. (20 points)

inst      ALUsrc1   ALUsrc2   ALUsrc3   ALUop1   ALUop2   MemRead   MemWrite   RegWrite
lw_add    X         1         0         X        ADD      1         0          1
addi_st   1         0         X         ADD      X        0         1          0
sll_add   1         1         1         SLL      ADD      0/X       0          1

Part (b)
Given the functional unit latencies shown in the table below, compute the minimum time to perform each type of instruction. Explain. (15 points)

Func. Unit      Latency
Memory          3 ns
ALU             4 ns
Register File   2 ns

inst      Minimum time   Explanation
lw_add    14 ns          IMEM (3 ns) + RF_read (2 ns) + DMEM (3 ns) + ALU (4 ns) + RF_write (2 ns)
addi_st   12 ns          IMEM (3 ns) + RF_read (2 ns) + ALU (4 ns) + DMEM (3 ns)
sll_add   15 ns          IMEM (3 ns) + RF_read (2 ns) + ALU (4 ns) + ALU (4 ns) + RF_write (2 ns)

Part (c)
What is the CPI and cycle time for this processor? (5 points)

Since the processor is a single-cycle implementation, the CPI is 1. The cycle time is set by the slowest
instruction, which in this case is the sll_add, yielding a clock period of 15ns.
Question: Pipelining (45 points)

Part (a)
Give a non-computing example of pipelining not involving laundry. (5 points)

Bucket brigades, Subway sandwich assembly lines, airport security checkpoints, fast-food drive-throughs.

Part (b)
Assume you have 10 items to process. If you can pipeline the processing into 5 steps (perfectly balancing the
pipeline stages), how much faster does pipelining enable you to complete the process? You can leave your
answer as an expression. (10 points)

It takes 4 cycles to fill the pipeline, and one item completes every cycle thereafter, so the last of the 10 items finishes after 10 + 4 = 14 cycles. With perfectly balanced stages, each cycle is 1/5th as long as the original processing step. So…

Speedup = (old exec time)/(new exec time) = 10 / (14 * 1/5) = 50/14 ≈ 3.57 times faster.
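
The same arithmetic as a small C sketch (the variable names and the unit step time are illustrative, not taken from the exam):

    #include <stdio.h>

    int main(void) {
        int items = 10, stages = 5;
        double step = 1.0;                                         /* unpipelined time per item    */
        double old_time = items * step;                            /* 10 steps                     */
        double new_time = (items + stages - 1) * (step / stages);  /* 14 cycles, each 1/5 as long  */
        printf("speedup = %.2f\n", old_time / new_time);           /* prints speedup = 3.57        */
        return 0;
    }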

Question, continued
Consider the following MIPS code:
loop: sub $t3, $t3, $t0
add $t0, $t0, $t0
addi $t1, $t0, 5
lw $t2, 0($t0)
add $t2, $t2, 1
sw $t2, 0($t1)
bgt $t3, $zero, loop

Part (c) Label all dependences within one iteration of this code (not just the ones that will require forwarding).
One iteration is defined as the instructions between the sub and bgt inclusive. (10 points)

The true (RAW) data dependences are: add → addi and add → lw through $t0; addi → sw through $t1 (the base address); lw → add through $t2; add → sw through $t2 (the stored data); and sub → bgt through $t3.

Part (d) The above code produces the pipeline diagram shown below when run on a 5-stage MIPS pipeline
with stages IF (fetch), ID (decode), EX (execute), MEM (memory) and WB (write-back).
Note: stalls are indicated by –, and registers can be read in the same cycle in which they are written.

Inst  iter  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16   17
sub   N     IF   ID   EX   MEM  WB
add   N          IF   ID   EX   MEM  WB
addi  N               IF   ID   -    EX   MEM  WB
lw    N                    IF   -    ID   EX   MEM  WB
add   N                         IF   ID   -    EX   MEM  WB
sw    N                              IF   -    ID   EX   MEM  WB
bgt   N                                   -    IF   ID   EX   MEM  WB
sub   N+1                                      -    -    IF   ID   EX   MEM  WB

For each of the following questions, justify your answer or explain why it cannot be answered using the given
information. Note: full credit requires explanations. (20 points)
Is forwarding from EX/MEM to the EX stage implemented? Explain.
No. The addi needs $t0 from the add immediately before it; with EX/MEM-to-EX forwarding that value would be available with no delay, yet the addi stalls one cycle.

Is forwarding from MEM/WB to the EX stage implemented? Explain.

Yes. The addi stalls only one cycle, and the add following the lw stalls only one cycle; with no forwarding at all (relying only on same-cycle register write/read), each would have to stall two cycles.

Note: The branch target is computed in the ID stage. In which stage are branches resolved? Explain.
Branches are resolved in the EX stage: the sub of the next iteration is fetched in cycle 12, the cycle after the bgt is in EX (cycle 11).

The branch target is computed in the ID stage. What is the branch prediction scheme? Explain.
No prediction is used: the pipeline stalls after the branch until it is resolved (assuming the diagram shows every instruction that is fetched). It cannot be predict-not-taken, because no fall-through instructions are ever fetched. It cannot be predict-taken, because the sub would then have been fetched in cycle 11, right after the target is computed in ID.
Question: Interrupt Handler Routine (25 points)
For the following questions, refer to the interrupt/exception handler code on the last page of this exam. Your
answers should be no more than two sentences.

Part (a)
Explain the role of the statement labeled by A. Why is this necessary given that $at is not referenced
elsewhere in the code? (5 points)

The assembler temporary $at may be used by any pseudo-instructions in the interrupt handler. We
need to save and restore it so that the interrupted program's state is not overwritten.

Part (b)
MIPS functions preserve callee-saved registers on the stack. In contrast, the interrupt handler saves registers in
the .data segment (e.g., in save0 and save1 in part B of the given code). Why is the stack not used in the
interrupt handler? Also, why is the stack used to preserve registers in MIPS functions? (5 points)

The interrupt handler cannot assume the interrupted program has a valid stack pointer (or free space below it), so pushing onto the stack would risk corrupting the program's memory. MIPS functions use the stack because it allows multiple invocations of the same function to be active at once (e.g., recursion).

Part (c)
Why does the line labeled with C jump to “interrupt dispatch” and not “done” ? (5 points)

To check whether additional interrupts have arrived while this one was being processed. When interrupt rates are high, this is a performance optimization: it avoids the overhead of leaving and re-entering the interrupt handler (all of that saving and restoring of registers).

Part (d)
Explain the line labeled by D. (5 points)

Since register $k0 is used as the return-from-interrupt address, coprocessor 0 register $14 (the EPC) must be where the program's PC at the time of the interrupt is stored. The move-from-coprocessor-0 (mfc0) instruction copies it into a general-purpose register.

Part (e)
What code must be executed in order to reach the interrupt/exception handler? An explanation is sufficient;
we don’t need to see the exact code. (5 points)

We need to enable interrupts. This involves setting both the global interrupt enable bit and the bits
associated with the particular interrupts we want to receive. This can be done by writing to a
coprocessor register.

Question 2, Interrupt Handler Routine

interrupt_handler:
.set noat
move $k1, $at A
.set at
sw $a0, save0
sw $a1, save1 B
mfc0 $k0, $13              # Get Cause register
srl $a0, $k0, 2            # shift the ExcCode field into the low bits
and $a0, $a0, 0xf          # ExcCode field
bne $a0, 0, non_intrpt     # non-zero ExcCode means an exception, not an interrupt

interrupt_dispatch:
mfc0 $k0, $13              # re-read the Cause register
beq $k0, $zero, done       # nothing pending: restore state and return

and $a0, $k0, 0x1000       # is the bonk interrupt pending?
bne $a0, 0, bonk_interrupt

li $v0, 4
la $a0, unhandled_str
syscall
j done

bonk_interrupt:
… # turn and set speed.
sw $a1, 0xffff0060($zero) # acknowledge interrupt
j interrupt_dispatch C

non_intrpt:
li $v0, 4
la $a0, non_intrpt_str
syscall
j done

done:
lw $a0, save0
lw $a1, save1
mfc0 $k0, $14 D
.set noat
move $at, $k1
.set at
rfe
jr $k0
nop

Question 3: Concepts (25 points)

Write a short answer to the following questions. For full credit, answers should not be longer than two
sentences.

Part a) A program is known to have a serial component that takes S seconds to execute, but the rest of the
computation is arbitrarily parallelizable. If the program runs in T seconds using P processors, how long would
it take to run using X processors? Assume T > S and P > 1. (10 points)

This is just an application of Amdahl's law. The parallelizable portion of the program, which takes T - S seconds on P processors, was sped up by a factor of P, so the single-processor (fully serial) execution time is S + (T-S)*P.

When running on X processors, we divide the parallelizable portion by X, giving S + (T-S)*P/X.
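
The same formula as a small C sketch (the function name and the sample numbers are made up for illustration):

    #include <stdio.h>

    /* S = serial seconds, T = total seconds observed on P processors.
       The serial part is unchanged; the parallel work (T - S) * P is
       spread across X processors.                                      */
    double time_on_x(double S, double T, double P, double X) {
        return S + (T - S) * P / X;
    }

    int main(void) {
        /* e.g., S = 2 s serial, T = 10 s measured on P = 4 processors */
        printf("on 8 processors: %.2f s\n", time_on_x(2.0, 10.0, 4.0, 8.0));  /* 6.00 s */
        return 0;
    }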

Part b) Consider the following program:

main() {
    func(5);
}
If we forgot to implement the function “func” would an error be raised by the compiler, assembler, linker, or
loader? Explain. (5 points)

It would be detected by the linker. The compiler and the assembler will happily do their work, assuming
that the function is implemented in another source file. When the linker runs, it realizes that you
haven’t provided an implementation. Try it yourself:
gcc -S func.c
gcc -c func.s
gcc func.o
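
For concreteness, a hypothetical func.c that reproduces this (the prototype keeps the compiler happy; only the definition is missing):

    /* func.c -- calls func() but never defines it */
    void func(int x);    /* declaration only: compiler and assembler are satisfied */

    int main(void) {
        func(5);
        return 0;
    }

The first two commands succeed; the final link step (gcc func.o) fails with an "undefined reference to `func'" error.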

Part c)
In what circumstances is throughput the desired performance metric and in what circumstances is latency the
desired metric? (5 points)

Latency is the right metric when we care how long a single task takes (e.g., the response time of one request). Throughput is the right metric when we care about the rate at which many tasks are completed (e.g., transactions per second on a server).

Part d)
Which of the three factors of CPU time can a compiler influence? Provide examples. (5 points)

- The compiler can affect the number of instructions a program executes (e.g., by allocating a variable to a register it can avoid load and store instructions to access that variable).
- The compiler can affect the CPI by selecting "easier" instructions (e.g., implementing a division by 2 as a right shift by 1 instead of using the integer division instruction; see the small example below).
- Typically a compiler has no direct influence on a processor's clock frequency.
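
A minimal illustration of the CPI point (a hypothetical function; the operand is unsigned so the shift is exact):

    /* Most optimizing compilers emit a logical right shift by 1 here
       instead of an integer divide instruction.                       */
    unsigned half(unsigned x) {
        return x / 2;
    }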
