
CSC 350: Computer Architecture

Spring 2024

Assignment #1

Due: February 12, 11:59 PM

Note: The questions are taken from the course reference book, “Computer Organization and
Design, MIPS Edition: The Hardware/Software Interface”, Sixth Edition, by David A. Patterson and
John L. Hennessy.

Chapter 1

Q1: (10 points)

The seven great ideas in computer architecture are similar to ideas from other fields. Match the
seven ideas from computer architecture, “Use Abstraction to Simplify Design”, “Make the
Common Case Fast”, “Performance via Parallelism”, “Performance via Pipelining”, “Performance
via Prediction”, “Hierarchy of Memories”, and “Dependability via Redundancy” to the following
ideas from other fields:

A. Assembly lines in automobile manufacturing
B. Suspension bridge cables
C. Aircraft and marine navigation systems that incorporate wind information
D. Express elevators in buildings
E. Library reserve desk
F. Increasing the gate area on a CMOS transistor to decrease its switching time
G. Building self-driving cars whose control systems partially rely on existing sensor systems
already installed into the base vehicle, such as lane departure systems and smart cruise
control systems

Q2: (30 points)

Consider three different processors P1, P2, and P3 executing the same instruction set. P1 has a 3
GHz clock rate and a CPI of 1.5. P2 has a 2.5 GHz clock rate and a CPI of 1.0. P3 has a 4.0 GHz clock
rate and has a CPI of 2.2.

A. Which processor has the highest performance expressed in instructions per second?
B. If the processors each execute a program in 10 seconds, find the number of cycles and
the number of instructions.
C. We are trying to reduce the execution time by 30% but this leads to an increase of 20% in
the CPI. What clock rate should we have to get this time reduction?
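The relationships this question relies on can be sketched in C as a way to check hand calculations. This is a minimal sketch, not an official solution: the function names are hypothetical, and the last helper assumes the instruction count stays fixed while CPI and execution time are scaled.

```c
#include <assert.h>

/* Instructions per second = clock rate (Hz) / CPI. */
double instructions_per_second(double clock_hz, double cpi)
{
    assert(cpi > 0.0);
    return clock_hz / cpi;
}

/* Clock cycles elapsed in a given time = clock rate * seconds. */
double cycles_in(double clock_hz, double seconds)
{
    return clock_hz * seconds;
}

/* Instructions executed in a given time = cycles / CPI. */
double instructions_in(double clock_hz, double cpi, double seconds)
{
    assert(cpi > 0.0);
    return cycles_in(clock_hz, seconds) / cpi;
}

/*
 * Assumption: the instruction count does not change.
 * Since time = instructions * CPI / clock rate, scaling CPI by cpi_scale
 * and time by time_scale requires clock' = clock * cpi_scale / time_scale.
 */
double scaled_clock_hz(double clock_hz, double cpi_scale, double time_scale)
{
    assert(time_scale > 0.0);
    return clock_hz * cpi_scale / time_scale;
}
```

For example, instructions_per_second(3.0e9, 1.5) evaluates the rate for a 3 GHz processor with a CPI of 1.5, and scaled_clock_hz(3.0e9, 1.2, 0.7) gives the clock rate for a 20% CPI increase combined with a 30% time reduction.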

Q3: (20 points)

Assume for arithmetic, load/store, and branch instructions, a processor has CPIs of 1, 12, and 5,
respectively. Also assume that on a single processor a program requires the execution of 2.56E9
arithmetic instructions, 1.28E9 load/store instructions, and 256 million branch instructions.
Assume that each processor has a 2 GHz clock frequency.

Assume that, as the program is parallelized to run over multiple cores, the number of arithmetic
and load/store instructions per processor is divided by 0.7 × p (where p is the number of
processors) but the number of branch instructions per processor remains the same.

A. Find the total execution time for this program on 1, 2, 4, and 8 processors, and show the
relative speedup of the 2, 4, and 8 processor result relative to the single processor
result.
B. If the CPI of the arithmetic instructions was doubled, what would the impact be on the
execution time of the program on 1, 2, 4, or 8 processors?
C. To what should the CPI of load/store instructions be reduced in order for a single
processor to match the performance of four processors using the original CPI values?
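The parallelization model described in this question can be expressed as a small C function for checking results. This is a sketch under stated assumptions: the Mix struct and function name are hypothetical, and the single-processor case is assumed to use the undivided instruction counts, with the 0.7 × p divisor applied only when p is greater than 1.

```c
#include <assert.h>

/* One value per instruction class; used for both counts and CPIs. */
typedef struct {
    double arith;
    double load_store;
    double branch;
} Mix;

/*
 * Execution time on p processors.
 * Assumption: for p == 1 the counts are used as given; for p > 1 the
 * arithmetic and load/store counts per processor are divided by 0.7 * p,
 * while the branch count per processor is unchanged.
 */
double parallel_time_seconds(Mix count, Mix cpi, int p, double clock_hz)
{
    assert(p >= 1 && clock_hz > 0.0);
    double divisor = (p == 1) ? 1.0 : 0.7 * p;
    double cycles = count.arith / divisor * cpi.arith
                  + count.load_store / divisor * cpi.load_store
                  + count.branch * cpi.branch;
    return cycles / clock_hz;
}
```

Relative speedup for p processors is then parallel_time_seconds(count, cpi, 1, hz) divided by parallel_time_seconds(count, cpi, p, hz). Changing a field of the cpi argument shows the effect of a different CPI for one instruction class.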

Q4: (30 points)

Assume a program requires the execution of 50 × 10^6 FP instructions, 110 × 10^6 INT
instructions, 80 × 10^6 L/S instructions, and 16 × 10^6 branch instructions. The CPI for each type
of instruction is 1, 1, 4, and 2, respectively. Assume that the processor has a 2 GHz clock rate.

A. By how much must we improve the CPI of FP instructions if we want the program to run
two times faster?
B. By how much must we improve the CPI of L/S instructions if we want the program to run
two times faster?
C. By how much is the execution time of the program improved if the CPI of INT and FP
instructions is reduced by 40% and the CPI of L/S and Branch is reduced by 30%?
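A short C sketch of the cycle accounting behind this question may help when checking answers. The function names are hypothetical. A negative result from required_cpi is used here to signal that the target cannot be reached by changing that one instruction class alone.

```c
#include <assert.h>
#include <stddef.h>

/* Total clock cycles = sum over instruction classes of count * CPI. */
double total_cycles(const double *counts, const double *cpis, size_t n)
{
    double cycles = 0.0;
    for (size_t i = 0; i < n; i++) {
        cycles += counts[i] * cpis[i];
    }
    return cycles;
}

/*
 * CPI that class k would need so the whole program takes target_cycles,
 * with every other class left unchanged. A negative value means the
 * other classes alone already exceed the target.
 */
double required_cpi(const double *counts, const double *cpis, size_t n,
                    size_t k, double target_cycles)
{
    assert(k < n && counts[k] > 0.0);
    double others = total_cycles(counts, cpis, n) - counts[k] * cpis[k];
    return (target_cycles - others) / counts[k];
}
```

Execution time follows from dividing total_cycles by the clock rate in Hz. Running twice as fast corresponds to a target of half the original cycle count.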

Chapter 2

Q5: (10 points)

Show how the value 0xabcdef12 would be arranged in memory of a little-endian and a big-endian
machine. Assume the data are stored starting at address 0 and that the word size is 4 bytes.
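The two byte orders can be illustrated with a short C sketch that writes a 32-bit word into a 4-byte array, where index 0 stands for the lowest address. The function names are hypothetical, and the example value in the usage note is deliberately not the one from the question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Little-endian: least significant byte goes to the lowest address. */
void store_le32(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * i));
    }
}

/* Big-endian: most significant byte goes to the lowest address. */
void store_be32(uint32_t value, uint8_t out[4])
{
    assert(out != NULL);
    for (int i = 0; i < 4; i++) {
        out[i] = (uint8_t)(value >> (8 * (3 - i)));
    }
}
```

With the value 0x11223344, store_le32 fills the array as 44 33 22 11 and store_be32 fills it as 11 22 33 44 (addresses 0 through 3).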

Q6: (20 points)

Assume that $s0 holds the value 128 (base ten).

A. For the instruction add $t0, $s0, $s1, what is the range(s) of values for $s1 that would
result in overflow?
B. For the instruction sub $t0, $s0, $s1, what is the range(s) of values for $s1 that would
result in overflow?
C. For the instruction sub $t0, $s1, $s0, what is the range(s) of values for $s1 that would
result in overflow?
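Signed 32-bit overflow, as raised by the MIPS add and sub instructions, can be modeled in C by computing the result in 64 bits and checking whether it fits in 32. This is a minimal sketch with hypothetical function names; it models the condition only and is not MIPS code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if a + b does not fit in a signed 32-bit register. */
bool add_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a + (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}

/* True if a - b does not fit in a signed 32-bit register. */
bool sub_overflows(int32_t a, int32_t b)
{
    int64_t wide = (int64_t)a - (int64_t)b;
    return wide > INT32_MAX || wide < INT32_MIN;
}

/* Spot checks at the edges of the signed 32-bit range. */
void check_overflow_examples(void)
{
    assert(add_overflows(INT32_MAX, 1));
    assert(!add_overflows(INT32_MAX, 0));
    assert(sub_overflows(INT32_MIN, 1));
    assert(sub_overflows(0, INT32_MIN));
    assert(!sub_overflows(-1, INT32_MIN));
}
```

Fixing one operand and sweeping the other near INT32_MAX and INT32_MIN is one way to confirm a range derived by hand.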

Q7: (20 points)

Find the shortest sequence of MIPS instructions that extracts bits 16 down to 11 from register
$t0 and uses the value of this field to replace bits 31 down to 26 in register $t1 without changing
the other bits of registers $t0 and $t1. (Be sure to test your code using $t0 = 0 and $t1 =
0xffffffffffffffff. Doing so may reveal a common oversight.)
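The masking logic behind a bit-field extract and insert can be modeled in C and used as a reference when testing an assembly sequence. This is a sketch only: the function names are hypothetical, registers are assumed to be 32 bits wide, and it does not attempt to be the shortest MIPS sequence.

```c
#include <assert.h>
#include <stdint.h>

/* Mask with the low `width` bits set (width from 1 to 32). */
static uint32_t low_mask(unsigned width)
{
    return (width == 32) ? 0xFFFFFFFFu : ((1u << width) - 1u);
}

/* Return bits hi..lo of word, shifted down to bit 0. */
uint32_t extract_field(uint32_t word, unsigned hi, unsigned lo)
{
    assert(hi < 32 && lo <= hi);
    return (word >> lo) & low_mask(hi - lo + 1);
}

/* Return word with bits hi..lo replaced by field; other bits unchanged. */
uint32_t insert_field(uint32_t word, unsigned hi, unsigned lo, uint32_t field)
{
    assert(hi < 32 && lo <= hi);
    uint32_t mask = low_mask(hi - lo + 1);
    /* Clear the destination bits first, then OR in the new field. */
    return (word & ~(mask << lo)) | ((field & mask) << lo);
}
```

The clearing step matters when the destination bits start out as ones: insert_field(0xFFFFFFFFu, 31, 26, extract_field(0, 16, 11)) yields 0x03FFFFFF, whereas an OR without the clear would leave the register unchanged.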

Q8: (10 points)

Translate the following loop into C. Assume that the C-level integer i is held in register $t1, $s2
holds the C-level integer called result, and $s0 holds the base address of the integer MemArray.

addi $t1, $0, 0
LOOP: lw $s1, 0($s0)
add $s2, $s2, $s1
addi $s0, $s0, 4
addi $t1, $t1, 1
slti $t2, $t1, 100
bne $t2, $0, LOOP

Q9: (20 points)

Translate function f into MIPS assembly language. If you need to use registers $t0 through $t7,
use the lower-numbered registers first. Assume the function declaration for func is “int func(int a,
int b);”. The code for function f is as follows:

int f(int a, int b, int c, int d)
{
    return func(func(a,b), c + d);
}

Q10: (30 points)

Assume for a given processor the CPI of arithmetic instructions is 1, the CPI of load/store
instructions is 10, and the CPI of branch instructions is 3. Assume a program has the following
instruction breakdowns: 500 million arithmetic instructions, 300 million load/store instructions,
100 million branch instructions.

A. Suppose that new, more powerful arithmetic instructions are added to the instruction
set. On average, through the use of these more powerful arithmetic instructions, we can
reduce the number of arithmetic instructions needed to execute a program by 25%, at
the cost of increasing the clock cycle time by only 10%. Is this a good design choice? Why?
B. Suppose that we find a way to double the performance of arithmetic instructions. What
is the overall speedup of our machine? What if we find a way to improve the performance
of arithmetic instructions by 10 times?
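Both parts of this question compare execution time before and after a change, which can be sketched in C as follows. The function names are hypothetical, and cycle counts are assumed to be computed beforehand as instruction count × CPI for each class.

```c
#include <assert.h>

/*
 * Ratio of new execution time to old execution time when both the cycle
 * count and the clock cycle time change. A value above 1.0 means the
 * new design is slower.
 */
double time_ratio(double old_cycles, double new_cycles, double cycle_time_scale)
{
    assert(old_cycles > 0.0);
    return new_cycles * cycle_time_scale / old_cycles;
}

/*
 * Overall speedup when only part of the cycles is accelerated by `factor`
 * and the remaining cycles are unchanged.
 */
double partial_speedup(double affected_cycles, double other_cycles, double factor)
{
    assert(factor > 0.0);
    double before = affected_cycles + other_cycles;
    double after = affected_cycles / factor + other_cycles;
    return before / after;
}
```

For instance, with 500 × 10^6 arithmetic cycles and 3300 × 10^6 cycles from the other classes, partial_speedup(500e6, 3300e6, 2.0) gives the overall gain from doubling arithmetic performance. Because the unaffected cycles dominate, even a large factor yields a modest overall speedup.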
