207 Assignment 6
10869753
Solution
CPI = (45000 + (2*32000) + (2*15000) + (2*8000)) / (100 000) = 155 000 / 100 000 = 1.55
Execution time = (100 000 instructions) * 1.55 CPI = 155 000 cycles
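The CPI and cycle count above can be checked with a short sketch. The instruction mix is read off the formula (45 000 instructions at 1 cycle, and 32 000, 15 000 and 8 000 at 2 cycles each); the names `mix` and `average_cpi` are illustrative only.

```python
# (instruction count, cycles per instruction) pairs read off the CPI formula
mix = [(45_000, 1), (32_000, 2), (15_000, 2), (8_000, 2)]


def average_cpi(mix):
    """Return (average CPI, total cycles) for an instruction mix."""
    total_instructions = sum(count for count, _ in mix)
    total_cycles = sum(count * cycles for count, cycles in mix)
    return total_cycles / total_instructions, total_cycles


cpi, total_cycles = average_cpi(mix)
print(f"CPI = {cpi:.2f}, execution time = {total_cycles} cycles")
```

Weighting each class by its count, rather than averaging the per-class CPIs directly, is what keeps the result tied to the 100 000-instruction total.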
=2.22
For machine *
Even though machine B has a higher MIPS rating, it needs a longer execution time to execute a similar set of
instructions.
2.8
A processor accesses main memory with an average access time of T2. A smaller cache memory is
interposed between the processor and main memory. The cache has a significantly faster access time of
T1 < T2. The cache holds, at any time, copies of some main memory words and is designed so that the words
more likely to be accessed in the near future are in the cache. Assume that the probability that the next
word accessed by the processor is in the cache is H, known as the hit ratio.
a. For any single memory access, what is the theoretical speedup of accessing the word in the cache rather
than in main memory?
b. Let T be the average access time. Express T as a function of T1, T2, and H. What is the overall speedup
as a function of H?
c. In practice, a system may be designed so that the processor must first access the cache
to determine if the word is in the cache and, if it is not, then access main memory, so that on a miss
(opposite of a hit), memory access time is T1 + T2. Express T as a function of T1, T2, and H. Now calculate
the speedup and compare to the result produced in part (b).
Solution
a. Speedup = T2 / T1
b. T = H × T1 + (1 − H) × T2
Speedup = T2 / T = T2 / (H × T1 + (1 − H) × T2)
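The formulas above can be tried numerically with a small sketch. The values T1 = 10, T2 = 100 and H = 0.9 are hypothetical and chosen only for illustration. The part (c) variant assumes that a miss costs T1 + T2 (a cache lookup followed by a main memory access), which gives T = T1 + (1 − H) × T2.

```python
def average_access_time(t1, t2, h):
    """Part (b): a hit costs t1, a miss costs t2."""
    return h * t1 + (1 - h) * t2


def average_access_time_cache_first(t1, t2, h):
    """Part (c), assuming a miss costs t1 + t2."""
    return h * t1 + (1 - h) * (t1 + t2)


def speedup(t2, t):
    """Speedup relative to always going to main memory."""
    return t2 / t


# Hypothetical values, not taken from the assignment.
t1, t2, h = 10.0, 100.0, 0.9
speedup_b = speedup(t2, average_access_time(t1, t2, h))
speedup_c = speedup(t2, average_access_time_cache_first(t1, t2, h))
print(f"part (b) speedup: {speedup_b:.2f}")
print(f"part (c) speedup: {speedup_c:.2f}")
```

Under this assumption the part (c) speedup is never larger than the part (b) speedup, because every miss pays the extra cache lookup.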
Problem 3.1
Step 1: 3005 → IR
Step 2: 3 → AC
Step 3: 5940 → IR
Step 4: 3 + 2 = 5 → AC
Step 5: 7006 → IR
Step 6: AC → Device 6
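The six steps above can be replayed with a minimal sketch of the fetch/execute loop. It assumes 16-bit instructions with a 4-bit opcode and a 12-bit address, and assumes the opcode meanings 3 = load AC from I/O, 5 = add memory to AC, 7 = store AC to I/O. It also assumes that device 5 supplies the value 3 and that memory location 940 holds 2, as the trace implies. All names are illustrative.

```python
# Assumed opcodes (high hex digit of each 16-bit instruction).
LOAD_AC_FROM_IO = 0x3
ADD_MEM_TO_AC = 0x5
STORE_AC_TO_IO = 0x7


def run(program, memory, device_input):
    """Execute the program and return (ac, device_output, trace)."""
    ac = 0
    device_output = {}
    trace = []
    for ir in program:
        opcode, address = ir >> 12, ir & 0xFFF
        trace.append(f"{ir:04X} -> IR")
        if opcode == LOAD_AC_FROM_IO:
            ac = device_input[address]
            trace.append(f"{ac} -> AC")
        elif opcode == ADD_MEM_TO_AC:
            ac += memory[address]
            trace.append(f"{ac} -> AC")
        elif opcode == STORE_AC_TO_IO:
            device_output[address] = ac
            trace.append(f"AC -> Device {address}")
        else:
            raise ValueError(f"unknown opcode {opcode:X}")
    return ac, device_output, trace


ac, device_output, trace = run(
    program=[0x3005, 0x5940, 0x7006],
    memory={0x940: 2},        # assumed: location 940 holds 2
    device_input={0x005: 3},  # assumed: device 5 supplies 3
)
for step, line in enumerate(trace, start=1):
    print(f"Step {step}: {line}")
```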
Problem 3.3
Consider a hypothetical 32-bit microprocessor having 32-bit instructions composed of two fields: the
first byte contains the opcode and the remainder the immediate operand or an operand address.
b. Discuss the impact on the system speed if the microprocessor bus has
c. How many bits are needed for the program counter and the instruction register?
Solution
b.1. A 32-bit local address bus and a 16-bit local data bus: instruction and data transfers would take
three bus cycles each, one for the address and two for the data. Since the address bus is 32 bits, the whole
address can be transferred to memory at once and decoded there; however, since the data bus is only
16 bits, it will require 2 bus cycles (accesses to memory) to fetch the 32-bit instruction or operand.
b.2. A 16-bit local address bus and a 16-bit local data bus: instruction and data transfers would take four
bus cycles each, two for the address and two for the data. The processor must perform two transmissions
to send the whole 32-bit address to memory, which requires more complex memory interface control to
latch the two halves of the address before the memory access is performed. In addition to this two-step
address issue, since the data bus is also 16 bits, the microprocessor will need 2 bus cycles to fetch the
32-bit instruction or operand.
c. The PC needs 24 bits (24-bit addresses), and the IR needs 32 bits (32-bit instructions).
a) Doubling the bus wait states for read and write operations:
Original cycles = fetch opcode (4) + fetch operand address (3) + fetch operand (3) + increment (3) + store
(3) = 16 cycles.
With the wait states doubled, each read and write takes twice as long, while the internal increment still
takes 3 cycles:
= 8+6+6+3+6 = 29 cycles.
Increase in percentage = (29-16)/16 = 13/16 = 0.8125
= 81.25%
b)
Increasing the increment operation to 13 cycles:
= 8+6+6+13+6 = 39 cycles.
Increase in percentage = (39-16)/16 = 23/16 = 1.4375
= 143.75%
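The two percentages can be recomputed with a short sketch. It assumes that the increment step is internal to the processor and is therefore not affected by bus wait states, which is what the 29-cycle total implies. The dictionary keys and function names are illustrative.

```python
# Cycle counts per step of the original instruction (16 cycles in total).
original = {
    "fetch opcode": 4,
    "fetch operand address": 3,
    "fetch operand": 3,
    "increment": 3,
    "store": 3,
}


def percent_increase(old, new):
    return (new - old) / old * 100


def with_doubled_bus_cycles(steps, increment_cycles):
    """Double every bus step; the increment step is set explicitly (assumed internal)."""
    return {
        name: increment_cycles if name == "increment" else 2 * cycles
        for name, cycles in steps.items()
    }


base = sum(original.values())
case_a = sum(with_doubled_bus_cycles(original, increment_cycles=3).values())
case_b = sum(with_doubled_bus_cycles(original, increment_cycles=13).values())
print(base, case_a, percent_increase(base, case_a))
print(base, case_b, percent_increase(base, case_b))
```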
Problem 4.1
Problem 4.2
Program A is better than B. Both programs compute the sum of the squares of the absolute differences
between two sequences and appear to be the same, but their performance differs. The statements
Z[i]=X[i]-Y[i]
Z[i]=Z[i]*Z[i]
are executed the same number of times in both programs,
but the condition i<n is checked n times in program A, whereas it is checked
2n times in program B because of its two for loops.
T2s = 1.5 + (1 − 0.97) × T2 = 1.5 + 0.03 T2
T2 > 25 ns
Increasing the hit ratio to 0.97 at the cost of increasing the cache access time to 1.5 ns will improve the
average memory access time only if the main memory access time is larger than 25 ns.
Problem 5.1
Problem 5.9
Block size = 8 bytes. A byte within a block is identified with 3 bits, so the byte number field requires 3 bits.
Number of cache lines = 32, so the line number in the cache can be represented with 5 bits.
Address fields: Line No (5 bits), Byte No (3 bits)
c)
When byte 6682 (0001 1010 0001 1010) is in cache memory, the 7 other bytes of its block are in cache
memory with it (all 8 bytes of a block are stored in a cache line). The block runs from address
0001 1010 0001 1000 (6680) to 0001 1010 0001 1111 (6687).
d)
Total bytes that can be stored in cache memory = number of cache lines * number of bytes in a block (or line)
= 32 * 8
= 256 bytes
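The field widths and the answers to parts c) and d) can be checked with a minimal sketch of the address split for a direct-mapped cache with 8-byte blocks and 32 lines. Treating the remaining high-order bits as the tag is an assumption here, and the function names are illustrative.

```python
BLOCK_SIZE = 8   # bytes per block
NUM_LINES = 32   # lines in the direct-mapped cache

BYTE_BITS = BLOCK_SIZE.bit_length() - 1  # 3 bits select a byte in a block
LINE_BITS = NUM_LINES.bit_length() - 1   # 5 bits select a cache line


def split_address(address):
    """Return (tag, line number, byte number) for a byte address."""
    byte_no = address & (BLOCK_SIZE - 1)
    line_no = (address >> BYTE_BITS) & (NUM_LINES - 1)
    tag = address >> (BYTE_BITS + LINE_BITS)  # assumed: remaining high bits
    return tag, line_no, byte_no


def block_addresses(address):
    """All byte addresses stored in the same block as `address`."""
    start = address & ~(BLOCK_SIZE - 1)
    return list(range(start, start + BLOCK_SIZE))


tag, line_no, byte_no = split_address(6682)
neighbours = block_addresses(6682)
cache_capacity = NUM_LINES * BLOCK_SIZE

print(f"tag={tag:08b} line={line_no:05b} byte={byte_no:03b}")
print(f"block covers {neighbours[0]}..{neighbours[-1]}")
print(f"cache capacity = {cache_capacity} bytes")
```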
Problem 5.13
Number of sets = 2^11, i.e. 2048