Maths
1. You have been asked to design a cache with the following properties:
• Data words are 32 bits each
• A cache block will contain 2048 bits of data
• The cache is direct mapped
• The address supplied from the CPU is 32 bits long
• There are 2048 blocks in the cache
• Addresses are word-addressable.
There are 2^11 bits/block and 2^5 bits/word.
Thus there are 2^6 words/block, so we need 6 bits of offset.
There are 2^11 blocks and the cache is direct mapped. Therefore, we need 11 bits of index.
Tag field size = 32 - (6 + 11) = 15 bits
Total size of the cache:
There are 2048 blocks in the cache and 2048 bits/block.
So, 256 bytes/block.
Hence, 2048 blocks × 256 bytes/block = 2^19 bytes (512 KB, or 0.5 MB).
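A minimal Python sketch of this arithmetic (the parameter names are illustrative, not from the original problem):

```python
import math

# Parameters from the problem statement (word-addressable, direct mapped).
WORD_BITS = 32          # bits per data word
BLOCK_BITS = 2048       # bits of data per cache block
NUM_BLOCKS = 2048       # blocks in the cache
ADDR_BITS = 32          # CPU address width

words_per_block = BLOCK_BITS // WORD_BITS          # 2^11 / 2^5 = 64
offset_bits = int(math.log2(words_per_block))      # 6
index_bits = int(math.log2(NUM_BLOCKS))            # 11 (direct mapped: one block per set)
tag_bits = ADDR_BITS - offset_bits - index_bits    # 15

data_bytes = NUM_BLOCKS * BLOCK_BITS // 8          # 2^19 bytes = 512 KB

print(offset_bits, index_bits, tag_bits, data_bytes)  # 6 11 15 524288
```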
2. Let's consider the previous scenario once again: what happens if we make our cache 2-way set-associative instead of direct mapped?
The offset is unchanged at 6 bits. With 2 blocks per set there are 2^11 / 2 = 2^10 sets, so 10 bits of index are needed, and the tag grows to 32 - (6 + 10) = 16 bits.
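A small sketch of the same arithmetic, parameterized by associativity and assuming the same parameters as question 1 (the helper name is illustrative):

```python
import math

def field_bits(addr_bits, words_per_block, num_blocks, ways):
    # Offset selects a word within a block; index selects a set; the tag is the rest.
    offset = int(math.log2(words_per_block))
    index = int(math.log2(num_blocks // ways))
    tag = addr_bits - offset - index
    return offset, index, tag

print(field_bits(32, 64, 2048, 1))  # direct mapped: (6, 11, 15)
print(field_bits(32, 64, 2048, 2))  # 2-way:         (6, 10, 16)
```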
3. Let's consider what happens if the data is byte-addressable and the cache organization is 4-way set-associative, for the cache of the 1st question.
Number of bits in offset?
There were 6 bits in the offset. Now each of the 4 bytes of a given word can be individually addressed, so we need 2 more bits of address per word.
Thus 2^6 words × 2^2 bytes/word = 2^8 bytes/block, so 8 bits of offset are needed.
Number of bits in index?
2^11 blocks / (2^2 blocks/set) = 2^9 sets, so 9 bits of index are needed.
Number of bits in tag? 32 - 8 - 9 = 15 bits
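A quick check of the byte-addressable, 4-way numbers under the same assumptions (again with illustrative names):

```python
import math

ADDR_BITS = 32
WORDS_PER_BLOCK = 64     # 2^6 words of 32 bits each
BYTES_PER_WORD = 4
NUM_BLOCKS = 2048
WAYS = 4

offset = int(math.log2(WORDS_PER_BLOCK * BYTES_PER_WORD))  # 6 + 2 = 8 bits
index = int(math.log2(NUM_BLOCKS // WAYS))                  # 2^11 / 2^2 = 2^9 -> 9 bits
tag = ADDR_BITS - offset - index                            # 15 bits

print(offset, index, tag)  # 8 9 15
```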
4. Consider CPIs for ALU, load/store, branch and jump instructions of 1, 1.5, 1.5 and 1 respectively, with an instruction mix of 40% ALU and logical operations, 30% loads and stores, 20% branches and 10% jumps. The machine has a 4-way set-associative cache with separate data and instruction caches: the miss rate is 0.20 for data and 0.10 for instructions, the miss penalty is 80 cycles for the data cache and 50 cycles for the instruction cache, and a cache hit takes 1 cycle. What is the effective CPU time (or effective CPI with memory stalls) and the average memory access time for this application with this cache organization?
Memory accesses per instruction = 1 (instruction fetch) + 0.3 (loads/stores) = 1.3
AMAT = (0.3/1.3) × (1 + 0.2 × 80) + (1/1.3) × (1 + 0.1 × 50) ≈ 8.54 cycles
Effective CPI = base CPI + memory stall cycles per instruction (taking the 1-cycle hit as part of the base CPI)
= (0.4 × 1 + 0.3 × 1.5 + 0.2 × 1.5 + 0.1 × 1) + (1 × 0.1 × 50 + 0.3 × 0.2 × 80)
= 1.25 + 9.8 = 11.05
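A minimal Python sketch of the same arithmetic (assuming, as above, that the 1-cycle hit is already counted in the base CPI, so only miss stalls are added):

```python
# Instruction mix and base CPIs from the problem statement.
mix = {"alu": 0.40, "load_store": 0.30, "branch": 0.20, "jump": 0.10}
cpi = {"alu": 1.0, "load_store": 1.5, "branch": 1.5, "jump": 1.0}
base_cpi = sum(mix[k] * cpi[k] for k in mix)                 # 1.25

# Separate I-cache and D-cache, 1-cycle hit.
i_miss, i_penalty = 0.10, 50
d_miss, d_penalty = 0.20, 80

# Memory references per instruction: 1 fetch + 0.3 loads/stores.
refs = 1.0 + mix["load_store"]                               # 1.3

# AMAT weights each reference type by its share of all memory accesses.
amat = (mix["load_store"] / refs) * (1 + d_miss * d_penalty) \
     + (1.0 / refs) * (1 + i_miss * i_penalty)               # ~8.54 cycles

# Stall cycles per instruction, then effective CPI.
stalls = 1.0 * i_miss * i_penalty + mix["load_store"] * d_miss * d_penalty  # 5 + 4.8 = 9.8
effective_cpi = base_cpi + stalls                            # 11.05

print(round(amat, 2), round(effective_cpi, 2))
```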
5. Calculate the loading-time difference between no memory interleaving and 4-module memory interleaving when a cache read miss occurs, so main memory has to be accessed and the data subsequently transferred to the cache.
• Size of block to transfer from memory to cache = 8 words
• Access time for main memory (1st word) = 8 cycles/word
• Access time for main memory (2nd to 8th word) = 4 cycles/word (no address decoding is necessary within the same memory module)
• Transfer time from main memory to cache = 1 cycle/word
No memory interleaving:
Loading time = cache miss + (1st word + 2nd to 8th word) + 1 cache transfer
= 1 + 8 + (7 × 4) + 1 = 38 cycles
4-module memory interleaving:
Loading time = cache miss + (1st word from each of the 4 modules + 2nd word from each of the 4 modules) + 4 cache transfers
= 1 + (8 + 4) + 4 = 17 cycles
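A small sketch that reproduces both counts under the stated timing assumptions (constant names are illustrative):

```python
BLOCK_WORDS = 8
MISS_DETECT = 1          # cycle to detect the cache miss
FIRST_WORD = 8           # cycles for the first access to a module
NEXT_WORD = 4            # cycles for each later access to the same module
TRANSFER = 1             # cycle to move one word from memory to the cache
MODULES = 4

# No interleaving: one module delivers all 8 words one after another, and only
# the final word's transfer is not overlapped with a memory access.
no_interleave = MISS_DETECT + FIRST_WORD + (BLOCK_WORDS - 1) * NEXT_WORD + TRANSFER  # 38

# 4-way interleaving: the 4 modules work in parallel, so the block needs two
# rounds of accesses (8 cycles, then 4), followed by 4 transfers for the last
# round of words.
interleave = MISS_DETECT + (FIRST_WORD + NEXT_WORD) + MODULES * TRANSFER  # 17

print(no_interleave, interleave)  # 38 17
```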
6. Consider a computer with the following characteristics: total of 1Mbyte of main memory; word
size of 1 byte; block size of 16 bytes; and cache size of 64 Kbytes. For the main memory addresses
of F0010 and CABBE, give the corresponding tag and offset values for a fully-associative cache.
With a fully associative cache, the address is split into a TAG and a WORD OFFSET field. We no longer need to identify which line a memory block might map to, because a block can be in any line and we search every cache line in parallel. The word offset must be 4 bits to address each individual word in the 16-word block. This leaves 16 bits of the 20-bit address for the tag.
F0010
Word offset = 0h
Tag = F001h
CABBE
Word offset = Eh
Tag = CABBh
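A short sketch that pulls the tag and word offset out of a 20-bit address (the helper name is hypothetical):

```python
OFFSET_BITS = 4   # 16 one-byte words per block
ADDR_BITS = 20    # 1 Mbyte of main memory

def split_fully_associative(addr):
    # Low 4 bits select the word within the block; the rest is the tag.
    offset = addr & ((1 << OFFSET_BITS) - 1)
    tag = addr >> OFFSET_BITS
    return tag, offset

for a in (0xF0010, 0xCABBE):
    tag, offset = split_fully_associative(a)
    print(f"{a:05X}: tag={tag:04X} offset={offset:X}")
# F0010: tag=F001 offset=0
# CABBE: tag=CABB offset=E
```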
7. Consider an Intel P4 microprocessor with a 16 Kbyte unified L1 cache. The miss rate for this cache is 3% and the hit time is 2 CCs. The processor also has an 8 Mbyte, on-chip L2 cache with a hit time of 15 CCs; 95% of the time, data requests to the L2 cache are found there. If data is not found in the L2 cache, a request is made to a 4 Gbyte main memory with an access time of 200 CCs. Servicing a request that misses in main memory takes 100,000 CCs. On average, it takes 3.5 CCs to process a memory request. How often is data found in main memory?
Average memory access time = Hit Time + (Miss Rate × Miss Penalty)
Average memory access time = Hit Time_L1 + (Miss Rate_L1 × Miss Penalty_L1)
Miss Penalty_L1 = Hit Time_L2 + (Miss Rate_L2 × Miss Penalty_L2)
Miss Penalty_L2 = Hit Time_Main + (Miss Rate_Main × Miss Penalty_Main)
3.5 = 2 + 0.03 × (15 + 0.05 × (200 + X × 100,000))
3.5 = 2 + 0.03 (15 + 10 + 5000X)
3.5 = 2 + 0.03 (25 + 5000X)
3.5 = 2 + 0.75 + 150X
3.5 = 2.75 + 150X
0.75 = 150X
X = .005
Thus, 99.5% of the time, we find the data we are looking for in main memory.
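A small sketch of the same back-substitution, solving the linear equation for the main-memory miss rate X:

```python
# Given values (from the problem and the worked solution above).
amat = 3.5            # average cycles per memory request
l1_hit, l1_miss = 2, 0.03
l2_hit, l2_miss = 15, 0.05
mem_hit = 200
mem_miss_penalty = 100_000

# amat = l1_hit + l1_miss * (l2_hit + l2_miss * (mem_hit + X * mem_miss_penalty))
# Rearranged for X, the main-memory miss rate:
x = ((amat - l1_hit) / l1_miss - l2_hit - l2_miss * mem_hit) / (l2_miss * mem_miss_penalty)

print(round(x, 4), round(1 - x, 4))  # 0.005 -> data is found in main memory 99.5% of the time
```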