L-4 (Cache Memory)
Computer Architecture
Key Characteristics of Computer Memory Systems
Characteristics of Memory Systems
◼ Location
◼ Refers to whether memory is internal or external to the computer
◼ Internal memory is often equated with main memory
◼ Processor requires its own local memory, in the form of registers
◼ Cache is another form of internal memory
◼ External memory consists of peripheral storage devices that are accessible to
the processor via I/O controllers
◼ Capacity
◼ Memory capacity is typically expressed in terms of bytes
◼ Unit of transfer
◼ For internal memory the unit of transfer is equal to the number of electrical
lines into and out of the memory module
Method of Accessing Units of Data
Physical characteristics
◼ Volatile memory
◼ Information decays naturally or is lost when electrical power is switched off
◼ Nonvolatile memory
◼ Once recorded, information remains without deterioration until deliberately changed
◼ No electrical power is needed to retain information
◼ Magnetic-surface memories are nonvolatile
◼ Semiconductor memory - may be either volatile or nonvolatile
◼ Nonerasable memory
◼ Cannot be altered, except by destroying the storage unit
◼ Semiconductor memory of this type is known as read-only memory (ROM)
Memory Hierarchy
◼ As one goes down the memory hierarchy, the following occur:
◼ (a) decreasing cost per bit
◼ (b) increasing capacity
◼ (c) increasing access time
◼ (d) decreasing frequency of access of the memory by the processor
Locality of Reference
◼ During the course of execution of a program, memory references by the processor tend to cluster
Example-4.1
Cache
Cache Read Operation
Cache Addresses
Virtual Memory
◼ Virtual memory
◼ Facility that allows programs to address memory from a logical
point of view, without regard to the amount of main memory
physically available
◼ When used, the address fields of machine instructions contain
virtual addresses
◼ For reads from and writes to main memory, a hardware memory
management unit (MMU) translates each virtual address into a
physical address in main memory
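The translation step the MMU performs can be sketched as a simple table lookup; the page size and page-table contents below are assumed example values, not from the slides:

```python
# Toy sketch of MMU-style virtual-to-physical address translation.
# PAGE_SIZE and the page-table contents are assumed, not from the slides.

PAGE_SIZE = 4096  # assume 4 KiB pages

# assumed page table: virtual page number -> physical frame number
page_table = {0: 5, 1: 9, 2: 3}

def translate(virtual_addr):
    """Split a virtual address into page number + offset, then
    substitute the physical frame number from the page table."""
    vpn = virtual_addr // PAGE_SIZE
    offset = virtual_addr % PAGE_SIZE
    frame = page_table[vpn]          # raises KeyError on a "page fault"
    return frame * PAGE_SIZE + offset

print(translate(4100))  # virtual page 1, offset 4 -> 9*4096 + 4 = 36868
```

A real MMU caches these lookups in a TLB rather than walking a table on every access.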
Table 4.3 Cache Sizes of Some Processors
a Two values separated by a slash refer to instruction and data caches.
Mapping Function
◼ Because there are fewer cache lines than main memory
blocks, an algorithm is needed for mapping main memory
blocks into cache lines
Direct Mapping
Direct Mapping Cache Organization
Direct Mapping Cache Example
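The direct-mapping rule can be stated as i = j mod m: main-memory block j always maps to cache line i, where m is the number of cache lines. A minimal sketch, with m = 8 as an assumed example value:

```python
# Direct mapping: each main-memory block j maps to exactly one
# cache line, i = j mod m, where m is the number of cache lines.
# m = 8 is an assumed value for illustration.

m = 8  # number of cache lines (assumption)

def cache_line(block_number):
    return block_number % m

# Blocks 0, 8, 16, ... all compete for line 0; 3 and 11 for line 3.
print([cache_line(j) for j in (0, 8, 16, 3, 11)])  # [0, 0, 0, 3, 3]
```

Because blocks that share a line evict each other, a direct-mapped cache can thrash between two hot blocks even when the rest of the cache is empty, which motivates the victim cache discussed next.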
Victim Cache
Mapping From Main Memory to Cache: k-Way Set Associative
k-Way Set Associative Cache Organization
◼ Number of sets = v = 2^d
Problem
◼ 2-way set-associative mapping with a 5-bit tag, 11-bit set field, and 4-bit word field
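Working out the cache parameters implied by these field widths is mechanical; the sketch below assumes nothing beyond the stated 2-way / 5-bit / 11-bit / 4-bit split:

```python
# Parameters from the stated problem: 2-way set-associative cache,
# 5-bit tag, 11-bit set field, 4-bit word field.
k, tag_bits, set_bits, word_bits = 2, 5, 11, 4

address_bits = tag_bits + set_bits + word_bits   # 20-bit main-memory address
num_sets     = 2 ** set_bits                     # v = 2^d sets
num_lines    = k * num_sets                      # k lines per set
block_size   = 2 ** word_bits                    # words per block
main_memory  = 2 ** address_bits                 # addressable words

print(address_bits, num_sets, num_lines, block_size, main_memory)
# 20 2048 4096 16 1048576
```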
Replacement Algorithms
◼ Once the cache has been filled, when a new block is brought
into the cache, one of the existing blocks must be replaced
◼ For direct mapping there is only one possible line for any
particular block and no choice is possible
◼ First-in-first-out (FIFO)
◼ Replace that block in the set that has been in the cache longest
◼ Easily implemented as a round-robin or circular buffer technique
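The round-robin FIFO scheme for one cache set can be sketched with a bounded queue; the set size and reference stream below are assumed example values:

```python
from collections import deque

# Minimal FIFO replacement for a single cache set, implemented as the
# circular-buffer / round-robin scheme described above.
# SET_SIZE and the reference stream are assumed example values.
SET_SIZE = 4

cache_set = deque(maxlen=SET_SIZE)  # oldest entry is evicted automatically

def reference(block):
    if block not in cache_set:   # miss
        cache_set.append(block)  # evicts the oldest block when the set is full
    # on a hit, FIFO changes nothing (unlike LRU, which would reorder)

for b in [1, 2, 3, 4, 1, 5]:
    reference(b)
print(list(cache_set))  # [2, 3, 4, 5] -- block 1, the oldest, was replaced
```

Note that the hit on block 1 did not refresh its position, so FIFO still evicted it; this is exactly where FIFO and LRU diverge.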
If at least one write operation has been performed on a word in that line of the cache, then main memory must be updated by writing the line of cache out to the block of memory before bringing in the new block.

A more complex problem occurs when multiple processors are attached to the same bus and each processor has its own local cache: if a word is altered in one cache, it could conceivably invalidate a word in other caches.
Write Through
and Write Back
◼ Write through
◼ Simplest technique
◼ All write operations are made to main memory as well as to the cache
◼ The main disadvantage of this technique is that it generates substantial
memory traffic and may create a bottleneck
◼ Write back
◼ Minimizes memory writes
◼ Updates are made only in the cache
◼ Portions of main memory are invalid and hence accesses by I/O
modules can be allowed only through the cache
◼ This makes for complex circuitry and a potential bottleneck
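The memory-traffic difference between the two policies can be illustrated with a toy one-line cache; the write sequence and the single-line cache are assumed simplifications for illustration:

```python
# Toy comparison of memory write traffic under the two policies.
# The write sequence and the one-line cache are assumed simplifications.

writes = ["A", "A", "A", "B", "A"]  # words written by the processor

# Write through: every write goes to main memory as well as the cache.
write_through_traffic = len(writes)  # one memory write per processor write

# Write back: main memory is updated only when a dirty line is replaced.
write_back_traffic = 0
cached, dirty = None, False
for word in writes:
    if word != cached:                # miss: replace the single line
        if dirty:
            write_back_traffic += 1   # flush the dirty line to memory first
        cached, dirty = word, False
    dirty = True                      # the cached line is now modified

if dirty:
    write_back_traffic += 1           # final flush of the dirty line
print(write_through_traffic, write_back_traffic)  # 5 3
```

The repeated writes to the same word cost nothing extra under write back, which is precisely the traffic it is designed to eliminate.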
Cache Coherency
Even if a write-through policy is used, the other caches
may contain invalid data. A system that prevents this
problem is said to maintain cache coherency. Possible
approaches to cache coherency include bus watching with
write through, hardware transparency, and noncacheable memory.
Multilevel Caches
◼ As logic density has increased it has become possible to have a cache
on the same chip as the processor
◼ The on-chip cache reduces the processor’s external bus activity, speeding
up execution time and increasing overall system performance
◼ When the requested instruction or data is found in the on-chip cache, the bus
access is eliminated
◼ On-chip cache accesses will complete appreciably faster than would even
zero-wait state bus cycles
◼ During this period the bus is free to support other transfers
◼ Two-level cache:
◼ Internal cache designated as level 1 (L1)
◼ External cache designated as level 2 (L2)
◼ Potential savings due to the use of an L2 cache depend on the hit rates
in both the L1 and L2 caches
◼ The use of multilevel caches complicates all of the design issues related
to caches, including size, replacement algorithm, and write policy
Hit Ratio (L1 & L2)
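The L1 and L2 hit ratios feed a common average-access-time model; the model form, hit ratios, and cycle counts below are assumed example values rather than numbers from the slides:

```python
# Effective average access time for a two-level cache, assuming the
# common model  T_avg = H1*T1 + (1-H1) * (H2*T2 + (1-H2)*Tm).
# All hit ratios and access times below are assumed example values.

H1, H2 = 0.90, 0.80        # L1 and L2 hit ratios (assumptions)
T1, T2, Tm = 1, 10, 100    # access times in cycles: L1, L2, main memory

t_avg = H1 * T1 + (1 - H1) * (H2 * T2 + (1 - H2) * Tm)
print(t_avg)
```

With these numbers most misses are caught by L2, so the average stays within a few cycles of the L1 time despite main memory being two orders of magnitude slower.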
CD (cache disable)
NW (not write-through)
Cache Memory (Chapter 4)