Module IV
Memory Organization
Memory Cell Operation
SRAM vs DRAM
SRAM:
• Fewer memory cells per unit area
• Shorter access time
• Uses flip-flops
• Refreshing circuitry is not required
• Costly
• Used for cache memory

DRAM:
• More memory cells per unit area
• Longer access time
• Uses capacitors
• Refreshing circuitry is required
• Less costly
• Used for main memory
ROM
• It contains a permanent pattern of data that cannot
be changed.
• It is non-volatile (no power is required to retain the data)
• It is only possible to read from ROM
• Application of ROMs :
– Microprogramming
– Library subroutines
– System programs
– Function tables
• Adv: Data or program is permanent in main memory
Types of ROM
• ROM - Read Only Memory.
• PROM - Programmable Read Only Memory.
• EPROM - Erasable Programmable Read Only
Memory.
• EEPROM - Electrically Erasable Programmable
Read Only Memory.
• Flash EEPROM
ROM
• ROM is created with the data actually wired into the chip during fabrication.
• This presents two problems:
– Data insertion involves a large fixed cost
– There is no room for error: if one bit is wrong, the whole batch of ROMs must be thrown out
PROM
• It is non-volatile and can be written only once.
• The writing process is performed electrically and may be performed at a time later than the original chip fabrication.
• Special equipment is required for the writing or “programming” process.
• It provides flexibility and convenience.
EPROM
• It is read and written electrically.
• Before a write operation, all storage cells are erased
• Erasure is performed by shining an intense ultraviolet
light through a window designed into the chip
• This can be performed repeatedly
• Each erasure can take up to 20 minutes
• EPROM can be altered multiple times
• EPROM is more expensive than PROM
• Adv: multiple update capability.
EEPROM
• It can be written into at any time without erasing prior
contents.
• A write operation takes considerably longer than a read.
• Adv:
– nonvolatility
– flexibility of being updatable
– Uses ordinary bus control, address, and data lines.
• Disadv:
– More expensive
– Less dense (fewer bits per chip)
Flash EEPROM
• Named so because it can be reprogrammed quickly.
• It uses an electrical erasing technology.
• The entire memory can be erased in a few seconds.
• It is possible to erase blocks of memory rather than the entire chip, but it does not provide byte-level erasure.
• Adv: it uses only one transistor per bit, hence high density
Memory Hierarchy
• To implement memory systems, the following
relationships hold:
• Faster access time, greater cost per bit
• Greater capacity, smaller cost per bit
• Greater capacity, slower access time
• Dilemma: the designer would prefer large-capacity memory, but improving performance requires faster, lower-capacity memories.
• Way out: do not rely on a single memory type; employ a memory hierarchy.
Memory Hierarchy
Memory Hierarchy
• As one goes down the hierarchy:
a. Decreasing cost per bit
b. Increasing capacity
c. Increasing access time
d. Decreasing frequency of access of the memory by
the processor
• Thus, smaller, more expensive, faster
memories are supplemented by larger,
cheaper, slower memories.
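The payoff of the hierarchy comes from the decreasing frequency of access at the lower levels. Below is a minimal sketch in C of the resulting average access time for a two-level hierarchy; the hit fraction and the access times are assumed values for illustration, not figures from these slides:

```c
#include <stdio.h>

/* Two-level hierarchy: a fraction h of accesses is satisfied by the fast
 * level (time t1); the remaining accesses also pay the slow level's t2. */
int main(void)
{
    double h  = 0.95;   /* fraction of accesses found in level 1 (assumed) */
    double t1 = 0.01;   /* level-1 access time in microseconds (assumed)   */
    double t2 = 0.10;   /* level-2 access time in microseconds (assumed)   */

    double avg = h * t1 + (1.0 - h) * (t1 + t2);
    printf("average access time = %.4f us\n", avg);   /* 0.0150 us */
    return 0;
}
```

With a high hit fraction the average stays close to the fast level's access time, which is why supplementing a small, fast memory with a large, slow one works.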
Virtual Memory
• It allows the execution of processes that are not
completely in memory
• It abstracts main memory into an extremely large
storage, separating logical memory from physical
memory.
• It frees programmers from the concerns of memory
limitations.
• It allows processes to share files easily.
• It is not easy to implement, and it can decrease performance if used carelessly.
Virtual Memory
• Virtual memory involves the separation of
logical memory as perceived by users from
physical memory.
• This separation allows an extremely large
virtual memory for programmers when only a
smaller physical memory is available
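One common way to realize this separation (not detailed in these slides) is paging: a virtual address is split into a page number and an offset, and a page table maps each page to a physical frame. Below is a minimal sketch in C; the page size, the table layout, and the names page_table and translate are illustrative assumptions:

```c
#include <stdint.h>

#define PAGE_SIZE 4096u    /* assumed page size in bytes     */
#define NUM_PAGES 1024u    /* assumed number of virtual pages */

/* One page-table entry per virtual page: whether (and where) the page
 * currently resides in physical memory. */
typedef struct {
    int      present;      /* page currently in memory?  */
    uint32_t frame;        /* physical frame number      */
} PageTableEntry;

static PageTableEntry page_table[NUM_PAGES];

/* Translate a virtual address to a physical address.  Returns -1 if the
 * page is not in memory (a page fault would then bring it in). */
int64_t translate(uint32_t virtual_addr)
{
    uint32_t page   = virtual_addr / PAGE_SIZE;   /* virtual page number */
    uint32_t offset = virtual_addr % PAGE_SIZE;   /* offset within page  */

    if (page >= NUM_PAGES || !page_table[page].present)
        return -1;                                /* page fault          */

    return (int64_t)page_table[page].frame * PAGE_SIZE + offset;
}
```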
Virtual Memory
Cache
• Small amount of fast memory
• Sits between normal main memory and CPU
• May be located on CPU chip or module
Cache and Main Memory
Cache/Main Memory Structure
Cache operation – overview
• CPU requests contents of memory location
• Check cache for this data
• If present, get from cache (fast)
• If not present, read required block from main
memory to cache
• Then deliver from cache to CPU
• Cache includes tags to identify which block of
main memory is in each cache slot
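A minimal sketch in C of this read flow, assuming for simplicity a direct-mapped placement (mapping functions are covered later in this module); the sizes and the name cache_read are illustrative, not from the slides:

```c
#include <stdint.h>
#include <string.h>
#include <stdbool.h>

#define BLOCK_SIZE 4        /* bytes per block (assumed) */
#define NUM_SLOTS  16384    /* cache slots (assumed)     */

typedef struct {
    bool     valid;               /* does this slot hold a block?  */
    uint32_t tag;                 /* which main-memory block it is */
    uint8_t  data[BLOCK_SIZE];    /* copy of the block             */
} CacheSlot;

static CacheSlot cache[NUM_SLOTS];
static uint8_t   main_memory[1u << 24];   /* 16 MB main memory */

/* CPU requests the contents of a memory location: check the cache; on a
 * miss, read the required block from main memory into the cache; then
 * deliver the word from the cache to the CPU. */
uint8_t cache_read(uint32_t address)
{
    uint32_t block = address / BLOCK_SIZE;
    uint32_t slot  = block % NUM_SLOTS;     /* direct-mapped placement */
    uint32_t tag   = block / NUM_SLOTS;
    uint32_t word  = address % BLOCK_SIZE;

    if (!cache[slot].valid || cache[slot].tag != tag) {
        /* miss: fill the slot from main memory */
        memcpy(cache[slot].data, &main_memory[block * BLOCK_SIZE], BLOCK_SIZE);
        cache[slot].valid = true;
        cache[slot].tag   = tag;
    }
    return cache[slot].data[word];          /* deliver from cache to CPU */
}
```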
Cache Read Operation - Flowchart
Cache Design
• Addressing
• Size
• Mapping Function
• Replacement Algorithm
• Write Policy
• Block Size
• Number of Caches
Cache Addressing
• Where does cache sit?
– Between processor and virtual memory management unit
– Between MMU and main memory
• Logical cache (virtual cache) stores data using virtual addresses
– The processor accesses the cache directly, without going through the MMU
– Cache access is faster, since it happens before MMU address translation
– Different applications use the same virtual address space, so the cache must be flushed on each context switch
• Physical cache stores data using main memory physical
addresses
Size does matter
• Cost
– More cache is expensive
• Speed
– More cache is faster (up to a point)
– Checking cache for data takes time
Typical Cache Organization
Comparison of Cache Sizes
| Processor | Type | Year of Introduction | L1 cache | L2 cache | L3 cache |
|---|---|---|---|---|---|
| IBM 360/85 | Mainframe | 1968 | 16 to 32 KB | — | — |
| PDP-11/70 | Minicomputer | 1975 | 1 KB | — | — |
| VAX 11/780 | Minicomputer | 1978 | 16 KB | — | — |
| IBM 3033 | Mainframe | 1978 | 64 KB | — | — |
| IBM 3090 | Mainframe | 1985 | 128 to 256 KB | — | — |
| Intel 80486 | PC | 1989 | 8 KB | — | — |
| Pentium | PC | 1993 | 8 KB/8 KB | 256 to 512 KB | — |
| PowerPC 601 | PC | 1993 | 32 KB | — | — |
| PowerPC 620 | PC | 1996 | 32 KB/32 KB | — | — |
| PowerPC G4 | PC/server | 1999 | 32 KB/32 KB | 256 KB to 1 MB | 2 MB |
| IBM S/390 G4 | Mainframe | 1997 | 32 KB | 256 KB | 2 MB |
| IBM S/390 G6 | Mainframe | 1999 | 256 KB | 8 MB | — |
| Pentium 4 | PC/server | 2000 | 8 KB/8 KB | 256 KB | — |
| IBM SP | High-end server/supercomputer | 2000 | 64 KB/32 KB | 8 MB | — |
| CRAY MTAb | Supercomputer | 2000 | 8 KB | 2 MB | — |
| Itanium | PC/server | 2001 | 16 KB/16 KB | 96 KB | 4 MB |
| SGI Origin 2001 | High-end server | 2001 | 32 KB/32 KB | 4 MB | — |
| Itanium 2 | PC/server | 2002 | 32 KB | 256 KB | 6 MB |
| IBM POWER5 | High-end server | 2003 | 64 KB | 1.9 MB | 36 MB |
| CRAY XD-1 | Supercomputer | 2004 | 64 KB/64 KB | 1 MB | — |
Mapping Function
• Cache of 64kByte
• Cache block of 4 bytes
– i.e. cache is 16K (2^14) lines of 4 bytes
• 16MBytes main memory
• 24 bit address
– (2^24 = 16M)
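A quick check of these numbers, as a minimal C sketch (the variable names are illustrative):

```c
#include <stdio.h>

int main(void)
{
    unsigned cache_bytes  = 64u * 1024u;   /* 64 KB cache        */
    unsigned block_bytes  = 4u;            /* 4-byte cache block */
    unsigned memory_bytes = 16u << 20;     /* 16 MB main memory  */

    /* 64 KB / 4 B = 16384 = 2^14 cache lines */
    printf("cache lines = %u\n", cache_bytes / block_bytes);

    /* 2^24 = 16 M byte locations, so a 24-bit address is needed */
    printf("addressable bytes = %u (needs 24 address bits)\n", memory_bytes);
    return 0;
}
```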
Direct Mapping
• Each block of main memory maps to only one
cache line
– i.e. if a block is in cache, it must be in one specific place
• Address is in two parts
• Least Significant w bits identify unique word
• Most Significant s bits specify one memory block
• The MSBs are split into a cache line field r and a
tag of s-r (most significant)
Direct Mapping
Address Structure
Tag (s-r): 8 bits | Line or slot (r): 14 bits | Word (w): 2 bits
• 24 bit address
• 2 bit word identifier (4 byte block)
• 22 bit block identifier
– 8 bit tag (=22-14)
– 14 bit slot or line
• No two blocks in the same line have the same Tag field
• Check contents of cache by finding line and checking Tag
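A minimal sketch in C of this field split; the 24-bit address value is only illustrative:

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 24-bit address into the direct-mapping fields above:
 * tag (s-r = 8 bits) | line (r = 14 bits) | word (w = 2 bits). */
int main(void)
{
    uint32_t address = 0x16339C;              /* illustrative 24-bit address */

    uint32_t word = address & 0x3;            /* least significant 2 bits */
    uint32_t line = (address >> 2) & 0x3FFF;  /* next 14 bits             */
    uint32_t tag  = (address >> 16) & 0xFF;   /* most significant 8 bits  */

    printf("tag=%02X line=%04X word=%u\n",
           (unsigned)tag, (unsigned)line, (unsigned)word);
    /* prints: tag=16 line=0CE7 word=0 */
    return 0;
}
```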
Direct Mapping from Cache to Main Memory
Direct Mapping
Cache Line Table
| Cache line | Main memory blocks held |
|---|---|
| 0 | 0, m, 2m, 3m, …, 2^s - m |
| 1 | 1, m+1, 2m+1, …, 2^s - m + 1 |
| … | … |
| m-1 | m-1, 2m-1, 3m-1, …, 2^s - 1 |
• In general, cache line i holds those main memory blocks j for which j mod m = i.
Direct Mapping Cache Organization
Direct Mapping Example
Direct Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = m = 2^r
• Size of tag = (s - r) bits
Associative Mapping
• A main memory block can load into any line of
cache
• Memory address is interpreted as tag and
word
• Tag uniquely identifies block of memory
• Every line’s tag is examined for a match
• Cache searching gets expensive
Associative Mapping from
Cache to Main Memory
Associative Mapping
Address Structure
Tag: 22 bits | Word: 2 bits
• 22-bit tag stored with each 32-bit block of data
• Compare the tag field of the address with every tag entry in the cache to check for a hit
• Least significant 2 bits of the address identify which byte is required from the 32-bit (4-byte) data block
• e.g.
– Address: FFFFFC, Tag: 3FFFFF, Data: 24682468, Cache line: 3FFF
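A minimal sketch in C of the associative lookup; the structure and the names find_line and NUM_LINES are illustrative assumptions:

```c
#include <stdbool.h>
#include <stdint.h>

#define NUM_LINES 16384u    /* 2^14 lines of 4 bytes = 64 KB cache (assumed) */

typedef struct {
    bool     valid;
    uint32_t tag;           /* 22-bit tag stored with each line */
    uint8_t  data[4];       /* 32-bit data block                */
} Line;

static Line cache[NUM_LINES];

/* Fully associative lookup: a block may reside in any line, so the tag
 * field of the address must be compared with every line's tag. */
int find_line(uint32_t address)
{
    uint32_t tag = address >> 2;        /* top 22 bits of the 24-bit address */

    for (unsigned i = 0; i < NUM_LINES; i++) {
        if (cache[i].valid && cache[i].tag == tag)
            return (int)i;              /* hit */
    }
    return -1;                          /* miss: any free line may be chosen */
}
```

Hardware performs these comparisons in parallel, with one comparator per line; the sequential loop here only shows the logic, and the extra comparison hardware is why associative searching gets expensive.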
Associative Mapping Summary
• Address length = (s + w) bits
• Number of addressable units = 2^(s+w) words or bytes
• Block size = line size = 2^w words or bytes
• Number of blocks in main memory = 2^(s+w) / 2^w = 2^s
• Number of lines in cache = undetermined
• Size of tag = s bits
Interleaved Memory
• A collection of DRAM chips grouped into memory banks
• Each bank can independently service a read or write request
• K banks can service K requests simultaneously
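A minimal sketch in C of low-order interleaving, one common way to spread consecutive addresses across the banks (the scheme and the names are assumptions, not specified in the slides):

```c
#include <stdint.h>

#define NUM_BANKS 4u   /* K banks (assumed) */

/* Low-order interleaving: consecutive addresses fall in consecutive banks,
 * so up to K independent requests can be serviced at the same time. */
void split_address(uint32_t address, uint32_t *bank, uint32_t *offset)
{
    *bank   = address % NUM_BANKS;   /* which bank services the request */
    *offset = address / NUM_BANKS;   /* word position within that bank  */
}
```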