
Chapter 4 - Memory

The document outlines key concepts related to memory in embedded systems, including types of memory, write ability, storage permanence, and memory hierarchy. It discusses various memory types such as ROM, RAM, EEPROM, and Flash, detailing their characteristics and applications. Additionally, it covers memory composition techniques and the importance of cache in improving access speed.

Uploaded by

electrosgeek
Copyright
© All Rights Reserved
Memory
Chapter 4

Outline

⚫ Introduction
⚫ Memory Write Ability and Storage Permanence
⚫ Common Memory Types
⚫ Composing Memory
⚫ Memory Hierarchy and Cache

Introduction

⚫ Embedded system’s functionality aspects
  ⚫ Processing
    ⚫ processors
    ⚫ transformation of data
  ⚫ Storage
    ⚫ memory
    ⚫ retention of data
  ⚫ Communication
    ⚫ buses
    ⚫ transfer of data

Memory: basic concepts

⚫ Stores large number of bits
  ⚫ m x n: m words of n bits each
  ⚫ k = log2(m) address input signals
  ⚫ or m = 2^k words
  ⚫ e.g. 4,096 x 8 memory:
    ⚫ 32,768 bits
    ⚫ 12 address input signals
    ⚫ 8 input/output data signals

[Figure: m × n memory, drawn as m words of n bits per word.]
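The relationship between word count and address signals can be checked with a few lines of Python. This is a minimal sketch: the function name `memory_signals` is illustrative and not from the slides, and it assumes m is a power of two so that k = log2(m) is an integer.

```python
def memory_signals(m, n):
    """Return (total_bits, address_lines, data_lines) for an m x n memory.

    Assumes m is a positive power of two (assumption for this sketch).
    """
    if m <= 0 or m & (m - 1) != 0:
        raise ValueError("m must be a positive power of two")
    k = m.bit_length() - 1  # k = log2(m) address input signals
    return m * n, k, n


# The 4,096 x 8 memory from the slide.
print(memory_signals(4096, 8))
```

For the 4,096 x 8 example this reproduces the slide's figures: 32,768 bits, 12 address input signals, and 8 data signals.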
Memory: basic concepts (continued)

⚫ Memory access
  ⚫ r/w: selects read or write
  ⚫ enable: read or write only when asserted
  ⚫ multiport: multiple accesses to different locations simultaneously

[Figure: memory external view. A 2^k × n read and write memory with inputs r/w, enable, and address lines A0 … Ak-1, and data lines Qn-1 … Q0.]

Write ability/storage permanence

⚫ Traditional ROM/RAM distinctions
  ⚫ ROM: Read only, bits stored without power
  ⚫ RAM: Read and write, lose stored bits without power
⚫ Traditional distinctions blurred
  ⚫ Advanced ROMs can be written to: EEPROM
  ⚫ Advanced RAMs can hold bits without power: NVRAM
⚫ Write ability
  ⚫ Manner and speed a memory can be written
⚫ Storage permanence
  ⚫ ability of memory to hold stored bits after they are written

Write ability

⚫ Ranges of write ability
  ⚫ High end
    ⚫ processor writes to memory simply and quickly: RAM
  ⚫ Middle range
    ⚫ processor writes to memory, but slower: FLASH, EEPROM
  ⚫ Lower range
    ⚫ special equipment, “programmer”, must be used to write to memory: EPROM, OTP ROM
  ⚫ Low end
    ⚫ bits stored only during fabrication: Mask-programmed ROM
⚫ In-system programmable memory
  ⚫ Can be written to by a processor in the embedded system
  ⚫ Memories in high end and middle range of write ability

Storage permanence

⚫ Range of storage permanence
  ⚫ High end
    ⚫ essentially never loses bits: Mask-programmed ROM
  ⚫ Middle range
    ⚫ holds bits days, months, or years after memory’s power source turned off: NVRAM
  ⚫ Lower range
    ⚫ holds bits as long as power supplied to memory: SRAM
  ⚫ Low end
    ⚫ begins to lose bits almost immediately after written: DRAM
⚫ Nonvolatile memory
  ⚫ Holds bits after power is no longer supplied
  ⚫ High end and middle range of storage permanence
[Figure: Write ability and storage permanence of memories, showing relative degrees along each axis (not to scale). Vertical axis, storage permanence: near zero; battery life (10 years); tens of years; life of product. Horizontal axis, write ability: during fabrication only; external programmer, one time only; external programmer, 1,000s of cycles; external programmer OR in-system, 1,000s of cycles; external programmer OR in-system, block-oriented writes, 1,000s of cycles; in-system, fast writes, unlimited cycles. Memories plotted: Mask-programmed ROM, OTP ROM, EPROM, EEPROM, FLASH, NVRAM, SRAM/DRAM, and an "Ideal memory" point. Regions marked: Nonvolatile, In-system programmable.]

ROM: Read Only Memory

⚫ Nonvolatile memory
⚫ Can be read from but cannot be written to
⚫ It can be programmed
  ⚫ Setting the bits within the memory
  ⚫ Programmed offline
⚫ Uses
  ⚫ Store software program for general-purpose processor
  ⚫ Store constant data needed by system
  ⚫ Implement combinational circuit

Example: 8 x 4 ROM

⚫ Horizontal lines = words
⚫ Vertical lines = data
⚫ Lines connected only at circles
⚫ Decoder sets word 2’s line to 1 if address input is 010
⚫ Data lines Q3 and Q1 are set to 1 because there is a “programmed” connection with word 2’s line
⚫ Word 2 is not connected with data lines Q2 and Q0

[Figure: internal view of 8 × 4 ROM. A 3×8 decoder with enable and address inputs A0, A1, A2 drives word lines word 0 … word 7; data lines Q3 Q2 Q1 Q0 are wired-OR through programmable connections.]

Implementing combinational function

⚫ Any combinational circuit of n functions of same k variables can be done with 2^k x n ROM

Truth table:

Inputs (address)    Outputs
a  b  c             y  z
0  0  0             0  0
0  0  1             0  1
0  1  0             0  1
0  1  1             1  0
1  0  0             1  0
1  0  1             1  1
1  1  0             1  1
1  1  1             1  1

[Figure: 8×2 ROM with enable and address inputs a, b, c and outputs y, z; word 0 … word 7 hold the y z columns of the truth table.]
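The truth table above maps directly onto a lookup: the inputs form the address, and the stored word is the pair of outputs. The sketch below is illustrative only; the names `ROM` and `rom_read` are not from the slides, and it assumes `a` is the most significant address bit, following the row order of the truth table.

```python
# Contents of the 8 x 2 ROM: word i holds (y, z) for address bits a b c = i.
ROM = [
    (0, 0),  # word 0: abc = 000
    (0, 1),  # word 1: abc = 001
    (0, 1),  # word 2: abc = 010
    (1, 0),  # word 3: abc = 011
    (1, 0),  # word 4: abc = 100
    (1, 1),  # word 5: abc = 101
    (1, 1),  # word 6: abc = 110
    (1, 1),  # word 7: abc = 111
]


def rom_read(a, b, c, enable=1):
    """Return outputs (y, z) for inputs a, b, c.

    Assumption: a is the most significant address bit. Returning None
    when enable is 0 is a modeling choice for this sketch.
    """
    if not enable:
        return None
    address = (a << 2) | (b << 1) | c
    return ROM[address]


print(rom_read(0, 1, 1))
```

No logic gates are evaluated here: the ROM simply stores every row of the truth table, which is why k inputs and n outputs need 2^k words of n bits.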
Mask-programmed ROM

⚫ Connections “programmed” at fabrication
  ⚫ set of masks
⚫ Lowest write ability
  ⚫ only once
⚫ Highest storage permanence
  ⚫ bits never change unless damaged
⚫ Typically used for final design of high-volume systems
  ⚫ spread out NRE cost for a low unit cost

OTP ROM: One-time programmable ROM

⚫ Connections “programmed” after manufacture by user
  ⚫ user provides file of desired contents of ROM
  ⚫ file input to machine called ROM programmer
  ⚫ each programmable connection is a fuse
  ⚫ ROM programmer blows fuses where connections should not exist
⚫ Very low write ability
  ⚫ typically written only once and requires ROM programmer device
⚫ Very high storage permanence
  ⚫ bits don’t change unless reconnected to programmer and more fuses blown
⚫ Commonly used in final products
  ⚫ cheaper, harder to inadvertently modify

EPROM: Erasable programmable ROM

⚫ Programmable component is a MOS transistor
  ⚫ Transistor has “floating” gate surrounded by an insulator
  ⚫ (a) Negative charges form a channel between source and drain storing a logic 1
  ⚫ (b) Large positive voltage at gate causes negative charges to move out of channel and get trapped in floating gate storing a logic 0
  ⚫ (c) (Erase) Shining UV rays on surface of floating-gate causes negative charges to return to channel from floating gate restoring the logic 1
  ⚫ (d) An EPROM package showing quartz window through which UV light can pass

[Figure: floating-gate transistor with source, drain, and floating gate: (a) 0 V at gate, (b) +15 V at gate, (c) UV erase, 5-30 min, (d) EPROM package.]

EPROM (continued)

⚫ Better write ability
  ⚫ can be erased and reprogrammed thousands of times
⚫ Reduced storage permanence
  ⚫ program lasts about 10 years but is susceptible to radiation and electric noise
⚫ Typically used during design development
EEPROM: Electrically erasable programmable ROM

⚫ Programmed and erased electronically
  ⚫ typically by using higher than normal voltage
  ⚫ can program and erase individual words
⚫ Better write ability
  ⚫ can be erased and programmed tens of thousands of times
  ⚫ can be in-system programmable with built-in circuit to provide higher than normal voltage
    ⚫ built-in memory controller commonly used to hide details from memory user
  ⚫ writes very slow due to erasing and programming
    ⚫ “busy” pin indicates to processor EEPROM still writing

EEPROM (continued)

⚫ Similar storage permanence to EPROM (about 10 years)
⚫ Far more convenient than EPROMs, but more expensive

Flash Memory

⚫ Extension of EEPROM
  ⚫ Same floating gate principle, write ability and storage permanence
⚫ Fast erase
  ⚫ Large blocks of memory erased at once, rather than one word at a time
  ⚫ Blocks typically several thousand bytes large
⚫ Writes to single words may be slower
  ⚫ Entire block must be read, word updated, then entire block written back
⚫ Used with embedded systems storing large data items in nonvolatile memory
  ⚫ e.g., digital cameras, TV set-top boxes, cell phones

RAM: Random Access Memory

⚫ Typically volatile memory
  ⚫ bits are not held without power supply
⚫ Read and written to easily during program execution
⚫ Internal structure more complex than ROM

[Figure: external view. A 2^k × n read and write memory with inputs r/w, enable, and address lines A0 … Ak-1, and data lines Qn-1 … Q0.]
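The flash slide notes that a single-word write means reading the entire block, updating the word, and writing the entire block back. A minimal sketch of that sequence follows. The class `FlashSketch`, the block size of 4096 words, and the erased value 0xFF are assumptions for illustration; the slides say only that blocks are "several thousand bytes large".

```python
BLOCK_SIZE = 4096  # assumed block size; not specified in the slides
ERASED = 0xFF      # assumed value of an erased word


class FlashSketch:
    """Toy model of block-oriented flash writes (illustrative only)."""

    def __init__(self, num_blocks):
        self.blocks = [[ERASED] * BLOCK_SIZE for _ in range(num_blocks)]
        self.erase_count = 0

    def erase_block(self, block):
        # A whole block is erased at once, never a single word.
        self.blocks[block] = [ERASED] * BLOCK_SIZE
        self.erase_count += 1

    def write_word(self, address, value):
        block, offset = divmod(address, BLOCK_SIZE)
        buffer = list(self.blocks[block])  # 1. read the entire block
        buffer[offset] = value             # 2. update the one word
        self.erase_block(block)            # 3. erase the block
        self.blocks[block] = buffer        # 4. write the entire block back

    def read_word(self, address):
        block, offset = divmod(address, BLOCK_SIZE)
        return self.blocks[block][offset]


flash = FlashSketch(num_blocks=2)
flash.write_word(5, 0x12)
print(flash.read_word(5), flash.erase_count)
```

Each single-word write costs one full block erase in this model, which is why writes to single words may be slower than on an EEPROM that can erase individual words.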
RAM: Random Access Memory (continued)

⚫ a word consists of several memory cells, each storing 1 bit
⚫ each input and output data line connects to each cell in its column
⚫ rd/wr connected to every cell
⚫ when row is enabled by decoder, each cell has logic that stores input data bit when rd/wr indicates write or outputs stored bit when rd/wr indicates read

[Figure: internal view of 4×4 RAM. A 2×4 decoder with enable and address inputs A0, A1 selects a row of memory cells; input data lines I3 I2 I1 I0; output data lines Q3 Q2 Q1 Q0; rd/wr goes to every cell.]

Basic types of RAM

⚫ SRAM: Static RAM
  ⚫ Memory cell uses flip-flop to store bit
  ⚫ Requires 6 transistors
  ⚫ Holds data as long as power supplied

[Figure: SRAM memory cell with Data', Data, and W lines.]

Basic types of RAM (continued)

⚫ DRAM: Dynamic RAM
  ⚫ Memory cell uses MOS transistor and capacitor to store bit
  ⚫ More compact than SRAM
  ⚫ “Refresh” required due to capacitor leak
    ⚫ word’s cells refreshed when read
  ⚫ Typical refresh rate 15.625 microsec.
  ⚫ Slower to access than SRAM

[Figure: DRAM memory cell with Data and W lines.]

RAM variations

⚫ PSRAM: Pseudo-static RAM
  ⚫ DRAM with built-in memory refresh controller
  ⚫ Popular low-cost high-density alternative to SRAM
⚫ NVRAM: Nonvolatile RAM
  ⚫ Holds data after external power removed
  ⚫ Battery-backed RAM
  ⚫ SRAM with EEPROM or flash
RAM variations (continued)

⚫ Battery-backed RAM
  ⚫ SRAM with own permanently connected battery
  ⚫ writes as fast as reads
  ⚫ no limit on number of writes unlike nonvolatile ROM-based memory
⚫ SRAM with EEPROM or flash
  ⚫ stores complete RAM contents on EEPROM or flash before power turned off

Example: HM6264 & 27C256 RAM/ROM devices

⚫ Low-cost, low-capacity memory devices
⚫ Commonly used in 8-bit microcontroller-based embedded systems
⚫ First two numeric digits indicate device type
  ⚫ RAM: 62, ROM: 27
⚫ Subsequent digits indicate capacity in Kbit

Block diagrams (signal: pins):

HM6264
  data<7…0>: 11-13, 15-19
  addr<15...0>: 2, 23, 21, 24, 25, 3-10
  /OE: 22
  /WE: 27
  /CS1: 20
  CS2: 26

27C256
  data<7…0>: 11-13, 15-19
  addr<15...0>: 27, 26, 2, 23, 21, 24, 25, 3-10
  /OE: 22
  /CS: 20

Device characteristics:

Device    Access Time (ns)    Standby Pwr. (mW)    Active Pwr. (mW)    Vcc Voltage (V)
HM6264    85-100              .01                  15                  5
27C256    90                  .5                   100                 5

[Figure: timing diagrams for the read operation (signals data, addr, OE, /CS1, CS2) and the write operation (signals data, addr, WE, /CS1, CS2).]

Composing memory

⚫ Memory size needed often differs from size of readily available memories
⚫ When available memory is larger, simply ignore unneeded high-order address bits and higher data lines
⚫ When available memory is smaller, compose several smaller memories into one larger memory
  ⚫ Connect side-by-side to increase width of words
  ⚫ Connect top to bottom to increase number of words
    ⚫ added high-order address line selects smaller memory containing desired word using a decoder
  ⚫ Combine techniques to increase number and width of words

Composing memory (continued)

⚫ Increase width of words

[Figure: a 2^m × 3n ROM built from three 2^m × n ROMs placed side by side. All three share the enable input and address lines A0 … Am; their outputs together form Q3n-1 … Q2n-1 … Q0.]
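Composing memory (continued)

⚫ Increase number of words

[Figure: a 2^(m+1) × n ROM built from two 2^m × n ROMs. Address lines A0 … Am-1 go to both ROMs; Am drives a 1×2 decoder (with enable) that enables one of the two ROMs; outputs Qn-1 … Q0.]

Composing memory (continued)

⚫ Increase number and width of words

[Figure: grid of smaller memories combining both techniques, with address input A, enable, and outputs.]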
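Both composition techniques can be sketched in a few lines, with each small ROM modeled as a Python list of integer words. The function names `compose_words` and `compose_width` are illustrative and not from the slides, and the choice that the first ROM supplies the most significant output bits is an assumption.

```python
def compose_words(low_rom, high_rom):
    """Stack two equal-size ROMs top to bottom: doubles the number of words.

    The added high-order address bit plays the role of the 1x2 decoder
    and selects which ROM holds the desired word.
    """
    if len(low_rom) != len(high_rom):
        raise ValueError("ROMs must have the same number of words")
    words = len(low_rom)

    def read(address):
        select, local = divmod(address, words)  # select = high-order bit
        return (low_rom, high_rom)[select][local]

    return read


def compose_width(roms, n):
    """Place n-bit ROMs side by side: same address to all, wider word out.

    Assumption: the first ROM in the list supplies the most significant bits.
    """
    def read(address):
        word = 0
        for rom in roms:
            word = (word << n) | rom[address]
        return word

    return read


taller = compose_words([1, 2, 3, 4], [5, 6, 7, 8])
wider = compose_width([[0xA, 0xB], [0x1, 0x2], [0xF, 0x0]], n=4)
print(taller(5), hex(wider(0)))
```

Combining the two helpers, by stacking memories that are themselves built side by side, corresponds to increasing both the number and the width of words.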

Memory hierarchy

[Figure: memory hierarchy, listed from top to bottom: Registers, Cache, Main memory, Disk, Tape.]

Cache

⚫ Usually designed with SRAM
  ⚫ faster but more expensive than DRAM
⚫ Usually on same chip as processor
  ⚫ space limited, so much smaller than off-chip main memory
  ⚫ faster access (1 cycle vs. several cycles for main memory)
Cache (continued)

⚫ Cache operation:
  ⚫ Request for main memory access (read or write)
  ⚫ First, check cache for copy
    ⚫ cache hit
      ⚫ copy is in cache, quick access
    ⚫ cache miss
      ⚫ copy not in cache, read address and possibly its neighbors into cache
⚫ Several cache design choices
  ⚫ cache mapping, replacement policies, and write techniques

Cache mapping

⚫ Far fewer number of available cache addresses
⚫ Are address’ contents in cache?
⚫ Cache mapping used to assign main memory address to cache address and determine hit or miss
⚫ Caches partitioned into indivisible blocks or lines of adjacent memory addresses
  ⚫ usually 4 or 8 addresses per line

Direct mapping

⚫ Main memory address divided into 2 fields
  ⚫ Index: cache address
    ⚫ number of bits determined by cache size
  ⚫ Tag
    ⚫ compared with tag stored in cache at address indicated by index
    ⚫ if tags match, check valid bit
⚫ Valid bit
  ⚫ indicates whether data in slot has been loaded from memory
⚫ Offset
  ⚫ used to find particular word in cache line

[Figure: direct mapping. The address is split into Tag, Index, and Offset; the index selects one cache entry (V T D); the stored tag is compared (=) with the address tag to produce Valid, and the offset selects the word output as Data.]

Fully associative mapping

⚫ Complete main memory address stored in each cache address
⚫ All addresses stored in cache simultaneously compared with desired address
⚫ Valid bit and offset same as direct mapping

[Figure: fully associative mapping. The address is split into Tag and Offset; the tag is compared (=) simultaneously with every cache entry (V T D) to produce Valid and Data.]
Set-associative mapping

⚫ Compromise between direct mapping and fully associative mapping
  ⚫ Index same as in direct mapping
  ⚫ But, each cache address contains content and tags of 2 or more memory address locations
  ⚫ Tags of that set simultaneously compared as in fully associative mapping
  ⚫ Cache with set size N called N-way set-associative
    ⚫ 2-way, 4-way, 8-way are common

[Figure: set-associative mapping. The address is split into Tag, Index, and Offset; the index selects a set of entries (V T D, V T D); each stored tag is compared (=) with the address tag to produce Valid and Data.]
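The three mapping schemes differ only in how many slots share an index, so one sketch can cover them: with 1 way it behaves as direct mapping, and with a single set it behaves as fully associative mapping. The names `split_address` and `SetAssociativeCache` are illustrative, and the field widths are parameters chosen by the caller, not values from the slides. Hardware compares the tags of a set in parallel; the loop below does it sequentially.

```python
def split_address(address, offset_bits, index_bits):
    """Split a main memory address into (tag, index, offset) fields."""
    offset = address & ((1 << offset_bits) - 1)
    index = (address >> offset_bits) & ((1 << index_bits) - 1)
    tag = address >> (offset_bits + index_bits)
    return tag, index, offset


class SetAssociativeCache:
    """Toy N-way set-associative cache lookup (illustrative only)."""

    def __init__(self, ways, index_bits, offset_bits):
        self.index_bits = index_bits
        self.offset_bits = offset_bits
        # Each set holds `ways` slots of [valid, tag, data_line].
        self.sets = [[[False, None, None] for _ in range(ways)]
                     for _ in range(1 << index_bits)]

    def lookup(self, address):
        """Return the cached word on a hit, or None on a miss."""
        tag, index, offset = split_address(
            address, self.offset_bits, self.index_bits)
        for valid, slot_tag, line in self.sets[index]:
            if valid and slot_tag == tag:  # tags match and valid bit set
                return line[offset]        # offset picks the word in the line
        return None

    def fill(self, address, line, way):
        """Load a line from memory into the given way of the indexed set."""
        tag, index, _ = split_address(
            address, self.offset_bits, self.index_bits)
        self.sets[index][way] = [True, tag, list(line)]


cache = SetAssociativeCache(ways=2, index_bits=2, offset_bits=2)
cache.fill(182, [10, 20, 30, 40], way=0)
print(cache.lookup(182), cache.lookup(84))
```

Which way to fill on a miss is left to the caller here; that decision is the job of the replacement policy.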

Cache-replacement policy

⚫ Technique for choosing which block to replace
  ⚫ when fully associative cache is full
  ⚫ when set-associative cache’s line is full
⚫ Direct mapped cache has no choice
⚫ Random
  ⚫ replace block chosen at random
⚫ LRU: least-recently used
  ⚫ replace block not accessed for longest time

Cache-replacement policy (continued)

⚫ FIFO: first-in-first-out
  ⚫ push block onto queue when accessed
  ⚫ choose block to replace by popping queue
⚫ LFU: least-frequently used
  ⚫ replace the block that has been accessed the fewest times
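As one concrete case, LRU replacement for a single set can be sketched with Python's `collections.OrderedDict`, which keeps keys in insertion order and can move a key to the end on each access. The class name `LRUSet` is illustrative and not from the slides; real caches usually approximate LRU in hardware rather than keeping an exact ordering.

```python
from collections import OrderedDict


class LRUSet:
    """Toy LRU replacement for one cache set (illustrative only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.blocks = OrderedDict()  # least recently used block first

    def access(self, tag):
        """Access a block; return the evicted tag, or None if none evicted."""
        if tag in self.blocks:
            self.blocks.move_to_end(tag)  # hit: now the most recently used
            return None
        evicted = None
        if len(self.blocks) >= self.capacity:
            # Miss with a full set: drop the block not accessed for longest.
            evicted, _ = self.blocks.popitem(last=False)
        self.blocks[tag] = True
        return evicted


s = LRUSet(capacity=2)
for tag in ["A", "B", "A", "C"]:
    print(tag, "evicts", s.access(tag))
```

In this run the access to C evicts B, not A, because A was touched more recently. A FIFO policy would differ only in skipping the `move_to_end` step on a hit.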
Cache write techniques

⚫ Write-through
  ⚫ write to main memory whenever cache is written to
  ⚫ easiest to implement
  ⚫ processor must wait for slower main memory write
  ⚫ potential for unnecessary writes
⚫ Write-back
  ⚫ main memory only written when “dirty” block replaced
  ⚫ extra dirty bit for each block set when cache block written to
  ⚫ reduces number of slow main memory writes

Cache impact on system performance

⚫ Most important parameters in terms of performance:
  ⚫ Total size of cache
    ⚫ total number of data bytes cache can hold
    ⚫ tag, valid and other housekeeping bits not included in total
  ⚫ Degree of associativity
  ⚫ Data block size

Cache impact on system performance (continued)

⚫ Larger caches achieve lower miss rates but higher access cost
  ⚫ e.g.,
    ⚫ 2 Kbyte cache: miss rate = 15%, hit cost = 2 cycles, miss cost = 20 cycles
      ⚫ avg. cost of memory access = (0.85 * 2) + (0.15 * 20) = 4.7 cycles
    ⚫ 4 Kbyte cache: miss rate = 6.5%, hit cost = 3 cycles, miss cost will not change
      ⚫ avg. cost of memory access = (0.935 * 3) + (0.065 * 20) = 4.105 cycles (improvement)
    ⚫ 8 Kbyte cache: miss rate = 5.565%, hit cost = 4 cycles, miss cost will not change
      ⚫ avg. cost of memory access = (0.94435 * 4) + (0.05565 * 20) = 4.8904 cycles (worse)

Cache performance trade-offs

⚫ Improving cache hit rate without increasing size
  ⚫ Increase line size
  ⚫ Change set-associativity

[Figure: % cache miss (vertical axis, 0 to 0.16) versus cache size (horizontal axis: 1 Kb, 2 Kb, 4 Kb, 8 Kb, 16 Kb, 32 Kb, 64 Kb, 128 Kb), with one curve each for 1 way, 2 way, 4 way, and 8 way.]
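The average-cost arithmetic for the 2, 4, and 8 Kbyte caches follows one formula: (1 - miss rate) × hit cost + miss rate × miss cost. A short sketch, with the illustrative function name `average_access_cost`, recomputes the three cases using the miss rates and cycle counts given in the slides.

```python
def average_access_cost(miss_rate, hit_cost, miss_cost):
    """Average cycles per memory access for a given miss rate."""
    return (1 - miss_rate) * hit_cost + miss_rate * miss_cost


# (cache size in Kbytes, miss rate, hit cost in cycles) from the slides.
cases = [(2, 0.15, 2), (4, 0.065, 3), (8, 0.05565, 4)]
for size_kb, miss_rate, hit_cost in cases:
    cost = average_access_cost(miss_rate, hit_cost, miss_cost=20)
    print(f"{size_kb} Kbyte cache: {cost:.4f} cycles")
```

The 8 Kbyte case shows the trade-off: its miss rate is the lowest of the three, yet the higher hit cost makes its average access cost the highest.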
Advanced RAM

⚫ DRAMs commonly used as main memory in processor based embedded systems
  ⚫ high capacity, low cost
⚫ Many variations of DRAMs proposed
  ⚫ FPM DRAM: Fast Page Mode DRAM
  ⚫ EDO DRAM: Extended Data Out DRAM
  ⚫ SDRAM/ESDRAM: Synchronous and Enhanced Synchronous DRAM
  ⚫ RDRAM: Rambus DRAM

Basic DRAM

⚫ Address bus multiplexed between row and column components
⚫ Row and column addresses are latched in, sequentially, by strobing ras and cas signals, respectively
⚫ Refresh circuitry can be external or internal to DRAM device
  ⚫ strobes consecutive memory address periodically causing memory content to be refreshed
  ⚫ Refresh circuitry disabled during read or write operation
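The multiplexed address bus can be illustrated by splitting a full address into its row and column components and listing the order in which they are strobed. This is a sketch under stated assumptions: the function names are illustrative, the row is taken from the high-order bits, and `col_bits` is a caller-chosen parameter, none of which are specified in the slides.

```python
def split_dram_address(address, col_bits):
    """Split a full address into (row, column) components.

    Assumption: the row is the high-order part of the address.
    """
    row = address >> col_bits
    col = address & ((1 << col_bits) - 1)
    return row, col


def access_sequence(address, col_bits):
    """List the (strobe, value) pairs placed on the shared address bus."""
    row, col = split_dram_address(address, col_bits)
    return [("ras", row), ("cas", col)]  # row latched first, then column


print(access_sequence(0x1234, col_bits=8))
```

Because both halves travel over the same pins, a DRAM needs roughly half as many address pins as the full address width, at the cost of two strobes per access.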

Basic DRAM (continued)

[Figure: basic DRAM block diagram. Blocks: Refresh Circuit, Row Addr. Buffer, Row Decoder, Col Addr. Buffer, Col Decoder, Sense Amplifiers, Bit storage array, Data In Buffer, Data Out Buffer. Signals: address, data, rd/wr, ras, cas, clock.]

Fast Page Mode DRAM (FPM DRAM)

⚫ Each row of memory bit array is viewed as a page
⚫ Page contains multiple words
⚫ Individual words addressed by column address
⚫ Timing diagram:
  ⚫ row (page) address sent
  ⚫ 3 words read consecutively by sending column address for each
⚫ Extra cycle eliminated on each read/write of words from same page

[Figure: FPM DRAM timing diagram. Signals ras, cas, address (row, col, col, col), and data (data, data, data).]


Extended data out DRAM (EDO DRAM)

⚫ Improvement of FPM DRAM
⚫ Extra latch before output buffer
  ⚫ allows strobing of cas before data read operation completed
⚫ Reduces read/write latency by additional cycle

[Figure: EDO DRAM timing diagram. Signals ras, cas, address (row, col, col, col), and data (data, data, data); annotated "Speedup through overlap".]

(S)ynchronous and Enhanced Synchronous (ES) DRAM

⚫ SDRAM latches data on active edge of clock
⚫ Eliminates time to detect ras/cas and rd/wr signals
⚫ A counter is initialized to column address then incremented on active edge of clock to access consecutive memory locations
⚫ ESDRAM improves SDRAM
  ⚫ added buffers enable overlapping of column addressing
  ⚫ faster clocking and lower read/write latency possible

[Figure: SDRAM timing diagram. Signals clock, ras, cas, address (row, col), and data (data, data, data).]

Rambus DRAM (RDRAM)

⚫ More of a bus interface architecture than DRAM architecture
⚫ Data is latched on both rising and falling edge of clock
⚫ Broken into 4 banks each with own row decoder
  ⚫ can have 4 pages open at a time
⚫ Capable of very high throughput

DRAM integration problem

⚫ SRAM easily integrated on same chip as processor
⚫ DRAM more difficult
  ⚫ Different chip making process between DRAM and conventional logic
  ⚫ Goal of conventional logic (IC) designers:
    ⚫ minimize parasitic capacitance to reduce signal propagation delays and power consumption
  ⚫ Goal of DRAM designers:
    ⚫ create capacitor cells to retain stored information
  ⚫ Integration processes beginning to appear
Memory Management Unit (MMU)

⚫ Duties of MMU
  ⚫ Handles DRAM refresh, bus interface and arbitration
  ⚫ Takes care of memory sharing among multiple processors
  ⚫ Translates logical memory addresses from processor to physical memory addresses of DRAM
⚫ Modern CPUs often come with MMU built-in
⚫ Single-purpose processors can be used
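The slides do not say how the MMU translates logical addresses to physical ones. One common scheme is paging, sketched below purely as an assumption: the 4 KiB page size, the dictionary used as a page table, and the function name `translate` are all hypothetical and not taken from the slides.

```python
PAGE_BITS = 12  # assumed 4 KiB pages; not specified in the slides


def translate(logical_address, page_table):
    """Translate a logical address to a physical address.

    page_table is a hypothetical mapping from logical page number to
    physical frame number; the offset within the page is unchanged.
    """
    page = logical_address >> PAGE_BITS
    offset = logical_address & ((1 << PAGE_BITS) - 1)
    if page not in page_table:
        raise LookupError(f"no mapping for logical page {page}")
    return (page_table[page] << PAGE_BITS) | offset


# Logical page 1 is mapped to physical frame 7 in this example table.
print(hex(translate(0x1ABC, {1: 7})))
```

Here logical address 0x1ABC lies in page 1 at offset 0xABC, so it maps to frame 7 at the same offset.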
