Embedded Systems Notes
• Definitions
• Write Buffer
• Cache Policies
[Figure: the CPU issues an address to the cache controller; the controller indexes the cache and, on a miss, fetches the data from main memory over the address and data buses.]
The cache controller uses different portions of the address issued by the processor during a memory request to select parts of the cache memory.
DEFINITIONS
• Working Set: the set of memory locations the CPU refers to at any one time
• Cache Hit: when an address requested by the CPU is found in the cache
• Cache Miss: when the requested location is not in the cache
• Compulsory Miss (Cold Miss): occurs the first time a location is used
• Capacity Miss: caused by a working set too large to be accommodated in the cache
• Conflict Miss: occurs when two locations map to the same location in the cache
Cache Operation
A miss causes the cache controller to copy the data from main memory into the cache.
The data is forwarded to the CPU at the same time: data streaming.
The data occupying the cache block is evicted and replaced by the contents of the memory addresses requested by the CPU.
The controller must check whether the evicted data has been modified; if it has, the data must first be stored back to main memory.
Dirty bit: a status bit indicating whether the cache contents have changed.
Cache Mechanism
1. Direct Mapped Cache
In a direct-mapped cache, each memory address is associated with exactly one possible block within the cache.
The controller therefore looks in a single location in the cache for the data, if it exists in the cache.
Transfers between main memory and the cache take place in whole blocks.
The controller looks for a block at a fixed location in the cache.
DIRECT MAPPED CACHE
[Figure: a 4-entry direct-mapped cache (indices 0-3) alongside a 16-location memory (addresses 0-F).]
• Block size = 1 byte
• Cache location 0 can be occupied by data from memory locations 0, 4, 8, ...
• In general: a fixed mapping from memory locations to cache locations
[Figure: direct-mapped cache lookup — the stored tag is compared with the tag field of the address to produce the hit signal and the selected data.]
2 WAY SET ASSOCIATIVE CACHE ORGANIZATION
CONTENT ADDRESSABLE MEMORY
• CAM : Content addressable memory
• Uses a set of comparators to compare the input tag address with the cache-tag stored in each valid cache block
• A CAM produces an address if a given data value exists in the memory
• CAM enables many more cache-tags to be compared simultaneously
[Figure: 2-way set-associative organization — the address selects a set, tag comparators (the CAM) check both ways in parallel, and a mux produces the hit signal and data.]
Cache in ARM
Von Neumann Architecture:
Unified Cache: a single cache for both instructions and data
Harvard Architecture:
Split Cache: two separate caches,
an Instruction Cache and a Data Cache
WRITE BUFFER
• A small, fast FIFO that temporarily holds data that the processor would write to main memory
• The buffer is emptied to the slower main memory
• Reduces the time the processor takes to write small blocks of sequential data to main memory
• Write buffers improve cache performance
• During block eviction, the cache controller writes the dirty block to the write buffer instead of to main memory
[Figure: the CPU makes word/byte accesses to the cache and the write buffer; the write buffer drains to main memory, and block transfers take place between the cache and main memory.]
Some write buffers are not strictly FIFO; the ARM10 family supports write coalescing.
CACHE POLICIES
Write Policy
Writethrough: the cache controller writes to both the cache and main memory
Writeback: the controller writes only to the cache and sets the dirty bit
Writeback gives better cache performance, at the risk of cache and memory being inconsistent for a longer period
Block replacement Policy
Round-robin (cyclic) replacement: the block loaded most recently will not be replaced next;
the sequence in which blocks are replaced is known in advance.
This gives predictable performance,
but very bad worst-case performance.
Random selection: improves worst-case behavior.
Allocation Policy
Read allocate: allocation takes place only on a read
Read-write allocate: allocation takes place on either a read or a write
These policies are implemented in hardware.

ARM CACHED CORE POLICIES

Core      Write Policy              Replacement Policy    Allocation Policy
ARM720T   writethrough              random                read-miss
ARM740T   writethrough              random                read-miss
ARM920T   writethrough, writeback   random, round-robin   read-miss
ARM946E   writethrough, writeback   random, round-robin   read-miss
CACHE CONTROL IN ARM
• All standard ARM cached cores have memory facilities which are controlled by the System Control Coprocessor (CP15)
• The job of the system control coprocessor is to manage the standard memory facilities, including the cache
• CP15 has programmable registers through which the features and control functions of the cache are specified:
• Size of the cache and degree of associativity
• Enabling/disabling of cache operations
• Policy choices: write, replacement, allocation
• Memory areas can be marked as cacheable or not
• Memory-mapped I/O locations are not cacheable or bufferable (use of the write buffer is not possible)
Cache Lockdown
• Cache lockdown allows critical code and data to be loaded into the cache in such a way that the corresponding cache blocks are not re-allocated
• Example: high-priority interrupt routines and the data that they access
• For lockdown purposes, the cache in ARM is divided into lockdown blocks
• One block from each cache set can be marked as a locked block (data from main memory can be loaded into this block)
Multilevel Cache
• To minimize the cache miss rate caused by limitations in capacity,
• system designers may add a level 2 (L2) cache
• The L1 cache normally has single-cycle access, while the L2 cache has a latency of more than one
CPU cycle but less than that of main system memory
HIERARCHICAL MEMORY ARCHITECTURE
Optimizes data transfer rate and energy consumption.
The energy required for a DRAM access is more than 10 times that of an L2 cache access.