pdc1: MODULE 1: PARALLELISM FUNDAMENTALS
processing
• Pipelined and superscalar processors
SCOPE, VIT Chennai
Single instruction stream,
multiple data streams (SIMD)
[Figure: a single instruction stream is sent to several processing units; each unit operates on its own data stream (data streams 1, 2, 3).]
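The SIMD idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function `run_simd` and its arguments are hypothetical names, each list element stands in for one processing unit's data stream, and the loop runs sequentially, whereas real SIMD hardware applies the instruction to all units at the same time.

```python
def run_simd(instruction_stream, data_streams):
    """Apply every instruction, in order, to all data streams in lockstep."""
    # One entry per processing unit (hypothetical model: one value per unit).
    state = list(data_streams)
    for instruction in instruction_stream:
        # The same instruction is applied by every unit to its own data.
        state = [instruction(value) for value in state]
    return state


# A single instruction stream: "multiply by 2", then "add 1".
instruction_stream = [lambda x: x * 2, lambda x: x + 1]
# Three data streams, one per processing unit.
data_streams = [1, 2, 3]

result = run_simd(instruction_stream, data_streams)
print(result)  # [3, 5, 7]
```

Every unit sees the same instructions in the same order; only the data differs from unit to unit.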
[Figure: cores (P) with L1 data caches (L1D), L2 caches, and L3 cache.]
• Intel Dunnington
Quad Core Processor
• Separate L1 and L2 caches for each core
• Shared L3 cache
• L3 group is the whole chip
• Built-in memory interface allows memory and other sockets to be attached without a chipset (via HT/QPI)
• AMD Shanghai and Intel Nehalem
[Figure: four cores (P), each with its own L1D and L2 cache, sharing one L3 cache, with an HT/QPI link and a memory interface on the chip.]
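The cache grouping described above can be written down as a small table and queried. This is a minimal sketch, assuming a hypothetical encoding in which each cache level maps a core index to the ID of the cache it uses; the names `quad_core` and `shared_cache_level` are illustrative and do not come from any vendor tool.

```python
def shared_cache_level(core_a, core_b, topology):
    """Return the first cache level that both cores share, or "memory"."""
    for level in ("L1", "L2", "L3"):
        if topology[level][core_a] == topology[level][core_b]:
            return level
    return "memory"


# Assumed quad-core layout: private L1 and L2 per core, one L3 for the chip.
quad_core = {
    "L1": [0, 1, 2, 3],  # core i uses L1 cache i
    "L2": [0, 1, 2, 3],  # core i uses L2 cache i
    "L3": [0, 0, 0, 0],  # all cores use the same L3 cache
}

print(shared_cache_level(0, 3, quad_core))  # L3
print(shared_cache_level(2, 2, quad_core))  # L1
```

Two different cores first meet at L3, which is what "L3 group is the whole chip" expresses.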
Shared-memory
• A system in which a number of CPUs work on a common, shared physical address space
• Two varieties:
– Uniform Memory Access (UMA)
– Cache Coherent Non-Uniform Memory
Access (ccNUMA)
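As a rough software analogy for a shared address space, the sketch below lets several Python threads update one variable that all of them can see. The threads stand in for CPUs, the function name `run_shared_counter` and its parameters are hypothetical, and the example says nothing about the difference between UMA and ccNUMA; it only shows that all workers read and write the same memory and therefore need to coordinate.

```python
import threading


def run_shared_counter(num_workers=4, increments=1000):
    """Have several threads increment one counter held in shared memory."""
    shared = {"counter": 0}   # visible to every thread in the process
    lock = threading.Lock()   # serializes updates to the shared counter

    def worker():
        for _ in range(increments):
            with lock:
                shared["counter"] += 1

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared["counter"]


total = run_shared_counter()
print(total)  # 4000
```

No data is copied between workers; each one simply accesses the same location, which is the defining property of the shared-memory model.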
[Figure: processors with L1D and L2 caches connected through a chipset to memory.]
• I: invalid
[Figure: a cache line holding items A1 and A2 is present in two caches and in memory.]
[Figure: two groups of four cores (P), each core with its own L1D and L2 cache; each group shares an L3 cache and has a memory interface to its own memory; the two groups are joined by a coherent link.]
[Figure: four nodes, each with memory (M) and a network interface (NI), attached to a communication network.]