Week_4

The document discusses the evolution of parallel computing, emphasizing the shift from faster processors to wider, multicore architectures that require rethinking algorithms for parallel execution. It outlines the differences between generic multicore and many-core chips, focusing on memory hierarchies and cache designs, including private versus shared caches. The advantages of each cache type are also explored, highlighting their impact on performance and access speed.


Parallel Computing Landscape
(CS 526)

Muhammad Nadeem Nadir
Department of Computer Science,
The University of Lahore

The "New" Moore's Law
• Computers no longer get faster, just wider

• You must re-think your algorithms to be parallel!

• Data-parallel computing is the most scalable solution: 2 cores, 4 cores, 8 cores, 16 cores…
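The scaling claim can be sketched with a small data-parallel reduction: the same independent operation is mapped over every element, so adding cores adds throughput. A minimal sketch, assuming a pool of worker processes (the worker function and worker count below are illustrative, not from the slides):

```python
from multiprocessing import Pool

def square(x):
    # Independent per-element work: no communication between elements,
    # which is what makes the computation data-parallel and scalable.
    return x * x

def parallel_sum_of_squares(data, workers=4):
    # Map the same operation over every element across worker
    # processes, then reduce the partial results into one sum.
    with Pool(workers) as pool:
        return sum(pool.map(square, data))

if __name__ == "__main__":
    print(parallel_sum_of_squares(range(1000)))
```

Doubling `workers` (2, 4, 8, 16, …) leaves the result unchanged; only the wall-clock time shrinks on a wide enough machine.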
Generic Multicore Chip

[Diagram: a few processors, each with its own local on-chip memory, all connected to a shared global memory]

• A handful of processors, each supporting ~1 hardware thread

• On-chip memory near the processors (cache, RAM, or both)

• Shared global memory space (external DRAM)
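A quick way to see how "wide" a particular machine is — how many hardware threads the OS exposes — is available in the standard library; a minimal sketch:

```python
import os

# Number of logical CPUs visible to the OS
# (physical cores x hardware threads per core, on SMT machines).
logical_cpus = os.cpu_count()
print(f"logical CPUs: {logical_cpus}")
```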


Generic Many-core Chip

[Diagram: many processors, each with its own local on-chip memory, all connected to a shared global memory]

• Many processors, each supporting many hardware threads

• On-chip memory near the processors (cache, RAM, or both)

• Shared global memory space (external DRAM)


Emergence of Parallel Architectures

• Multi-core processors: processors having n computing cores

[Chart: transistor counts continue to grow, while clock speeds, power, and per-clock performance (ILP) have leveled off — the trend behind the shift to multi-cores]
The Memory Hierarchy

• With simultaneous multithreading (SMT) only: all caches are shared

• Multi-core chips:
– L1 caches are private
– L2 caches are private in some architectures and shared in others

• Main memory is always shared
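"Memory is always shared" is visible directly from software: threads running on different cores read and write the same address space, so one thread's writes are seen by the others. A minimal sketch (the counter and thread count are illustrative):

```python
import threading

counter = 0              # one location in the shared address space
lock = threading.Lock()  # hardware keeps caches coherent, but a concurrent
                         # read-modify-write still needs synchronization

def work(increments):
    global counter
    for _ in range(increments):
        with lock:
            counter += 1

threads = [threading.Thread(target=work, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(counter)  # 4 threads x 10,000 increments = 40000
```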


Multi-cores – Memory Hierarchies

• Example: dual-core Intel Xeon processors

• Each core is hyper-threaded (SMT)

• Private L1 caches

• Shared L2 cache

[Diagram: CORE0 and CORE1, each running hyper-threads with a private L1 cache, sharing a single L2 cache in front of memory]
Designs with Private L2 Caches

[Diagram, left: CORE0 and CORE1, each with a private L1 and a private L2 cache, in front of memory — both L1 and L2 are private. Examples: AMD Opteron, AMD Athlon, Intel Pentium D]

[Diagram, right: CORE0 and CORE1, each with private L1 and L2 caches, plus a shared L3 cache in front of memory — a design with L3 caches. Example: Intel Itanium 2]
Private vs Shared Caches?

• Advantages of each?
Private vs Shared Caches

• Advantages of private caches:
– Closer to the core, so faster access
– Reduced contention

• Advantages of shared caches:
– Threads on different cores can share the same cached data
– More cache space is available when only a single (or a few) high-performance thread(s) run on the system