PP16 Lec4 Arch3

This document discusses parallel processing architectures and platforms. It covers:
- Explicitly parallel processor architectures, including SIMD and MIMD systems.
- Memory configurations, including shared memory, distributed memory, and the differences between physical and logical memory.
- Inter-processor communication methods, including shared memory, message passing, and different interconnect technologies.
- Programming models such as SPMD and MPMD and how they apply to different architectures.
- Examples of parallel platforms, including SMP, clusters, and vector/array processors.


Parallel Processing

Spring 2016, Lecture #4
Dr M Shamim Baig

1.1

Explicitly Parallel Processor Architectures:
Task-Level Parallelism

1.2

Elements of (Explicit) Parallel Architectures

Processor configurations:
- Instruction/data-stream based
Memory configurations:
- Physical & logical organization based
- Access-delay based
Inter-processor communication:
- Communication-interface design
- Data exchange / synch approach
1.3

Example SIMD & MIMD Systems

Variants of SIMD have found use in coprocessing units such as the MMX units in Intel processors, in DSP chips such as the Sharc, & in Nvidia graphics processors (GPUs).
Examples of MIMD platforms include current-generation Sun Ultra Servers, SGI Origin Servers, multiprocessor PCs, workstation clusters & the IBM SP.

1.4

Ex: Conditional Execution in SIMD Processors

It is often necessary to selectively turn off operations on certain data items. For this, most SIMD programming paradigms provide an "activity mask", which determines whether each processor participates in a computation or not. A sketch of this masking scheme is given after the slide.

Executing a conditional statement on an SIMD computer with four processors:
(a) the conditional statement; (b) the execution of the statement in two steps.

1.5
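
The two-step masked execution above can be emulated in plain C. This is a minimal sketch, not from the slides: the conditional itself is not reproduced here, so the classic if (b == 0) c = a; else c = a/b; example is assumed, with one array slot standing in for each of the four SIMD processors.

#include <stdio.h>

#define LANES 4  /* one array slot per SIMD processor */

int main(void) {
    int a[LANES] = {5, 4, 1, 0};
    int b[LANES] = {0, 2, 1, 0};   /* illustrative data */
    int c[LANES];
    int active[LANES];             /* the activity mask */

    /* Step 1: processors with b == 0 are active, the rest idle. */
    for (int i = 0; i < LANES; i++) active[i] = (b[i] == 0);
    for (int i = 0; i < LANES; i++)
        if (active[i]) c[i] = a[i];

    /* Step 2: the mask is inverted; the previously idle
     * processors now execute the else-branch. */
    for (int i = 0; i < LANES; i++)
        if (!active[i]) c[i] = a[i] / b[i];

    for (int i = 0; i < LANES; i++)
        printf("c[%d] = %d\n", i, c[i]);
    return 0;
}

All four "processors" see both steps, but the mask keeps each one idle during the branch it does not take, which is exactly the cost of conditionals on SIMD hardware.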

Programming Models: MPMD / SPMD

There are two programming models for parallel processing. In Multiple-Program Multiple-Data (MPMD), different processors execute different programs. In Single-Program Multiple-Data (SPMD), all processors execute the same program on different parts of the data.
An SIMD system can execute only one program, which works on different parts of the data. An MIMD system can execute the same or different programs, which likewise work on different parts of the data.
Hence SIMD supports only the SPMD programming model. Although MIMD supports both models (MPMD & SPMD), SPMD is the preferred choice because it simplifies software management. An SPMD sketch is given after the slide.
1.6
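
A minimal SPMD sketch, assuming MPI (the slide names the model, not a library): every process runs this same program, and the rank returned by MPI_Comm_rank steers each process to different work, so MPMD-like behaviour can be recovered inside the SPMD model.

/* Build/run (typical): mpicc spmd.c -o spmd && mpirun -np 4 ./spmd */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);  /* my process id */
    MPI_Comm_size(MPI_COMM_WORLD, &size);  /* total processes */

    if (rank == 0)
        printf("rank 0 of %d: coordinating\n", size);  /* master role */
    else
        printf("rank %d of %d: computing my data slice\n", rank, size);

    MPI_Finalize();
    return 0;
}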

Comparison: SIMD vs MIMD

Control flow: synchronous in SIMD vs asynchronous in MIMD.
Programming model: SIMD supports only the SPMD prog-model, while MIMD supports both (SPMD & MPMD) prog-models.
Cost: SIMD computers require less hardware than MIMD computers (a single control unit). However, since SIMD processors are specially designed, they tend to be expensive & have long design cycles. In contrast, MIMD processors can be built from inexpensive off-the-shelf components with relatively little effort in a short time.
Flexibility: SIMD performs very well for specialized, regularly structured applications (e.g. image processing) but not for all applications, while MIMD is more flexible & general-purpose.
1.7

Elements of (Explicit) Parallel Architectures

Processor configurations:
- Instruction/data-stream based
Memory configurations:
- Physical & logical organization based
- Access-delay based
Inter-processor communication:
- Communication-interface design
- Data exchange / synch approach
1.8

Parallel Platforms:
Memory (Physical vs Logical) Configurations

Physical memory config: SM, DM, CSM
Logical address-space config: SAS, NSAS
Combinations:
- CSM + SAS (SMP; UMA)
- DM + SAS (DSM; NUMA)
- DM + NSAS (Multicomputer / Clusters)

1.9

Shared-Memory (SM) Multiprocessor

It is important to note the difference between the terms Shared Memory & Shared Address Space. The former is a physical memory configuration, while the latter is the logical address-space view presented to a program. It is possible to provide a shared address space using physically distributed memory.
SM-multiprocessor systems are SAS-based, with the physical memory configured either as CSM or as DM (giving DSM).

1.10

UMA vs NUMA

SM-multiprocessors are further categorized by memory-access delay as UMA (uniform memory access) & NUMA (non-uniform memory access).
A UMA system is based on the (CSM + SAS) config, where each processor has the same delay when accessing any memory location.
A NUMA system is based on the (DM + SAS = DSM) config, where a processor may see different delays when accessing different memory locations.

1.11

UMA & NUMA Arch Block Diagrams

Both are SM-multiprocessors, differing in memory-access-delay format:
UMA (CSM + SAS) vs NUMA (DM + SAS = DSM)

[Figure] Typical shared-address-space architectures: (a) uniform-memory-access shared-address-space computer; (b) uniform-memory-access shared-address-space computer with caches and memories; (c) non-uniform-memory-access shared-address-space computer with local memory only.
1.12

Simplistic View of a Small Shared-Memory
Symmetric Multiprocessor (SMP): (CSM + SAS + Bus)

[Diagram: processors connected over a bus to the shared memory]

Examples:
Dual Pentiums
Quad Pentiums
An OpenMP sketch for such an SMP is given after the slide.
1.13
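
A minimal sketch of programming such a bus-based SMP, assuming OpenMP and gcc (the slide itself names no software): the loop iterations are divided among the processors, and all of them communicate through the single shared memory.

/* Compile with: gcc -fopenmp smp.c -o smp */
#include <omp.h>
#include <stdio.h>

int main(void) {
    double sum = 0.0;
    /* Each processor works on a slice of the iteration space;
     * the reduction combines the per-processor partial sums
     * through shared memory. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= 1000000; i++)
        sum += 1.0 / i;
    printf("harmonic(1e6) = %f using up to %d procs\n",
           sum, omp_get_num_procs());
    return 0;
}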

Quad Pentium Shared-Memory SMP

[Diagram: four processors, each with its own L1 cache, L2 cache & bus interface, sharing a processor/memory bus; a memory controller attaches the shared memory, and an I/O interface attaches the I/O bus]
1.14

Multicomputer (Cluster) Platform

Complete computers P (CU + PE), DM with NSAS, & an interconnection-network interface at the I/O-bus level.

[Diagram: computers, each with a processor & local memory, exchanging messages over an interconnection network]

These platforms comprise a set of processors, each with its own (exclusive, distributed) memory. Instances of such a view come naturally from non-shared-address-space (NSAS) multicomputers, e.g. clustered workstations.
1.15

Data Exchange/Synch Approaches:
Shared Data vs Message Passing

There are two primary approaches to data exchange/synchronization in parallel systems:
- the shared-data approach
- the message-passing approach
SM-multiprocessors use the shared-data approach for data exchange/synch. Multicomputers (clusters) use the message-passing approach. A shared-data sketch is given after the slide.
1.16
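
A minimal shared-data sketch, assuming POSIX threads (the slide names the approach, not an API): the threads exchange data simply by writing and reading the same variable, and a mutex provides the synchronization.

#include <pthread.h>
#include <stdio.h>

static long counter = 0;                 /* shared data */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);       /* synchronize access */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void) {
    pthread_t t[4];
    for (int i = 0; i < 4; i++) pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++) pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);  /* 400000 */
    return 0;
}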

Data Exchange/Synch Platforms:
Shared Memory vs Message Passing

Shared-memory platforms have low communication overhead and can support finer grain levels, while message-passing platforms have more communication overhead & are therefore better suited to coarser grain levels.
SM-multiprocessors are faster but have poor scalability; message-passing multicomputer platforms are slower but have higher scalability. A send/receive sketch is given after the slide.
1.17
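
A minimal message-passing sketch, again assuming MPI: every exchange is an explicit send/receive pair crossing the interconnect, which is the communication overhead the slide refers to.

/* Run with two processes, e.g. mpirun -np 2 ./a.out */
#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv) {
    int rank, value;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    if (rank == 0) {
        value = 42;
        /* Explicitly ship the data to process 1. */
        MPI_Send(&value, 1, MPI_INT, 1, 0, MPI_COMM_WORLD);
    } else if (rank == 1) {
        /* Block until the matching message arrives. */
        MPI_Recv(&value, 1, MPI_INT, 0, 0, MPI_COMM_WORLD,
                 MPI_STATUS_IGNORE);
        printf("rank 1 received %d\n", value);
    }
    MPI_Finalize();
    return 0;
}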

Clusters as a Computing Platform

Clusters: in the early 1990s, a network of computers became a very attractive alternative to the expensive supercomputers used for high-performance computing.
Several early projects, notably:
- the Berkeley NOW (Network of Workstations) project
- the NASA Beowulf project
1.18

Advantages of Cluster Computers (NOW-like)

- Very high performance workstations and PCs are readily available at low cost.
- The latest processors can easily be incorporated into the system as they become available.
- Easily scalable.
- Existing software can be used or easily modified.
1.19

Beowulf Clusters*

A group of interconnected commodity computers achieving high performance at low cost, typically using commodity interconnects (e.g. high-speed Ethernet) & a commodity OS (e.g. Linux).

* "Beowulf" comes from the name given to the NASA Goddard Space Flight Center cluster project.

1.20

Cluster Interconnects: LAN vs SAN

LANs: Fast / Gigabit / 10-Gigabit Ethernet
SANs: Myrinet, Quadrics, InfiniBand

Comparison, LAN vs SAN:
Distance: LANs span longer distances (km vs m), causing more delay, i.e. slower communication.
Reliability: LANs are designed for less reliable networks, so they include overhead (error correction etc.) which adds to delays.
Processing speed: LANs use OS calls, causing more processing delay.
1.21

Vector / Array Data Processors

Vector processor: 1-D temporal parallelism using a pipelined arithmetic unit & vector chaining.
Float-add pipeline stages: compare exponents, align mantissas, add mantissas, normalize. A stage-by-stage sketch is given after the slide.
Array processor: 1-D spatial parallelism using an ALU array as SIMD.
Systolic array: combines 2-D spatial parallelism with a pipelined computational wavefront.
Block diagrams of vector/array & systolic processing: ?????
1.22
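
A sketch of the four float-add pipeline stages named above, using a simplified base-10 (mantissa, exponent) number format of my own choosing, not from the slides. A real vector unit streams a new element pair into stage 1 every cycle, so all four stages operate on different vector elements simultaneously.

#include <stdio.h>

typedef struct { int mant; int exp; } Fp;   /* value = mant * 10^exp */

static Fp fp_add(Fp a, Fp b) {
    /* Stage 1: compare exponents. */
    int shift = a.exp - b.exp;
    /* Stage 2: align mantissas to the larger exponent. */
    while (shift > 0) { b.mant /= 10; b.exp++; shift--; }
    while (shift < 0) { a.mant /= 10; a.exp++; shift++; }
    /* Stage 3: add mantissas. */
    Fp r = { a.mant + b.mant, a.exp };
    /* Stage 4: normalize (keep at most 4 mantissa digits here). */
    while (r.mant >= 10000 || r.mant <= -10000) { r.mant /= 10; r.exp++; }
    return r;
}

int main(void) {
    Fp a = {9500, -3};                       /* 9.5 */
    Fp b = {750, -2};                        /* 7.5 */
    Fp c = fp_add(a, b);
    printf("%d x 10^%d\n", c.mant, c.exp);   /* prints 1700 x 10^-2, i.e. 17 */
    return 0;
}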

Summary: Parallel Platforms;
Memory & Interconnect Configurations

Memory config (physical vs logical):
- Physical memory config: SM, DM, CSM
- Logical address-space config: SAS, NSAS
- Combinations:
  CSM + SAS (SMP; UMA)
  DM + SAS (DSM; NUMA)
  DM + NSAS (Multicomputer / Clusters)

Interconnection network:
o Interface level: memory bus (using MBEU) in SM-multiprocessors (UMA, NUMA) vs I/O bus (using NIU) in multicomputers / clusters
o Data exchange / synch: shared-data model vs message-passing model
1.23
