Basic Parallel Programming Methods
Real-Time Systems

Lecture Topic – Review of Basic Concurrent/Parallel Programming


Dr. Sam Siewert
Electrical, Computer and Energy Engineering
Embedded Systems Engineering Program

Copyright © 2019 University of Colorado


Flynn’s Taxonomy – Parallel Systems
SISD – Single core, no vector instructions
SIMD – Ideal for Large Bitwise, Integer, and Floating Point Vector Math
R-Pi 3b+/4 – MIMD Multi-core, NEON vector instructions
MIMD and SPMD Architecture often leverages GP-GPU Co-Processors
DSP VLIW (SIMD) or MIMD (e.g. Beagle Bone AI)

Flynn’s Taxonomy
                 Single Instruction/Prog                        Multiple Instruction
Single Data      SISD (Traditional Uni-processor)               MISD
Multiple Data    SIMD (SSE 4.2, Vector Processing)              MIMD
                 SPMD (Single Program Multiple Data), GP-GPU    MPMD (Multi-threaded Program, Multi-Data)

Parallel Programming for Speed-up
Demonstrations: Sharpen single core; Sharpen with thread grid
Both are threaded, but erast.c has semaphore locks and sharpen does not.
1. erast.c
• Without locks do we risk data corruption?
• Indivisible test and set?
• Concurrent reader and writer?
2. sharpen_grid.c
Can use Shared Memory with POSIX Threads – but may need locking!
– Locking will serialize and slow down code if sequential sections are too long
– erast.c vs. erastsimp.c is a good example
– Can we just run lockless?
Speed up is? – Linear?, Better?, Worse?

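The locking questions above can be illustrated with a minimal sketch. This is not erast.c: Python's `threading` module stands in for POSIX threads and semaphores, and the names `is_prime` and `count_primes` are hypothetical. Several threads update one shared counter, and the lock makes the read-modify-write indivisible.

```python
import threading


def is_prime(n):
    """Trial division; deliberately simple, not optimized."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(limit, num_threads=4, use_lock=True):
    """Count primes below `limit` using a counter shared by worker threads."""
    total = [0]              # shared state, like a global in a C program
    lock = threading.Lock()  # stands in for a semaphore or mutex

    def worker(start):
        # Each thread tests an interleaved slice: start, start + num_threads, ...
        for n in range(start, limit, num_threads):
            if is_prime(n):
                if use_lock:
                    with lock:        # critical section: read-modify-write
                        total[0] += 1
                else:
                    total[0] += 1     # unprotected update: increments can be lost

    threads = [threading.Thread(target=worker, args=(s,)) for s in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total[0]


if __name__ == "__main__":
    print(count_primes(10000))
```

The critical section here is a single increment, so the lock serializes very little work. Holding the lock around the whole primality test would serialize the threads, which is the slowdown the slide warns about. For the "indivisible test and set" question, `lock.acquire(blocking=False)` is the closest analogue in this sketch: it tests and takes the lock in one atomic step and returns `False` if another thread already holds it.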
Scaling and Bottlenecks
1 - Simple and Effective: turn on compiler optimization ~ 3x
Compiler Optimization
– Turn on higher levels of optimization
– Level 3 optimization: -O3 for gcc or g++
– Highest is -O4, but requires feedback optimization (gcc itself treats levels above -O3 as -O3)

2 - Simple and Sometimes Effective: turn on NEON SIMD ~ 1.f x
SIMD Vector Instructions
– Turn on SIMD (NEON) instruction generation on ARM A-Series
– Flynn’s taxonomy

3 - Harder and Mostly Effective: Grid to Map and Reduce ~ 3.2x
Using Multiple Cores
– Shared Memory POSIX Threads

Combine #1, #2, and #3

4 - Hardest and Highly Effective: Grid programming 128 SPs ~ 70x
Co-Processing
– Linux SMP
– With advanced platforms like Jetson Nano with CUDA

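The "Grid to Map and Reduce" step can be sketched as follows. This is an illustration of the structure only, not sharpen_grid.c: the kernel values and the names `sharpen_rows` and `sharpen_grid` are assumptions. CPython threads do not speed up CPU-bound work because of the global interpreter lock, so the figures quoted on the slide apply to the course's C and POSIX threads code, not to this sketch.

```python
from concurrent.futures import ThreadPoolExecutor

# A common 3x3 sharpening kernel (an assumption; the course's kernel may differ).
KERNEL = [[0, -1, 0],
          [-1, 5, -1],
          [0, -1, 0]]


def sharpen_rows(src, row_start, row_end):
    """Apply KERNEL to rows [row_start, row_end) of src; border pixels are copied."""
    height, width = len(src), len(src[0])
    band = []
    for y in range(row_start, row_end):
        row = []
        for x in range(width):
            if y == 0 or y == height - 1 or x == 0 or x == width - 1:
                row.append(src[y][x])
                continue
            acc = 0
            for ky in range(3):
                for kx in range(3):
                    acc += KERNEL[ky][kx] * src[y + ky - 1][x + kx - 1]
            row.append(min(255, max(0, acc)))  # clamp to the 8-bit range
        band.append(row)
    return band


def sharpen_grid(src, num_workers=4):
    """Split a non-empty image into row bands, sharpen each band, then combine."""
    height = len(src)
    # Map (split): contiguous row bands, at most one per worker.
    step = (height + num_workers - 1) // num_workers
    bounds = [(s, min(s + step, height)) for s in range(0, height, step)]
    # Apply: every worker reads the shared input but builds only its own band.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        bands = list(pool.map(lambda b: sharpen_rows(src, b[0], b[1]), bounds))
    # Reduce (combine): concatenate the bands in their original order.
    return [row for band in bands for row in band]
```

Each worker only reads the shared input and writes rows that no other worker touches, so this pattern needs no locks. The split and the final concatenation remain sequential.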
Theoretical Speed-Up – Linear at Best
Speed-Up < Linear
– Due to Sequential Section (Mapping - Split)
– Compared to Parallel Section (Gridded - Apply)
– …and Due to Final Step (Combine)

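The reason the speed-up stays below linear can be written out as a short worked equation. The symbols T_split, T_apply, and T_combine and the timings used below are assumptions introduced for illustration; S is the number of workers sharing the apply step.

\[
\text{Speed-Up}(S) = \frac{T_{split} + T_{apply} + T_{combine}}{T_{split} + \dfrac{T_{apply}}{S} + T_{combine}} < S
\quad \text{for } S > 1 \text{ and } T_{split} + T_{combine} > 0
\]

With hypothetical timings of 5 ms to split, 90 ms to apply, and 5 ms to combine, four workers give

\[
\text{Speed-Up}(4) = \frac{5 + 90 + 5}{5 + 90/4 + 5} = \frac{100}{32.5} \approx 3.08
\]

which is below the linear value of 4 because the split and combine steps are not divided among the workers.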
Parallel Processing Speed-up
Grid Data Processing Speed-up
1. Multi-Core, Multi-threaded, Macro-blocks/Frames
2. SIMD, Vector Instructions Operating over Large Words (Many Times Instruction Set Size)
3. Co-Processor Operates in Parallel to CPU(s)

SPMD – GPU or GP-GPU Co-Processor
– PCI-Express Bus Interfaces
– Transfer Program and Data to Co-Processor
– Threads and Blocks to Transform Data Concurrently

Image Data Processing – Few Data Dependencies
– Good Speed-up by Amdahl’s Law
– P=Parallel Portion
– (1-P)=Sequential Portion
– S=# of Cores (Concurrency)
– Overhead for Co-Processor
– IO for Co-Processing

\[ \text{Max\_Speed\_Up} = \frac{1}{(1 - P) + 0} \]
(S is infinite here)

\[ \text{Multicore\_Speed\_Up} = \frac{1}{(1 - P) + P / S} \]

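The two Amdahl's Law expressions above can be evaluated directly. This is a minimal sketch: the function names follow the slide's labels, and the value P = 0.95 is a hypothetical example, not a measurement from the course.

```python
def multicore_speed_up(p, s):
    """Amdahl's Law: 1 / ((1 - P) + P / S) for parallel portion p and s cores."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if s < 1:
        raise ValueError("s must be at least 1")
    return 1.0 / ((1.0 - p) + p / s)


def max_speed_up(p):
    """Limit of Amdahl's Law as S grows without bound: 1 / (1 - P)."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if p == 1.0:
        return float("inf")
    return 1.0 / (1.0 - p)


if __name__ == "__main__":
    p = 0.95  # hypothetical parallel portion
    for cores in (1, 2, 4, 128):
        print(cores, round(multicore_speed_up(p, cores), 2))
    print("limit", round(max_speed_up(p), 2))
```

The model has no term for the co-processor overhead and IO listed on the slide, so measured speed-up on a GPU would be expected to fall below these values.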
Conceptual View of Hardware Resources
Three-Space View of Utilization Requirements
– CPU Margin?
– IO Latency (and Bandwidth) Margin?
– Memory Capacity (and Latency) Margin?
Upper Right Front Corner – Low-Margin
Origin – High-Margin
CPU + I/O + Memory Bound?! – Bad day!

HPC vs. RT or Fair: Goal is to fully use All resources to scale!

[Figure: three-axis utilization space with axes CPU-Use, IO-Use, and Memory-Use; regions labeled CPU-bound, I/O-bound, memory-bound, "CPU, I/O, Mem bound", and "CPU, I/O, Mem Margin"]
