This document discusses data dependence analysis and direction vectors which are used to determine whether loops can be parallelized or vectorized. It provides examples of how direction vectors classify dependence as forward, backward, or equal and explains how this relates to parallelizing loops. It also discusses using loop transformations like scalar expansion, renaming, interchange and fission to break dependencies and increase parallelism.

Uploaded by

supriyaa

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

28 views

Automatic Parallelization - 2: Y.N. Srikant

Uploaded by

supriyaa

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 30

Automatic Parallelization - 2

Y.N. Srikant

Department of Computer Science

Indian Institute of Science
Bangalore 560 012

NPTEL Course on Principles of Compiler Design

Y.N. Srikant Automatic Parallelization

Data Dependence Relations


Data Dependence Direction Vector
Data dependence relations are augmented with a direction of data dependence (direction vector)
There is one direction vector component for each loop in a nest of loops
The data dependence direction vector (or direction vector) is $\Psi = (\Psi_1, \Psi_2, \ldots, \Psi_d)$, where $d$ is the depth of the loop nest and $\Psi_k \in \{<, =, >, \le, \ge, \ne, *\}$
Forward or “<” direction means dependence from iteration i to iteration i + k, with k > 0 (i.e., computed in iteration i and used in iteration i + k)
Backward or “>” direction means dependence from iteration i to iteration i − k (i.e., computed in iteration i and used in iteration i − k). This is not possible in a single loop; it can occur only at an inner level of a nest of two or more loops, where an outer loop carries the dependence forward
Equal or “=” direction means that the dependence is within the same iteration (i.e., computed in iteration i and used in iteration i)
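
The three basic directions can be illustrated with a small sketch. It assumes normalized loops whose indices increase by 1, and it takes the iteration that computes a value (`source`) and the iteration that uses it (`sink`) as tuples of loop indices, outermost loop first. The function name `direction_vector` is hypothetical and chosen only for this illustration.

```python
def direction_vector(source, sink):
    """Return the direction vector of a dependence from iteration
    `source` to iteration `sink` (index tuples, outermost loop first)."""
    if len(source) != len(sink):
        raise ValueError("iteration vectors must have the same depth")
    symbols = []
    for i_src, i_snk in zip(source, sink):
        if i_src < i_snk:
            symbols.append("<")   # forward: used in a later iteration
        elif i_src > i_snk:
            symbols.append(">")   # backward at this loop level
        else:
            symbols.append("=")   # same iteration of this loop
    return tuple(symbols)


# A value computed at (i, j) = (2, 5) and used at (i, j) = (3, 4):
print(direction_vector((2, 5), (3, 4)))  # ('<', '>')
```

The example shows a backward direction at the inner level, which is possible only because the outer level is forward.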
Direction Vector Example 1


Direction Vector Example 2


Direction Vector Example 3


Direction Vector Example 4


Data Dependence Graph and Vectorization

In the data dependence graph (DDG), individual nodes are statements of the program and edges depict data dependence among the statements
If the DDG is acyclic, then vectorization of the program is possible and is straightforward
Vector code generation can be done using a topological sort order on the DDG
Otherwise, find all the strongly connected components (SCCs) of the DDG, and reduce the DDG to an acyclic graph by treating each SCC as a single node
SCCs cannot be fully vectorized; the final code will contain some sequential loops and possibly some vector code
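
The procedure above can be sketched as follows. This is a minimal illustration, not a code generator: it assumes the DDG is given as a dictionary mapping each statement name to the statements that depend on it, uses Tarjan's algorithm to find the SCCs, and labels every cyclic SCC as a sequential loop (a simplification, since part of an SCC may still be vectorizable). The names `strongly_connected_components` and `vectorization_plan` are hypothetical.

```python
def strongly_connected_components(graph):
    """Tarjan's algorithm. Returns the SCCs of `graph` in reverse
    topological order of the condensed (acyclic) graph."""
    index, low, on_stack = {}, {}, set()
    stack, sccs = [], []
    counter = [0]

    def visit(v):
        index[v] = low[v] = counter[0]
        counter[0] += 1
        stack.append(v)
        on_stack.add(v)
        for w in graph.get(v, []):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:          # v is the root of an SCC
            scc = []
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.append(w)
                if w == v:
                    break
            sccs.append(sorted(scc))

    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    for v in sorted(nodes):
        if v not in index:
            visit(v)
    return sccs


def vectorization_plan(graph):
    """List the SCCs in topological order and mark each one as a
    vector statement (acyclic) or a sequential loop (cyclic)."""
    plan = []
    for scc in reversed(strongly_connected_components(graph)):
        cyclic = len(scc) > 1 or scc[0] in graph.get(scc[0], [])
        plan.append((scc, "sequential loop" if cyclic else "vector statement"))
    return plan


# Hypothetical DDG: S2 and S3 depend on each other, S1 feeds S2, S3 feeds S4.
ddg = {"S1": ["S2"], "S2": ["S3"], "S3": ["S2", "S4"], "S4": []}
for group, kind in vectorization_plan(ddg):
    print(group, kind)
```

For this graph the plan is S1 as a vector statement, then the cycle {S2, S3} as a sequential loop, then S4 as a vector statement.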


Data Dependence Graph and Vectorization

If all the dependence relations in a loop nest have a direction vector value of “=” for a loop, then the iterations of that loop can be executed in parallel with no synchronization between iterations
Any dependence with a forward (<) direction in an outer loop will be satisfied by the serial execution of the outer loop
If an outer loop L is run in sequential mode, then all the dependences with a forward (<) direction at the outer level (of L) will be automatically satisfied (even those of the loops inner to L)
However, this is not true for those dependences with an (=) direction at the outer level; the dependences of the inner loops will have to be satisfied by appropriate statement ordering and loop execution order
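
The rules above can be turned into a small check. The sketch below assumes each dependence is summarized by a direction vector containing only “<”, “=” and “>” (composite directions such as “≤” or “∗” would first have to be expanded), and that every loop outside the one being tested runs sequentially. A dependence is treated as satisfied by the outermost loop whose direction is not “=”; a loop can then run in parallel if it is not that loop for any dependence. The function names are hypothetical.

```python
def carried_level(dv):
    """Index of the outermost loop whose direction is not '=', or None
    when every component is '=' (dependence within one iteration)."""
    for level, direction in enumerate(dv):
        if direction != "=":
            return level
    return None


def parallel_loops(direction_vectors, depth):
    """Loop levels (0 = outermost) whose iterations can run in parallel,
    assuming all loops outside that level run sequentially."""
    carrying = {carried_level(dv) for dv in direction_vectors}
    return [level for level in range(depth) if level not in carrying]


# Every dependence here is forward at the outer level or stays within one
# iteration, so running the outer loop serially leaves the inner loop parallel.
print(parallel_loops([("<", "="), ("<", ">"), ("=", "=")], 2))  # [1]

# All dependences are '=' at the outer level: the outer loop is parallel,
# but the inner loop must preserve the (=, <) dependence.
print(parallel_loops([("=", "<")], 2))  # [0]
```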


Vectorization Example 1


Vectorization Example 2.1


Vectorization Example 2.2


Vectorization Example 2.3


Vectorization Example 2.4


Vectorization Example 2.5


Vectorization Example 2.6


Concurrentization Examples


Loop Transformations for increasing Parallelism

Recurrence breaking
Ignorable cycles
Scalar expansion
Scalar renaming
Node splitting
Threshold detection and index set splitting
If-conversion
Loop interchanging
Loop fission
Loop fusion


Scalar Expansion


Scalar Expansion is not always profitable


Scalar Renaming


If-Conversion


Loop Interchange

For machines with vector instructions, inner loops are preferable for vectorization, and loops can be interchanged to enable this
For multi-core and multi-processor machines, parallel outer loops are preferred and loop interchange may help to make this happen
Requirements for simple loop interchange
1. The loops L1 and L2 must be tightly nested (no statements between loops)
2. The loop limits of L2 must be invariant in L1
3. There are no statements $S_v$ and $S_w$ (not necessarily distinct) in L1 with a dependence $S_v \, \delta^{*}_{(<,>)} \, S_w$
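
Requirement 3 can be checked mechanically once the direction vectors of the nest are known. The sketch below assumes a two-level nest (L1 outer, L2 inner) whose dependences are given as pairs of “<”, “=” or “>”, and it does not test requirements 1 and 2. The function names are hypothetical.

```python
def interchange_is_legal(direction_vectors):
    """Requirement 3: no dependence may have direction vector (<, >)."""
    return all(tuple(dv) != ("<", ">") for dv in direction_vectors)


def after_interchange(direction_vectors):
    """Direction vectors of the nest once L1 and L2 have been swapped."""
    return [(inner, outer) for outer, inner in direction_vectors]


deps = [("<", "="), ("=", "<")]
print(interchange_is_legal(deps))   # True
print(after_interchange(deps))      # [('=', '<'), ('<', '=')]

# A (<, >) dependence would become (>, <) after the swap, so the use
# would execute before the computation it depends on.
print(interchange_is_legal([("<", ">")]))  # False
```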


Loop Interchange for Vectorizability


Loop Interchange for parallelizability


Legal Loop Interchange


Illegal Loop Interchange


Legal but not beneficial Loop Interchange


Loop Fission - Motivation


Loop Fission: Legal and Illegal

