Handbook HPC 23-24
Course Objectives:
• To understand different parallel programming models
• To analyze the performance and modeling of parallel programs
• To illustrate the various techniques to parallelize an algorithm
• To implement parallel communication operations
• To discriminate the CUDA architecture and its components
• To understand the scope of parallel computing and its search algorithms
Course Outcomes:
CO1: Understand various parallel paradigms
CO2: Design and develop an efficient parallel algorithm to solve a given problem
CO3: Illustrate data communication operations on various parallel architectures
CO4: Analyze and measure performance of modern parallel computing systems
CO5: Apply CUDA architecture for parallel programming
CO6: Analyze the performance of HPC applications
Course Contents
Unit I: Introduction to Parallel Computing (07 Hours)
Introduction to Parallel Computing: Motivating Parallelism, Modern Processor: Stored-program
computer architecture, General-purpose Cache-based Microprocessor architecture. Parallel Programming
Platforms: Implicit Parallelism, Dichotomy of Parallel Computing Platforms, Physical Organization of
Parallel Platforms, Communication Costs in Parallel Machines. Levels of parallelism, Models: SIMD,
MIMD, SIMT, SPMD, Data Flow Models, Demand-driven Computation, Architectures: N-wide
superscalar architectures, multi-core, multi-threaded.
#Exemplar/Case Studies Case study: Multi-core System
*Mapping of Course Outcomes for Unit I - CO1
e-Books:
1. http://prdrklaina.weebly.com/uploads/5/7/7/3/5773421/introduction_to_high_performance_computing_for_scientists_and_engineers.pdf
2. https://www.vssut.ac.in/lecture_notes/lecture1428643084.pdf
4. Platforms, Physical Organization of Parallel Platforms, Communication Costs in Parallel Machines
5. Levels of parallelism, Models: SIMD, MIMD, SIMT, SPMD
7. Architectures: N-wide superscalar architectures, multi-core, multi-threaded.
Make up Classes
Lect. No. | Plan: Date | Plan: Topics | Actual: Date | Actual: Topics covered | Reasons for Deviation
8. Principles of Parallel Algorithm Design: Preliminaries
9. Decomposition Techniques
Make up Classes
Lect. No. | Plan: Date | Plan: Topics | Actual: Date | Actual: Topics covered | Reasons for Deviation
29. Introduction to GPU: Introduction to GPU Architecture overview
30. Introduction to CUDA C
31. CUDA programming model, write and launch a CUDA kernel
32. Handling Errors in CUDA C
33. CUDA memory model
34. Manage communication and synchronization
35. Parallel programming in CUDA-C.
Make up Classes
SUMMARY
No. of lectures allotted by university 42
BE Computer – SEM II
A Set
Instructions: Solve Q.1 or Q.2, and Q.3 or Q.4.
Marks CO PO
Q.1 a Write a note on: Bus-Based Networks, Crossbar Networks, Fully Connected Networks, Meshes and Tree-Based Networks. 6 CO3 PO1
b Explain the dichotomy of parallel computing platforms. 5 CO3 PO1
B Set
Instructions: Solve Q.1 or Q.2, and Q.3 or Q.4.
Marks CO PO
Q.1 a Write a note on: Bus-Based Networks, Crossbar Networks, Fully Connected Networks, Meshes and Tree-Based Networks. 6 CO3 PO3
b Differentiate between NUMA and UMA. 5 CO3 PO1
A Set
Instructions: Solve Q.1 or Q.2, and Q.3 or Q.4.
Marks CO PO
Q.1 a Describe the following for one-to-all broadcast and all-to-one reduction communication operations: Linear Array, Mesh, Hypercube. 6 CO3 PO3
b Explain the prefix-sum operation on an eight-node hypercube. 5 CO3 PO1
c Explain the effects of granularity on the performance of a parallel system. 4 CO4 PO2
OR
Q.4 a Explain the parallel matrix-matrix multiplication algorithm with an example. 6 CO4 PO3
b Explain the different performance metrics for parallel systems. 5 CO4 PO1
c Explain “Scaling Down (downsizing)” a parallel system with an example. 4 CO4 PO1
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE - 58
DEPARTMENT OF COMPUTER ENGINEERING
B Set
Instructions: Solve Q.1 or Q.2, and Q.3 or Q.4.
Marks CO PO
Q.1 a Explain the broadcast and reduce operations with the help of a diagram. 6 CO3 PO3
b Write a short note on all-to-one reduction with a suitable example. 5 CO3 PO1
c Write a note on: Total Exchange on a Ring and a Mesh. 4 CO3 PO1
OR
Q.2 a Explain the all-reduce and prefix-sum operations. 6 CO3 PO1
A Set
Instructions: Solve Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, and Q.7 or Q.8.
Marks CO PO
Q.1 a Explain, with a diagram, one-to-all broadcast on an eight-node ring using the recursive doubling technique. Node 0 is the source of the broadcast. Also explain all-to-one reduction with node 0 as the destination. 7 CO3 PO3
b Explain the scatter and gather communication operations with a diagram. 6 CO3 PO1
c Explain the circular shift operation. 4 CO3 PO1
OR
Q.2 a What is the all-to-all broadcast communication operation? Explain all-to-all broadcast on an eight-node ring with step-wise diagrams. (Show the first two steps and the last communication step.) 7 CO3 PO2
b Explain in detail Blocking and Non-Blocking Communication Using MPI 6 CO3 PO1
c Write a short note on prefix-sum operation. 4 CO3 PO1
Q.7 a Explain the different communication strategies in parallel Best-First Tree Search. 8 CO6 PO2
b Explain the parallel formulation of bubble sort using odd-even transposition. 6 CO6 PO2
c Explain the Parallel Depth-First Search algorithm in detail. 4 CO6 PO1
OR
Q.8 a What are the issues in sorting on parallel computers? Explain with an appropriate example. 8 CO6 PO1
b What is Kubernetes? Explain its features and applications. 6 CO6 PO2
c Write short notes on: 4 CO6 PO2
1) Parallel merge sort
2) GPU applications
B Set
Instructions: Solve Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, and Q.7 or Q.8.
Marks CO PO
Q.1 a Briefly explain one-to-all broadcast and all-to-one reduction on an eight-node hypercube. How is the cost of communication found for one-to-all broadcast on an eight-node hypercube? 7 CO3 PO3
b Explain the different approaches to communication operations. What is the Total Exchange method? 6 CO3 PO1
c Explain how the speed of some communication operations can be improved. 4 CO3 PO1
OR
Q.2 a With a suitable diagram and example, explain All-to-All Broadcast and All-to-All Reduction. 7 CO3 PO2
b Explain scatter and gather communication operation with diagram 6 CO3 PO1
c Write a short note on prefix-sum operation. 4 CO3 PO1
Q.7 a What are the issues in sorting on parallel computers? Explain with an appropriate example. 8 CO6 PO2
b Explain the terms: 6 CO6 PO1
i. Bitonic Sequence
ii. Bitonic Sort
iii. Bitonic Merge
iv. Bitonic Split with example
c Explain Parallel Depth-First Search with example 4 CO6 PO2
OR
Q.8 a Explain the parallel formulation of bubble sort using odd-even transposition. Give one stepwise example solution using odd-even transposition. 8 CO6 PO2
b What is Kubernetes? Explain its features and applications. 6 CO6 PO1
c Indicate the sorting issues in parallel computers 4 CO6 PO1