Handbook HPC 23-24

Savitribai Phule Pune University

Fourth Year of Computer Engineering (2019 Course)


410250: High Performance Computing
Teaching Scheme: TH: 3 Hours/Week
Credit: 3
Examination Scheme: In-Sem (TH): 30, End-Sem (TH): 70

Prerequisite Courses: Microprocessor (210254), Principles of Programming Languages (210255), Computer Networks and Security (310244)
Companion Course: Laboratory Practice V (410254)

Course Objectives:
• To understand different parallel programming models
• To analyze the performance and modeling of parallel programs
• To illustrate the various techniques used to parallelize an algorithm
• To implement parallel communication operations
• To describe the CUDA architecture and its components
• To understand the scope of parallel computing and its search algorithms

Course Outcomes:
CO1: Understand various parallel paradigms
CO2: Design and develop an efficient parallel algorithm to solve a given problem
CO3: Illustrate data communication operations on various parallel architectures
CO4: Analyze and measure the performance of modern parallel computing systems
CO5: Apply the CUDA architecture for parallel programming
CO6: Analyze the performance of HPC applications

Course Contents
Unit I: Introduction to Parallel Computing (07 Hours)
Introduction to Parallel Computing: Motivating Parallelism, Modern Processor: Stored-program
computer architecture, General-purpose Cache-based Microprocessor architecture. Parallel Programming
Platforms: Implicit Parallelism, Dichotomy of Parallel Computing Platforms, Physical Organization of
Parallel Platforms, Communication Costs in Parallel Machines. Levels of parallelism, Models: SIMD,
MIMD, SIMT, SPMD, Data Flow Models, Demand-driven Computation, Architectures: N-wide
superscalar architectures, multi-core, multi-threaded.
#Exemplar/Case Studies Case study: Multi-core System
*Mapping of Course Outcomes for Unit I - CO1
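The multi-core case study above can be made concrete with a small sketch. This is an illustrative example, not part of the official syllabus; the names `partial_sum` and `parallel_sum` are hypothetical. It uses the thread-backed `multiprocessing.dummy.Pool` so it runs anywhere; replacing it with `multiprocessing.Pool` executes the same SPMD-style code on separate cores.

```python
from multiprocessing.dummy import Pool  # thread-backed Pool with the multiprocessing API

def partial_sum(chunk):
    # Every worker runs the same function on its own slice of the data (SPMD).
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Block decomposition: split the data into one contiguous chunk per worker.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with Pool(workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(range(1000)))  # 499500
```

The structure (decompose, map to workers, combine) is the same pattern the unit's SIMD/MIMD/SPMD taxonomy classifies at the hardware level.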

Unit II: Parallel Algorithm Design (07 Hours)


Principles of Parallel Algorithm Design: Preliminaries, Decomposition Techniques, Characteristics of
Tasks and Interactions, Mapping Techniques for Load Balancing, Methods for Containing Interaction
Overheads, Parallel Algorithm Models: Data, Task, Work Pool and Master Slave Model, Complexities:
Sequential and Parallel Computational Complexity, Anomalies in Parallel Algorithms.
#Exemplar/Case Studies Foster's parallel algorithm design methodology.
(http://compsci.hunter.cuny.edu/~sweiss/course_materials/csci493.65/lecture_notes/chapter03.pdf)
*Mapping of Course Outcomes for Unit II CO2
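Two of the mapping techniques for load balancing named above, block and cyclic mapping, can be sketched in a few lines. This is an illustrative sketch under the assumption that tasks are identified by integer ids; the function names are my own, not from the syllabus.

```python
def block_mapping(tasks, p):
    """Assign contiguous blocks of tasks to each of p processes."""
    size = (len(tasks) + p - 1) // p
    return [tasks[i:i + size] for i in range(0, len(tasks), size)]

def cyclic_mapping(tasks, p):
    """Deal tasks out round-robin across p processes."""
    return [tasks[i::p] for i in range(p)]

tasks = list(range(8))          # task ids 0..7
print(block_mapping(tasks, 2))   # [[0, 1, 2, 3], [4, 5, 6, 7]]
print(cyclic_mapping(tasks, 2))  # [[0, 2, 4, 6], [1, 3, 5, 7]]
```

Cyclic mapping tends to balance load better when task sizes grow or shrink systematically with the task index, which is exactly the trade-off this unit's mapping techniques address.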

Unit III: Parallel Communication (07 Hours)


Basic Communication: One-to-All Broadcast, All-to-One Reduction, All-to-All Broadcast and Reduction,
All-Reduce and Prefix-Sum Operations, Collective Communication using MPI: Scatter, Gather, Broadcast,
Blocking and non-blocking MPI, All-to-All Personalized Communication, Circular Shift, Improving the
speed of some communication operations.
#Exemplar/Case Studies Monte-Carlo Pi computing using MPI
*Mapping of Course Outcomes for Unit III CO3
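The prefix-sum operation above can be sketched as a sequential simulation of the standard hypercube algorithm (no real message passing; `hypercube_prefix_sum` is a hypothetical name, and the node count must be a power of two).

```python
def hypercube_prefix_sum(values):
    """Simulate the hypercube prefix-sum: each 'node' holds one value."""
    p = len(values)          # number of nodes, assumed a power of two
    prefix = list(values)    # running prefix sum at each node
    subtotal = list(values)  # sum over the node's current subcube
    step = 1
    while step < p:
        new_sub = subtotal[:]
        for node in range(p):
            partner = node ^ step            # exchange along one hypercube dimension
            new_sub[node] = subtotal[node] + subtotal[partner]
            if partner < node:               # only lower-numbered partners contribute
                prefix[node] += subtotal[partner]
        subtotal = new_sub
        step <<= 1
    return prefix

print(hypercube_prefix_sum([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

The loop runs log2(p) communication steps, matching the cost analysis of the all-reduce and prefix-sum operations in this unit.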

Unit IV: Analytical Modeling of Parallel Programs (07 Hours)


Sources of Overhead in Parallel Programs, Performance Measures and Analysis: Amdahl's and Gustafson's
Laws, Speedup Factor and Efficiency, Cost and Utilization, Execution Rate and Redundancy, The Effect of
Granularity on Performance, Scalability of Parallel Systems, Minimum Execution Time and Minimum
Cost, Optimal Execution Time, Asymptotic Analysis of Parallel Programs. Matrix Computation:
Matrix-Vector Multiplication, Matrix-Matrix Multiplication.
#Exemplar/Case Studies The DAG Model of parallel computation
*Mapping of Course Outcomes for Unit IV CO4
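The two speedup laws named above lend themselves to a short worked example. A minimal sketch (function names are my own, not from the syllabus): Amdahl's law gives fixed-size speedup S = 1 / (f + (1 - f)/p) for serial fraction f on p processors, while Gustafson's law gives scaled speedup S = p - f(p - 1).

```python
def amdahl_speedup(f, p):
    """Amdahl's law: fixed problem size, serial fraction f, p processors."""
    return 1.0 / (f + (1.0 - f) / p)

def gustafson_speedup(f, p):
    """Gustafson's law: problem size scales with the number of processors."""
    return p - f * (p - 1)

# With 10% serial work on 8 processors the two laws diverge noticeably:
print(round(amdahl_speedup(0.1, 8), 2))     # 4.71
print(round(gustafson_speedup(0.1, 8), 2))  # 7.3
```

Amdahl's bound shows why the serial fraction dominates at scale; Gustafson's scaled view explains why larger problem sizes still profit from more processors.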

Unit V: CUDA Architecture (07 Hours)


Introduction to GPU: GPU Architecture overview, Introduction to CUDA C: CUDA programming model,
write and launch a CUDA kernel, Handling Errors, CUDA memory model, Manage communication and
synchronization, Parallel programming in CUDA-C.
#Exemplar/Case Studies GPU applications using SYCL and CUDA on NVIDIA
*Mapping of Course Outcomes for Unit V CO5
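The kernel-launch and memory-model topics above can be sketched in CUDA C. This is an illustrative sketch, not prescribed lab code: it assumes a CUDA-capable GPU and the CUDA toolkit, the array names are hypothetical, and per-call error checking (inspecting each `cudaMalloc`/`cudaMemcpy` return value and `cudaGetLastError` after the launch) is abbreviated for space.

```cuda
#include <stdio.h>
#include <cuda_runtime.h>

// Kernel: each thread adds one element pair.
__global__ void vec_add(const float *a, const float *b, float *c, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n) c[i] = a[i] + b[i];
}

int main(void) {
    const int n = 1 << 10;
    size_t bytes = n * sizeof(float);
    float *ha = (float *)malloc(bytes), *hb = (float *)malloc(bytes), *hc = (float *)malloc(bytes);
    for (int i = 0; i < n; i++) { ha[i] = i; hb[i] = 2 * i; }

    float *da, *db, *dc;                              // device (GPU) memory
    cudaMalloc(&da, bytes); cudaMalloc(&db, bytes); cudaMalloc(&dc, bytes);
    cudaMemcpy(da, ha, bytes, cudaMemcpyHostToDevice);
    cudaMemcpy(db, hb, bytes, cudaMemcpyHostToDevice);

    // Launch: grid of (n+255)/256 blocks, 256 threads per block.
    vec_add<<<(n + 255) / 256, 256>>>(da, db, dc, n);
    cudaMemcpy(hc, dc, bytes, cudaMemcpyDeviceToHost); // synchronizes with the kernel

    printf("c[10] = %.1f\n", hc[10]);                  // expect 30.0 (10 + 20)
    cudaFree(da); cudaFree(db); cudaFree(dc);
    free(ha); free(hb); free(hc);
    return 0;
}
```

The host/device split, the `<<<grid, block>>>` launch arguments, and the explicit host-to-device copies correspond directly to the CUDA programming model and memory model items in this unit.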

Unit VI: High Performance Computing Applications (07 Hours)


Scope of Parallel Computing, Parallel Search Algorithms: Depth First Search (DFS), Breadth First
Search (BFS), Parallel Sorting: Bubble and Merge, Distributed Computing: Document classification,
Frameworks: Kubernetes, GPU Applications, Parallel Computing for AI/ML
#Exemplar/Case Studies Disaster detection and management/ Smart Mobility/ Urban planning
*Mapping of Course Outcomes for Unit VI CO6
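The parallel bubble-sort formulation above is usually taught as odd-even transposition sort. The following is an illustrative sequential simulation (the function name is my own): in a true parallel version, all compare-exchanges within one phase run concurrently on different nodes.

```python
def odd_even_transposition_sort(a):
    """Odd-even transposition sort: n phases of independent compare-exchanges."""
    a = list(a)
    n = len(a)
    for phase in range(n):
        # Even phases compare pairs (0,1), (2,3), ...; odd phases (1,2), (3,4), ...
        start = phase % 2
        for i in range(start, n - 1, 2):
            if a[i] > a[i + 1]:
                a[i], a[i + 1] = a[i + 1], a[i]
    return a

print(odd_even_transposition_sort([5, 1, 4, 2, 3]))  # [1, 2, 3, 4, 5]
```

Since the pairs within a phase are disjoint, each phase is one parallel step, giving n steps on n processors versus the O(n^2) comparisons of sequential bubble sort.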
Learning Resources
Text Books:
1. Ananth Grama, Anshul Gupta, George Karypis, and Vipin Kumar, "Introduction to Parallel
Computing", 2nd edition, Addison-Wesley, 2003, ISBN: 0-201-64865-2
2. Seyed H. Roosta, "Parallel Processing and Parallel Algorithms: Theory and Computation",
Springer-Verlag, 2000, ISBN 978-1-4612-7048-5, ISBN 978-1-4612-1220-1
3. John Cheng, Max Grossman, and Ty McKercher, "Professional CUDA C Programming", John Wiley &
Sons, Inc., ISBN: 978-1-118-73932-7
Reference Books:
1. Kai Hwang, "Scalable Parallel Computing", McGraw-Hill, 1998
2. George S. Almasi and Alan Gottlieb, "Highly Parallel Computing", The Benjamin/Cummings
Publishing Co., Inc.
3. Jason Sanders and Edward Kandrot, "CUDA by Example", Addison-Wesley, ISBN-13: 978-0-13-138768-3
4. Peter S. Pacheco, "An Introduction to Parallel Programming", Morgan Kaufmann Publishers,
ISBN 978-0-12-374260-5
5. Eleanor G. Rieffel and Wolfgang H. Polak, "Quantum Computing: A Gentle Introduction", MIT Press,
2011, ISBN 978-0-262-01506-6
6. Ajay D. Kshemkalyani and Mukesh Singhal, "Distributed Computing: Principles, Algorithms, and
Systems", Cambridge University Press, March 2011, ISBN: 9780521189842

e-Books:
1. http://prdrklaina.weebly.com/uploads/5/7/7/3/5773421/introduction_to_high_performance_computing_for_scientists_and_engineers.pdf
2. https://www.vssut.ac.in/lecture_notes/lecture1428643084.pdf

NPTEL/YouTube video lecture link


● https://nptel.ac.in/courses/106108055
● https://www.digimat.in/nptel/courses/video/106104120/L01.html
Sinhgad Technical Education Society's

RMD SINHGAD SCHOOL OF ENGINEERING, PUNE

Department of Computer Engineering
TEACHING PLAN
Academic Year: 2023-24 (Semester: VIII)
Course Title: High Performance Computing
Subject Code: 410250
Class: B.E. Division: A and B
Term: I
Date of commencement of classes: 11/12/2023
Date of conclusion of teaching: 19/04/2024
Lecture Schedule: 3 Hrs/Week
Practical/Tutorial Schedule: -
Examination Scheme: Theory: 100 M (In-Sem: 30 M, 1 Hr; End-Sem: 70 M, 2 Hrs 30 min); Term Work: -; Practical: -; Oral: -
Subject Teachers: Mrs. P. V. Kasture, Mrs. H. Kumbhar
Previous 3 Years' University Result (2020-21 / 2021-22 / 2022-23):

UNIT – I: Introduction to Parallel Computing (07 Hours)

Introduction to Parallel Computing: Motivating Parallelism, Modern Processor: Stored-program
computer architecture, General-purpose Cache-based Microprocessor architecture.
Parallel Programming Platforms: Implicit Parallelism, Dichotomy of Parallel Computing
Platforms, Physical Organization of Parallel Platforms, Communication Costs in Parallel
Machines. Levels of parallelism, Models: SIMD, MIMD, SIMT, SPMD, Data Flow Models,
Demand-driven Computation, Architectures: N-wide superscalar architectures, multi-core,
multi-threaded.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
1. Introduction to Parallel Computing: Motivating Parallelism
2. Modern Processor: Stored-program computer architecture, General-purpose Cache-based Microprocessor architecture
3. Parallel Programming Platforms: Implicit Parallelism, Dichotomy of Parallel Computing Platforms
4. Physical Organization of Parallel Platforms, Communication Costs in Parallel Machines
5. Levels of parallelism, Models: SIMD, MIMD, SIMT, SPMD
6. Data Flow Models, Demand-driven Computation
7. Architectures: N-wide superscalar architectures, multi-core, multi-threaded

Make up Classes:

Contents Beyond Syllabus:

UNIT – II: Parallel Algorithm Design (07 Hours)

Principles of Parallel Algorithm Design: Preliminaries, Decomposition Techniques, Characteristics
of Tasks and Interactions, Mapping Techniques for Load Balancing, Methods for Containing
Interaction Overheads, Parallel Algorithm Models: Data, Task, Work Pool and Master Slave Model,
Complexities: Sequential and Parallel Computational Complexity, Anomalies in Parallel Algorithms.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
8. Principles of Parallel Algorithm Design: Preliminaries
9. Decomposition Techniques
10. Characteristics of Tasks and Interactions, Mapping Techniques for Load Balancing
11. Methods for Containing Interaction Overheads
12. Parallel Algorithm Models: Data, Task, Work Pool and Master Slave Model
13. Complexities: Sequential and Parallel Computational Complexity
14. Anomalies in Parallel Algorithms

Make up Classes:

Contents Beyond Syllabus:

UNIT – III: Parallel Communication (07 Hours)

Basic Communication: One-to-All Broadcast, All-to-One Reduction, All-to-All Broadcast and
Reduction, All-Reduce and Prefix-Sum Operations, Collective Communication using MPI: Scatter,
Gather, Broadcast, Blocking and non-blocking MPI, All-to-All Personalized Communication,
Circular Shift, Improving the speed of some communication operations.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
15. Basic Communication: One-to-All Broadcast, All-to-One Reduction
16. All-to-All Broadcast and Reduction
17. All-Reduce and Prefix-Sum Operations
18. Collective Communication using MPI: Scatter
19. Gather, Broadcast, Blocking and non-blocking MPI
20. All-to-All Personalized Communication
21. Circular Shift, Improving the speed of some communication operations

Make up Classes:

Contents Beyond Syllabus:


UNIT – IV: Analytical Modeling of Parallel Programs (07 Hours)

Sources of Overhead in Parallel Programs, Performance Measures and Analysis: Amdahl's
and Gustafson's Laws, Speedup Factor and Efficiency, Cost and Utilization, Execution Rate
and Redundancy, The Effect of Granularity on Performance, Scalability of Parallel
Systems, Minimum Execution Time and Minimum Cost, Optimal Execution Time,
Asymptotic Analysis of Parallel Programs. Matrix Computation: Matrix-Vector
Multiplication, Matrix-Matrix Multiplication.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
22. Sources of Overhead in Parallel Programs
23. Performance Measures and Analysis: Amdahl's and Gustafson's Laws
24. Speedup Factor and Efficiency, Cost and Utilization, Execution Rate and Redundancy
25. The Effect of Granularity on Performance, Scalability of Parallel Systems
26. Minimum Execution Time and Minimum Cost, Optimal Execution Time
27. Asymptotic Analysis of Parallel Programs
28. Matrix Computation: Matrix-Vector Multiplication, Matrix-Matrix Multiplication

Make up Classes:

Contents Beyond Syllabus:

UNIT – V: CUDA Architecture (07 Hours)

Introduction to GPU: GPU Architecture overview, Introduction to CUDA C: CUDA programming
model, write and launch a CUDA kernel, Handling Errors, CUDA memory model,
Manage communication and synchronization, Parallel programming in CUDA-C.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
29. Introduction to GPU: GPU Architecture overview
30. Introduction to CUDA C
31. CUDA programming model, write and launch a CUDA kernel
32. Handling Errors in CUDA C
33. CUDA memory model
34. Manage communication and synchronization
35. Parallel programming in CUDA-C

Make up Classes:

Contents Beyond Syllabus:

UNIT – VI: High Performance Computing Applications (07 Hours)

Scope of Parallel Computing, Parallel Search Algorithms: Depth First Search (DFS), Breadth First
Search (BFS), Parallel Sorting: Bubble and Merge, Distributed Computing: Document
classification, Frameworks: Kubernetes, GPU Applications, Parallel Computing for AI/ML.

Lecture plan (planned topics per lecture; the actual date, topics covered, and reasons for deviation are filled in during the term):
36. Scope of Parallel Computing, Parallel Search Algorithms
37. Depth First Search (DFS)
38. Breadth First Search (BFS)
39. Parallel Sorting: Bubble Sort
40. Parallel Sorting: Merge Sort
41. Distributed Computing: Document classification, Frameworks: Kubernetes
42. GPU Applications, Parallel Computing for AI/ML

Make up Classes:

Contents Beyond Syllabus:

Date: Name and Sign of Subject Teacher Head of Department

SUMMARY
No. of lectures allotted by university: 42

Total no. of lectures conducted:

Percentage of syllabus covered:

Total no. of makeup classes: 00

Date: Name and Sign of Subject Teacher Head of Department


Subject: High Performance Computing

BE Computer – SEM II

Course outcomes from SPPU syllabus


On successful completion of this course, the learner will be able to:

CO1: To understand different parallel programming models

CO2: To analyze the performance and modeling of parallel programs

CO3: To illustrate the various techniques used to parallelize an algorithm

CO4: To implement parallel communication operations

CO5: To describe the CUDA architecture and its components

CO6: To understand the scope of parallel computing and its search algorithms

PO of Computer Engineering Department


PO1- Engineering knowledge: Apply the knowledge of mathematics, science, engineering fundamentals,
and an engineering specialization to the solution of complex engineering problems.
PO2- Problem analysis: Identify, formulate, review research literature, and analyze complex engineering
problems reaching substantiated conclusions using first principles of mathematics, natural sciences, and
engineering sciences.
PO3- Design/development of solutions: Design solutions for complex engineering problems and design
system components or processes that meet the specified needs with appropriate consideration for the
public health and safety, and the cultural, societal, and environmental considerations.
PO4- Conduct investigations of complex problems: Use research-based knowledge and research methods
including design of experiments, analysis and interpretation of data, and synthesis of the information to
provide valid conclusions.
PO5- Modern tool usage: Create, select, and apply appropriate techniques, resources, and modern
engineering and IT tools including prediction and modeling to complex engineering activities with an
understanding of the limitations.
PO6- The engineer and society: Apply reasoning informed by the contextual knowledge to assess societal,
health, safety, legal and cultural issues and the consequent responsibilities relevant to the professional
engineering practice.
PO7- Environment and sustainability: Understand the impact of the professional engineering solutions in
societal and environmental contexts, and demonstrate the knowledge of, and need for sustainable
development.
PO8- Ethics: Apply ethical principles and commit to professional ethics and responsibilities and norms of
the engineering practice.
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Unit Test I, 2023-24 (A Set)

Subject: High Performance Computing
Duration: 1 Hour. Maximum Marks: 30
Instructions: Solve any one question from Q.1 or Q.2, and any one from Q.3 or Q.4.

Q.1 a) Write a note on: Bus-Based Networks, Crossbar Networks, Fully Connected Networks, Meshes, and Tree-Based Networks. [6 marks, CO3, PO1]
    b) Explain the dichotomy of parallel computing platforms. [5 marks, CO3, PO1]
    c) Explain the following architectures in detail: i) N-wide superscalar architecture, ii) multi-threaded architecture. [4 marks, CO3, PO1]
OR
Q.2 a) Write about the scope of parallelism. [6 marks, CO3, PO1]
    b) Explain the communication model of parallel platforms. [5 marks, CO3, PO1]
    c) Write a short note on the motivation for parallelism. [4 marks, CO3, PO1]

Q.3 a) Explain the principles of parallel algorithm design. [6 marks, CO4, PO1]
    b) Explain the characteristics of tasks and interactions. [5 marks, CO4, PO2]
    c) Explain decomposition, tasks, and dependency graphs. [4 marks, CO4, PO2]
OR
Q.4 a) Explain parallel algorithm models. [6 marks, CO4, PO1]
    b) Explain the different methods for containing interaction overheads. [5 marks, CO4, PO1]
    c) Explain, with a suitable example for each: 1) recursive decomposition, 2) data decomposition. [4 marks, CO4, PO1]
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Unit Test I, 2023-24 (B Set)

Subject: High Performance Computing
Duration: 1 Hour. Maximum Marks: 30
Instructions: Solve any one question from Q.1 or Q.2, and any one from Q.3 or Q.4.

Q.1 a) Write a note on: Bus-Based Networks, Crossbar Networks, Fully Connected Networks, Meshes, and Tree-Based Networks. [6 marks, CO3, PO3]
    b) Differentiate between NUMA and UMA. [5 marks, CO3, PO1]
    c) Explain multi-core processor architecture. [4 marks, CO3, PO1]
OR
Q.2 a) Explain the basic working principle of a VLIW processor. [6 marks, CO3, PO1]
    b) Explain the physical organization of parallel platforms. [5 marks, CO3, PO1]
    c) Explain N-wide superscalar architecture. [4 marks, CO3, PO1]

Q.3 a) Explain recursive decomposition with a suitable example. [6 marks, CO4, PO1]
    b) Explain the characteristics of tasks with respect to: task generation, task sizes, and the size of data associated with tasks. [5 marks, CO4, PO2]
    c) Explain randomized block distribution and hierarchical mappings. [4 marks, CO4, PO2]
OR
Q.4 a) What are the different mapping techniques for load balancing? [6 marks, CO4, PO1]
    b) Explain the different methods for containing interaction overheads. [5 marks, CO4, PO1]
    c) Explain decomposition, tasks, and dependency graphs. [4 marks, CO4, PO1]
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Unit Test II, 2023-24 (A Set)

Subject: High Performance Computing
Duration: 1 Hour. Maximum Marks: 30
Instructions: Solve any one question from Q.1 or Q.2, and any one from Q.3 or Q.4.

Q.1 a) Describe the following for the one-to-all broadcast and all-to-one reduction communication operations: linear array, mesh, hypercube. [6 marks, CO3, PO3]
    b) Explain the prefix-sum operation for an eight-node hypercube. [5 marks, CO3, PO1]
    c) Explain the concepts of scatter and gather. [4 marks, CO3, PO1]
OR
Q.2 a) What is the all-to-all broadcast communication operation? Explain all-to-all broadcast on an eight-node ring with step-wise diagrams (show the first two steps and the last communication step). [6 marks, CO3, PO1]
    b) Explain the circular shift operation on mesh and hypercube networks. [5 marks, CO3, PO1]
    c) Explain in detail blocking and non-blocking communication using MPI. [4 marks, CO3, PO1]

Q.3 a) Explain the various sources of overhead in parallel systems. [6 marks, CO4, PO1]
    b) Write a note on minimum and cost-optimal execution time. [5 marks, CO4, PO2]
    c) Explain the effects of granularity on the performance of a parallel system. [4 marks, CO4, PO2]
OR
Q.4 a) Explain a parallel matrix-matrix multiplication algorithm with an example. [6 marks, CO4, PO3]
    b) Explain the different performance metrics for parallel systems. [5 marks, CO4, PO1]
    c) Explain "scaling down (downsizing)" a parallel system with an example. [4 marks, CO4, PO1]
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Unit Test II, 2023-24 (B Set)

Subject: High Performance Computing
Duration: 1 Hour. Maximum Marks: 30
Instructions: Solve any one question from Q.1 or Q.2, and any one from Q.3 or Q.4.

Q.1 a) Explain the broadcast and reduce operations with the help of diagrams. [6 marks, CO3, PO3]
    b) Write a short note on all-to-one reduction with a suitable example. [5 marks, CO3, PO1]
    c) Write a note on: total exchange on a ring and mesh. [4 marks, CO3, PO1]
OR
Q.2 a) Explain the all-reduce and prefix-sum operations. [6 marks, CO3, PO1]
    b) Explain all-to-all personalized communication and its applications. [5 marks, CO3, PO1]
    c) How can the speed of communication operations be improved? [4 marks, CO3, PO1]

Q.3 a) Explain the performance metrics for parallel systems. [6 marks, CO4, PO1]
    b) Explain matrix-vector multiplication. [5 marks, CO4, PO2]
    c) What are the scalability characteristics of parallel programs? [4 marks, CO4, PO2]
OR
Q.4 a) What are the different partitioning techniques used in matrix-vector multiplication? [6 marks, CO4, PO3]
    b) Explain Cannon's algorithm for matrix-matrix multiplication in detail. [5 marks, CO4, PO3]
    c) Explain the sources of overhead in parallel systems. [4 marks, CO4, PO3]
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Preliminary Examination 2023-24 (A Set)

Subject: High Performance Computing
Duration: 2½ Hours. Maximum Marks: 70
Instructions: Solve one question from each pair: Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, Q.7 or Q.8.

Q.1 a) Explain with a diagram one-to-all broadcast on an eight-node ring using the recursive doubling technique, with node 0 as the source of the broadcast. Also explain all-to-one reduction with node 0 as the destination. [7 marks, CO3, PO3]
    b) Explain the scatter and gather communication operations with diagrams. [6 marks, CO3, PO1]
    c) Explain the circular shift operation. [4 marks, CO3, PO1]
OR
Q.2 a) What is the all-to-all broadcast communication operation? Explain all-to-all broadcast on an eight-node ring with step-wise diagrams (show the first two steps and the last communication step). [7 marks, CO3, PO2]
    b) Explain in detail blocking and non-blocking communication using MPI. [6 marks, CO3, PO1]
    c) Write a short note on the prefix-sum operation. [4 marks, CO3, PO1]

Q.3 a) Explain the various sources of overhead in parallel systems. [7 marks, CO4, PO1]
    b) What is granularity? What are the effects of granularity on the performance of parallel systems? [6 marks, CO4, PO2]
    c) Explain "scaling down (downsizing)" a parallel system with an example. [4 marks, CO4, PO2]
OR
Q.4 a) Explain a parallel matrix-matrix multiplication algorithm with an example. [7 marks, CO4, PO3]
    b) Explain the different performance metrics for parallel systems. [6 marks, CO4, PO1]
    c) What are the scalability characteristics of parallel programs? [4 marks, CO4, PO1]

Q.5 a) What is a kernel in CUDA? What is a kernel launch? Explain the arguments that can be specified in a kernel launch. [8 marks, CO5, PO3]
    b) Draw the diagram of the CUDA memory hierarchy. [6 marks, CO5, PO1]
    c) Describe the processing flow of a CUDA-C program with a diagram. [4 marks, CO5, PO2]
OR
Q.6 a) What is CUDA? Explain the different programming languages supported in CUDA. Discuss any three applications of CUDA. [8 marks, CO5, PO2]
    b) Justify parallel programming in CUDA-C. [6 marks, CO5, PO1]
    c) Explain the following terms in CUDA: device, host, device code, kernel. [4 marks, CO5, PO1]

Q.7 a) Explain the different communication strategies in parallel best-first tree search. [8 marks, CO6, PO2]
    b) Explain odd-even transposition in bubble sort using the parallel formulation. [6 marks, CO6, PO2]
    c) Explain the parallel depth-first search algorithm in detail. [4 marks, CO6, PO1]
OR
Q.8 a) What are the issues in sorting on parallel computers? Explain with an appropriate example. [8 marks, CO6, PO1]
    b) What is Kubernetes? Explain its features and applications. [6 marks, CO6, PO2]
    c) Write a short note on: 1) parallel merge sort, 2) GPU applications. [4 marks, CO6, PO2]
SINHGAD TECHNICAL EDUCATION SOCIETY'S
RMD SINHGAD SCHOOL OF ENGINEERING, WARJE-58
DEPARTMENT OF COMPUTER ENGINEERING

Preliminary Examination 2023-24 (B Set)

Subject: High Performance Computing
Duration: 2½ Hours. Maximum Marks: 70
Instructions: Solve one question from each pair: Q.1 or Q.2, Q.3 or Q.4, Q.5 or Q.6, Q.7 or Q.8.

Q.1 a) Briefly explain one-to-all broadcast and all-to-one reduction on an eight-node hypercube. How is the cost of communication found for one-to-all broadcast on an eight-node hypercube? [7 marks, CO3, PO3]
    b) Explain the different approaches to communication operations. What is the total exchange method? [6 marks, CO3, PO1]
    c) Explain improving the speed of some communication operations. [4 marks, CO3, PO1]
OR
Q.2 a) With suitable diagrams and examples, explain all-to-all broadcast and all-to-all reduction. [7 marks, CO3, PO2]
    b) Explain the scatter and gather communication operations with diagrams. [6 marks, CO3, PO1]
    c) Write a short note on the prefix-sum operation. [4 marks, CO3, PO1]

Q.3 a) Explain the performance metrics of parallel systems. [7 marks, CO4, PO2]
    b) What is granularity? What are the effects of granularity on the performance of parallel systems? [6 marks, CO4, PO2]
    c) Explain the sources of overhead in parallel systems. [4 marks, CO4, PO1]
OR
Q.4 a) Explain a parallel matrix-matrix multiplication algorithm with an example. [7 marks, CO4, PO1]
    b) Write a note on minimum and cost-optimal execution time. [6 marks, CO4, PO1]
    c) What are the scalability characteristics of parallel programs? [4 marks, CO4, PO1]

Q.5 a) What is CUDA? Draw and explain the CUDA architecture in detail. [8 marks, CO5, PO1]
    b) Justify parallel programming in CUDA-C. [6 marks, CO5, PO2]
    c) What is a kernel in CUDA? What is a kernel launch? [4 marks, CO5, PO2]
OR
Q.6 a) Explain how a CUDA-C program executes at the kernel level, with an example. [8 marks, CO5, PO2]
    b) Describe CUDA communication and synchronization along with the relevant CUDA-C functions. [6 marks, CO5, PO1]
    c) Write a short note on: managing GPU memory. [4 marks, CO5, PO1]

Q.7 a) What are the issues in sorting on parallel computers? Explain with an appropriate example. [8 marks, CO6, PO2]
    b) Explain the terms, with examples: i) bitonic sequence, ii) bitonic sort, iii) bitonic merge, iv) bitonic split. [6 marks, CO6, PO1]
    c) Explain parallel depth-first search with an example. [4 marks, CO6, PO2]
OR
Q.8 a) Explain odd-even transposition in bubble sort using the parallel formulation. Give one step-wise example solution using odd-even transposition. [8 marks, CO6, PO2]
    b) What is Kubernetes? Explain its features and applications. [6 marks, CO6, PO1]
    c) Indicate the sorting issues on parallel computers. [4 marks, CO6, PO1]
