Parallel & Distributed Computing
Lecture No. 01
                      Introduction
          Farhad M. Riaz
          [email protected]
          Department of Computer Science
                NUML, Islamabad
Course Pre-requisites
 •  Programming experience (preferably Python/C++/Java)
 •  Understanding of Computer Organization and Architecture
 •  Understanding of Operating Systems
Requirements & Grading
 •  Roughly:
     –  50% Final Exam
     –  25% Internal Evaluation
         •  Quiz: 8 marks
         •  Assignments: 8 marks
         •  Project: 9 marks
     –  25% Midterm Exam
Books
 •  Some good books are:
     –  Distributed Systems, 3rd Edition
     –  Principles of Parallel Programming
     –  Designing and Building Parallel Programs
     –  Distributed and Cloud Computing
Course Project
 •  At the end of the semester, students need to submit a semester project on a topic such as:
     –  Distributed computing & smart city services
     –  Large-scale convolutional neural networks
     –  Distributed computing with delay-tolerant networks
Course Overview
 •  This course covers the following main concepts:
     –  Concepts of parallel and distributed computing
     –  Analysis and profiling of applications
     –  Shared memory concepts
     –  Distributed memory concepts
     –  Parallel and distributed programming (OpenMP, MPI)
     –  GPU-based computing and programming (CUDA)
     –  Virtualization
     –  Cloud computing, MapReduce
     –  Grid computing
     –  Peer-to-peer computing
     –  Future trends in computing
Recommended Material
 •  Distributed Systems, Maarten van Steen & Andrew S. Tanenbaum, 3rd Edition (2020), Pearson.
 •  Parallel Programming: Concepts and Practice, Bertil Schmidt, Jorge Gonzalez-Dominguez, Christian Hundt & Moritz Schlarb, 1st Edition (2018), Elsevier.
 •  Parallel and High-Performance Computing, Robert Robey & Yuliana Zamora, 1st Edition (2021), Manning.
 •  Distributed and Cloud Computing: From Parallel Processing to the Internet of Things, Kai Hwang, Jack Dongarra & Geoffrey Fox, 1st Edition (2012), Elsevier.
 •  Multicore and GPU Programming: An Integrated Approach, Gerassimos Barlas, 2nd Edition (2015), Elsevier.
 •  Parallel Programming: For Multicore and Cluster Systems, Thomas Rauber & Gudula Rünger (2013), Springer.
Recent Jobs
Research in Parallel & Distributed Computing
Single Processor Architecture
Memory Hierarchy
5 Years of Technology Advance
Productivity Gap
Pipelining
Multicore Trend
Application Partitioning
High-Performance Computing (HPC)
 •  HPC is the use of parallel processing to run advanced application programs efficiently, reliably, and quickly.
 •  It applies especially to systems that operate above a teraFLOPS (10^12 floating-point operations per second).
 •  The term HPC is occasionally used as a synonym for supercomputing, although technically a supercomputer is a system that performs at or near the currently highest operational rate for computers.
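To make "floating-point operations per second" concrete, here is a minimal back-of-envelope C++ sketch (an illustration, not from the course material; the loop body and sizes are arbitrary) that times a serial loop and reports an achieved FLOP rate. A single core running this dependent chain lands well under the teraFLOPS threshold above:

    #include <chrono>
    #include <cstdio>

    int main() {
        const long n = 200'000'000;
        double a = 1.0;
        const double b = 1.000000001;
        auto t0 = std::chrono::steady_clock::now();
        for (long i = 0; i < n; ++i)
            a = a * b + 1e-9;            // 2 floating-point ops per iteration
        auto t1 = std::chrono::steady_clock::now();
        double sec = std::chrono::duration<double>(t1 - t0).count();
        // Printing 'a' keeps the compiler from optimizing the loop away.
        std::printf("%.2f GFLOP/s (check value %.6f)\n", 2.0 * n / sec / 1e9, a);
        return 0;
    }

An HPC system reaches teraFLOPS and beyond not by making this loop faster but by running many independent streams of such work in parallel.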
GPU-accelerated Computing
 •  GPU-accelerated computing is the use of a graphics processing unit (GPU) together with a CPU to accelerate deep learning, analytics, and engineering applications.
 •  Pioneered in 2007 by NVIDIA, GPU accelerators now power energy-efficient data centers in government labs, universities, enterprises, and small and medium businesses around the world.
 •  They play a huge role in accelerating applications in platforms ranging from artificial intelligence to cars, drones, and robots.
What is a GPU?
 •  It is a processor optimized for 2D/3D graphics, video, visual computing, and display.
 •  It is a highly parallel, highly multithreaded multiprocessor optimized for visual computing.
 •  It provides real-time visual interaction with computed objects via graphics, images, and video.
 •  It serves as both a programmable graphics processor and a scalable parallel computing platform.
 •  Heterogeneous systems combine a GPU with a CPU.
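To show what "a GPU together with a CPU" looks like in code, here is a minimal CUDA C++ sketch (an illustration under assumed names and sizes, not part of the slides): the CPU sets up the data, and the GPU runs one lightweight thread per array element:

    #include <cstdio>

    // Kernel: runs on the GPU, one thread per array element.
    __global__ void vecAdd(const float* a, const float* b, float* c, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) c[i] = a[i] + b[i];
    }

    int main() {
        const int n = 1 << 20;
        size_t bytes = n * sizeof(float);
        float *a, *b, *c;
        cudaMallocManaged(&a, bytes);   // unified memory visible to CPU and GPU
        cudaMallocManaged(&b, bytes);
        cudaMallocManaged(&c, bytes);
        for (int i = 0; i < n; ++i) { a[i] = 1.0f; b[i] = 2.0f; }
        vecAdd<<<(n + 255) / 256, 256>>>(a, b, c, n);  // 4096 blocks x 256 threads
        cudaDeviceSynchronize();                       // wait for the GPU to finish
        std::printf("c[0] = %f\n", c[0]);              // expect 3.0
        cudaFree(a); cudaFree(b); cudaFree(c);
        return 0;
    }

The <<<blocks, threads>>> launch is what turns one loop into roughly a million concurrent GPU threads; the CPU remains in charge of allocation, launch, and synchronization.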
SGI Altix supercomputer with 2,300 processors
HPC System Composition
Parallel Computers
 •  Virtually all stand-alone computers today are parallel from a hardware perspective:
     –  Multiple functional units (L1 cache, L2 cache, branch, pre-fetch, decode, floating-point, graphics processing (GPU), integer, etc.)
     –  Multiple execution units/cores
     –  Multiple hardware threads (a short sketch below shows how software can query this)
IBM BG/Q Compute Chip with 18 cores (PU) and 16 L2 Cache units (L2)
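A one-line way to see this hardware parallelism from software, as a minimal C++ sketch (an illustration, not from the slides):

    #include <iostream>
    #include <thread>

    int main() {
        // Number of concurrent hardware threads (cores x SMT) the OS reports;
        // 0 means the value could not be determined.
        unsigned n = std::thread::hardware_concurrency();
        std::cout << "Hardware threads available: " << n << "\n";
        return 0;
    }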
Parallel Computers
 •  Networks connect multiple stand-alone computers (nodes) to make larger parallel computer clusters.
 •  Parallel computer cluster:
     –  Each compute node is a multi-processor parallel computer in itself.
     –  Multiple compute nodes are networked together with an InfiniBand network.
     –  Special-purpose nodes, also multi-processor, are used for other purposes.
Types of Parallel and Distributed Computing
 •  Parallel Computing
     –  Shared memory
     –  Distributed memory
 •  Distributed Computing
     –  Cluster computing
     –  Grid computing
     –  Cloud computing
     –  Distributed pervasive systems
Parallel Computing
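As a concrete taste of the shared-memory model in the taxonomy above, here is a minimal OpenMP sketch (an illustration, assuming a compiler with OpenMP support, e.g. g++ -fopenmp): all threads share one address space and split the loop's iterations among the cores of a single machine:

    #include <cstdio>
    #include <omp.h>
    #include <vector>

    int main() {
        std::vector<double> x(1'000'000, 1.0);   // data visible to every thread
        double sum = 0.0;
        // OpenMP divides the iterations among the available cores;
        // reduction(+:sum) gives each thread a private partial sum.
        #pragma omp parallel for reduction(+ : sum)
        for (long i = 0; i < (long)x.size(); ++i)
            sum += x[i];
        std::printf("sum = %.0f (max threads = %d)\n", sum, omp_get_max_threads());
        return 0;
    }

Contrast this with the distributed-memory MPI sketch on the cluster slide below, where data would have to be explicitly partitioned and communicated.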
Distributed (Cluster) Computing
 •  Essentially a group of high-end systems connected through a LAN
 •  Homogeneous: same OS, near-identical hardware
 •  Single managing node
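In code, the cluster model is distributed memory: each process owns its own address space, possibly on a different node, and processes cooperate by passing messages. A minimal MPI sketch (an illustration, assuming an MPI installation; build with mpicxx, run with e.g. mpirun -np 4):

    #include <cstdio>
    #include <mpi.h>

    int main(int argc, char** argv) {
        MPI_Init(&argc, &argv);
        int rank, size;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);   // this process's id
        MPI_Comm_size(MPI_COMM_WORLD, &size);   // total number of processes
        char name[MPI_MAX_PROCESSOR_NAME];
        int len;
        MPI_Get_processor_name(name, &len);     // which node we are on
        std::printf("Hello from rank %d of %d on node %s\n", rank, size, name);
        MPI_Finalize();
        return 0;
    }

Launched across a cluster, each rank prints a different hostname, because the processes really do live on different machines.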
Distributed (Grid) Computing
 •  Lots of nodes from everywhere:
     –  Heterogeneous
     –  Dispersed across several organizations
     –  Can easily span a wide-area network
 •  To allow for collaboration, grids generally use virtual organizations.
 •  In essence, this is a grouping of users (or their IDs) that allows for authorization on resource allocation.
Distributed (Cloud) Computing
Distributed (Pervasive) Computing
 •  Emerging next generation of distributed systems in which nodes are small, mobile, and often embedded in a larger system, characterized by the fact that the system naturally blends into the user's environment.
 •  Three subtypes:
     –  Ubiquitous computing systems: pervasive and continuously present, i.e., there is continuous interaction between system and user.
     –  Mobile computing systems: pervasive, but the emphasis is on the fact that devices are inherently mobile.
     –  Sensor (and actuator) networks: pervasive, with emphasis on the actual (collaborative) sensing and actuation of the environment.
Why Use Parallel Computing?
The Real World is Massively Parallel
 •  In the natural world, many complex, interrelated events are happening at the same time, yet within a temporal sequence.
 •  Compared to serial computing, parallel computing is much better suited for modeling, simulating, and understanding complex, real-world phenomena.
 •  For example, imagine modeling such phenomena serially, one event at a time.
Save Time and/or Money (Main Reasons)
 •  In theory, throwing more resources at a task will shorten its time to completion, with potential cost savings.
 •  Parallel computers can be built from cheap, commodity components.
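The "in theory" caveat can be made precise with Amdahl's law (standard background, not stated on the slide): if a fraction f of a program parallelizes perfectly across p processors, the speedup is

    S(p) = \frac{1}{(1 - f) + f/p}

For example, with f = 0.9 and p = 8, S = 1 / (0.1 + 0.9/8) ≈ 4.7, well short of 8x: the serial 10% caps the benefit of adding more resources.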
Solve Larger / More Complex Problems (Main Reasons)
 •  Many problems are so large and/or complex that it is impractical or impossible to solve them on a single computer, especially given limited computer memory.
 •  Example: web search engines/databases processing millions of transactions every second.
Provide Concurrency (Main Reasons)
 •  A single compute resource can only do one thing at a time. Multiple compute resources can do many things simultaneously.
 •  Example: collaborative networks provide a global venue where people from around the world can meet and conduct work "virtually".
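A minimal C++ sketch of multiple compute resources doing two things at once (an illustration; the thread bodies are arbitrary):

    #include <cstdio>
    #include <thread>

    // Each std::thread can be scheduled onto its own core, so the two
    // calls to work() may truly run at the same time.
    void work(const char* who) {
        std::printf("%s is working\n", who);  // one printf call per line
    }

    int main() {
        std::thread t1(work, "thread 1");
        std::thread t2(work, "thread 2");
        t1.join();   // wait for both threads to finish
        t2.join();
        return 0;
    }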
Make Better Use of Underlying Parallel Hardware (Main Reasons)
 •  Modern computers, even laptops, are parallel in architecture, with multiple processors/cores.
 •  Parallel software is specifically intended for parallel hardware with multiple cores, threads, etc.
 •  In most cases, serial programs running on modern computers "waste" potential computing power.
Intel Xeon processor with 6 cores and 6 L3 cache units
The Future (Main Reasons)
 •  During the past 20+ years, the trends indicated by ever-faster networks, distributed systems, and multi-processor computer architectures (even at the desktop level) clearly show that parallelism is the future of computing.
 •  In this same time period, there has been a greater than 500,000x increase in supercomputer performance, with no end currently in sight.
 •  The race is already on for exascale computing!
 •  1 exaFLOPS = 10^18 floating-point calculations per second (a million teraFLOPS).
That’s all for today!!