Introduction To Parallel Computing CIS 410/510 Department of Computer and Information Science
Lecture 1 – Overview
Outline
Course Overview
❍ What is CIS 410/510?
❍ What is expected of you?
Parallel Computing
❍ What is it?
❍ What motivates it?
Michael McCool
❍ Software architect
❍ Former Chief scientist, RapidMind
Arch Robison
❍ Architect of Threading Building Blocks
❍ Former lead developer of KAI C++
David MacKay
❍ Manager of software product consulting team
CIS 410/510 Graduate Assistants
Daniel Ellsworth
❍ 3rd year Ph.D. student
❍ Research advisor (Prof. Malony)
David Poliakoff
❍ 2nd year Ph.D. student
❍ Research advisor (Prof. Malony)
Brandon Hildreth
❍ 1st year Ph.D. student
❍ Research advisor (Prof. Malony)
❍ Algorithms (2 weeks)
❍ Tools (1 week)
❍ Applications (1 week)
No final exam
❍ Team project presentations during final period
Parallel Programming Term Project
Major programming project for the course
❍ Non-trivial parallel application
❍ Include performance analysis
Project teams
❍ 5-person teams, 6 teams (depending on enrollment)
❍ Will try our best to balance skills
Project dates
❍ Proposal due end of 4th week
❍ Project talk during last class
Topics of study
❍ Parallel architectures
❍ Parallel performance
❍ Parallel programming models and tools
❍ Parallel algorithms
❍ Parallel applications
What will you get out of CIS 410/510?
• In-depth understanding of parallel computer design
• Knowledge of how to program parallel computer systems
• Understanding of pattern-based parallel programming
• Exposure to different forms of parallel algorithms
• Practical experience using a parallel cluster
• Background on parallel performance modeling
• Techniques for empirical performance analysis
• Fun and new friends
Parallel Processing – What is it?
• A parallel computer is a computer system that uses
multiple processing elements simultaneously in a
cooperative manner to solve a computational problem
• Parallel processing includes techniques and
technologies that make it possible to compute in parallel
– Hardware, networks, operating systems, parallel libraries,
languages, compilers, algorithms, tools, …
• Parallel computing is an evolution of serial computing
– Parallelism is natural
– Computing problems differ in level / type of parallelism
• Parallelism is all about performance! Really?
Concurrency
• Consider multiple tasks to be executed in a computer
• Tasks are concurrent with respect to each other if
– They can execute at the same time (concurrent execution)
– Implies that there are no dependencies between the tasks
• Dependencies
– If a task requires results produced by other tasks in order to
execute correctly, the task’s execution is dependent
– If two tasks are dependent, they are not concurrent
– Some form of synchronization must be used to enforce (satisfy) dependencies (see the sketch below)
• Concurrency is fundamental to computer science
– Operating systems, databases, networking, …
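To make the dependency idea concrete, here is a minimal C++ sketch (my example, not from the slides): task B needs task A's result, so a synchronization point must enforce the dependency before B can execute.

    // Sketch: task B depends on task A's result, so B must synchronize
    // with A before it can execute correctly.
    #include <future>
    #include <iostream>

    int taskA() { return 6 * 7; }   // produces a result another task needs

    int main() {
        // Launch task A; it may execute concurrently with main.
        std::future<int> a = std::async(std::launch::async, taskA);

        // Task B is dependent on A: get() blocks until A's result exists.
        int b = a.get() + 1;        // get() is the synchronization point
        std::cout << "b = " << b << "\n";   // prints b = 43
        return 0;
    }

Compile with -pthread on most Linux toolchains. If the two tasks were independent, no such synchronization point would be needed and they would be concurrent.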
Concurrency and Parallelism
• Concurrent is not the same as parallel! Why?
• Parallel execution
– Concurrent tasks actually execute at the same time
– Multiple (processing) resources have to be available
• Parallelism = concurrency + “parallel” hardware
– Both are required
– Find concurrent execution opportunities
– Develop application to execute in parallel
– Run application on parallel hardware
• Is a parallel application a concurrent application?
• Is a parallel application run with one processor parallel? Why or why not? (see the sketch below)
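A small sketch of the distinction (again mine, not the lecture's): the program below expresses four concurrent tasks, but whether they actually execute at the same time depends on how many hardware threads the machine provides. Run on a single processor, the same concurrent program is simply time-sliced, so it is concurrent but not parallel.

    // Four concurrent tasks; parallel execution happens only if the
    // hardware offers more than one processing resource.
    #include <iostream>
    #include <string>
    #include <thread>
    #include <vector>

    int main() {
        // How many hardware threads are available (0 means "unknown")
        std::cout << "hardware threads: "
                  << std::thread::hardware_concurrency() << "\n";

        std::vector<std::thread> workers;
        for (int i = 0; i < 4; ++i)
            workers.emplace_back([i] {
                // Output may interleave: the tasks really are concurrent
                std::cout << "task " + std::to_string(i) + "\n";
            });
        for (auto& t : workers) t.join();  // wait for all tasks to finish
        return 0;
    }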
Parallelism
• There are granularities of parallelism (parallel execution) in
programs
– Processes, threads, routines, statements, instructions, …
– Think about which software elements execute concurrently (see the loop sketch below)
• These must be supported by hardware resources
– Processors, cores, … (execution of instructions)
– Memory, DMA, networks, … (other associated operations)
– All aspects of computer architecture offer opportunities for parallel
hardware execution
• Concurrency is a necessary condition for parallelism
– Where can you find concurrency?
– How is concurrency expressed to exploit parallel systems?
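One way to see granularity in code (a sketch, assuming a C++17 toolchain with parallel algorithms; with GCC you may need to link TBB, e.g. -ltbb): the concurrency here is per loop iteration (statement level), and the execution policy asks the runtime to map those fine-grained tasks onto cores.

    // Fine-grained (per-element) concurrency mapped onto parallel
    // hardware by the standard parallel execution policy (C++17).
    #include <algorithm>
    #include <execution>
    #include <numeric>
    #include <vector>

    int main() {
        std::vector<double> v(1'000'000);
        std::iota(v.begin(), v.end(), 0.0);   // v = 0, 1, 2, ...

        // Each element update is independent of the others, so all of
        // them are concurrent; std::execution::par lets the runtime run
        // them on multiple cores if the hardware has them.
        std::for_each(std::execution::par, v.begin(), v.end(),
                      [](double& x) { x = x * x; });
        return 0;
    }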
Why use parallel processing?
• Two primary reasons (both performance related)
– Faster time to solution (response time)
– Solve bigger computing problems (in the same time)
• Other factors motivate parallel processing
– Effective use of machine resources
– Cost efficiencies
– Overcoming memory constraints
• Serial machines have inherent limitations
– Processor speed, memory bottlenecks, …
• Parallelism has become the future of computing
• Performance is still the driving concern
• Parallelism = concurrency + parallel HW + performance (see the speedup formulas below)
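The slide does not define it here, but the standard way to quantify "faster time to solution" is speedup, with Amdahl's law bounding it when a fraction f of the work is serial:

    S(p) = \frac{T_1}{T_p}, \qquad S(p) \le \frac{1}{f + (1 - f)/p}

where T_1 is the one-processor time, T_p the time on p processors, and f the serial fraction. For example, with f = 0.1 the speedup can never exceed 10, no matter how many processors are used.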
Perspectives on Parallel Processing
• Parallel computer architecture
– What hardware is needed for parallel execution?
– Computer system design
• (Parallel) Operating system
– How to manage systems aspects in a parallel computer
• Parallel programming
– Libraries (low-level, high-level)
– Languages
– Software development environments
• Parallel algorithms
• Parallel performance evaluation
• Parallel tools
– Performance, analytics, visualization, …
Why study parallel computing today?
• Computing architecture
– Innovations often drive novel programming models
• Technological convergence
– The “killer micro” is ubiquitous
– Laptops and supercomputers are fundamentally similar!
– Trends cause diverse approaches to converge
• Technological trends make parallel computing inevitable
– Multi-core processors are here to stay!
– Practically every computing system is operating in parallel
• Understand fundamental principles and design tradeoffs
– Programming, systems support, communication, memory, …
– Performance
• Parallelism is the future of computing
Inevitability of Parallel Computing
• Application demands
– Insatiable need for computing cycles
• Technology trends
– Processor and memory
• Architecture trends
• Economics
• Current trends:
– Today’s microprocessors have multiprocessor support
– Servers and workstations available as multiprocessors
– Tomorrow’s microprocessors are multiprocessors
– Multi-core is here to stay and #cores/processor is growing
– Accelerators (GPUs, gaming systems)
Application Characteristics
• Application performance demands hardware advances
• Hardware advances generate new applications
• New applications have greater performance demands
– Exponential increase in microprocessor performance
– Innovations in parallel architecture and integration
[Diagram: cycle in which applications create performance demands, performance demands drive hardware advances, and hardware advances enable new applications]
• Range of performance requirements
– System performance must also improve as a whole
– Performance requirements require computer engineering
– Costs addressed through technology advancements
Broad Parallel Architecture Issues
• Resource allocation
– How many processing elements?
– How powerful are the elements?
– How much memory?
• Data access, communication, and synchronization
– How do the elements cooperate and communicate?
– How are data transmitted between processors?
– What are the abstractions and primitives for cooperation? (one common primitive is sketched below)
• Performance and scalability
– How does it all translate into performance?
– How does it scale?
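As one concrete answer to the cooperation question above, here is a minimal shared-memory sketch (my example, assuming C++ threads): a lock plus a condition variable, two of the most common primitives for communication and synchronization between processing elements.

    // Producer/consumer cooperation through a shared queue, protected
    // by a mutex and signaled with a condition variable.
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <thread>

    std::queue<int> shared_q;
    std::mutex m;
    std::condition_variable cv;

    void producer() {
        for (int i = 0; i < 3; ++i) {
            { std::lock_guard<std::mutex> lock(m); shared_q.push(i); }
            cv.notify_one();              // tell the consumer data is ready
        }
    }

    void consumer() {
        for (int n = 0; n < 3; ++n) {
            std::unique_lock<std::mutex> lock(m);
            cv.wait(lock, [] { return !shared_q.empty(); });  // synchronize
            std::cout << "received " << shared_q.front() << "\n";
            shared_q.pop();
        }
    }

    int main() {
        std::thread p(producer), c(consumer);
        p.join();
        c.join();
        return 0;
    }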
Leveraging Moore’s Law
[Plot: transistor counts keep rising while clock rates flatten at the power wall]
Example systems at the top of the Top500 (circa 2014):
❍ Tianhe-2: operations nodes with 4096 FT CPUs; proprietary TH Express-2 interconnect; 1 PB memory (host memory only); 12.4 PB global shared parallel storage; 162 cabinets (125 compute + 13 communication + 24 storage); ~750 m²
❍ Titan: 32 + 6 GB memory per node (CPU + GPU)
❍ Sequoia (Blue Gene/Q): 5-dimensional torus interconnection network; area of 3,000 ft²
❍ K computer: 80,000 SPARC64 VIIIfx CPUs; 640,000 cores
Mitsuhisa Sato
Barbara Chapman
Matthias Müller
???