
William Stallings
Computer Organization and Architecture
10th Edition

© 2016 Pearson Education, Inc., Hoboken, NJ. All rights reserved.

Chapter 17
Parallel Processing
Multiple Processor Organization

 Single instruction, single data (SISD) stream
  Single processor executes a single instruction stream to operate on data stored in a single memory
  Uniprocessors fall into this category

 Single instruction, multiple data (SIMD) stream
  A single machine instruction controls the simultaneous execution of a number of processing elements on a lockstep basis
  Vector and array processors fall into this category

 Multiple instruction, single data (MISD) stream
  A sequence of data is transmitted to a set of processors, each of which executes a different instruction sequence
  Not commercially implemented

 Multiple instruction, multiple data (MIMD) stream
  A set of processors simultaneously execute different instruction sequences on different data sets
  SMPs, clusters and NUMA systems fit this category
Symmetric Multiprocessor (SMP)

A stand-alone computer with the following characteristics:

 Two or more similar processors of comparable capacity
 Processors share the same memory and I/O facilities
 • Processors are connected by a bus or other internal connection
 • Memory access time is approximately the same for each processor
 All processors share access to I/O devices
 • Either through the same channels or through different channels giving paths to the same devices
 All processors can perform the same functions (hence “symmetric”)
 System controlled by an integrated operating system
 • Provides interaction between processors and their programs at the job, task, file, and data element levels
The bus organization has several
attractive features:

 Simplicity
 Simplest approach to multiprocessor organization

 Flexibility
 Generally easy to expand the system by attaching more
processors to the bus

 Reliability
 The bus is essentially a passive medium and the failure of
any attached device should not cause failure of the whole
system

Disadvantages of the bus organization:

 Main drawback is performance


 All memory references pass through the common bus
 Performance is limited by bus cycle time

 Each processor should have cache memory


 Reduces the number of bus accesses

 Leads to problems with cache coherence


 If a word is altered in one cache it could conceivably
invalidate a word in another cache
 To prevent this the other processors must be alerted that
an update has taken place
 Typically addressed in hardware rather than the operating
system

Multiprocessor Operating
System Design Considerations
 Simultaneous concurrent processes
 OS routines need to be reentrant to allow several processors to execute the same OS code simultaneously
 OS tables and management structures must be managed properly to avoid deadlock or invalid operations

 Scheduling
 Any processor may perform scheduling so conflicts must be avoided
 Scheduler must assign ready processes to available processors

 Synchronization
 With multiple active processes having potential access to shared address spaces or I/O resources, care
must be taken to provide effective synchronization
 Synchronization is a facility that enforces mutual exclusion and event ordering

 Memory management
 In addition to dealing with all of the issues found on uniprocessor machines, the OS needs to exploit the
available hardware parallelism to achieve the best performance
 Paging mechanisms on different processors must be coordinated to enforce consistency when several
processors share a page or segment and to decide on page replacement

 Reliability and fault tolerance


 OS should provide graceful degradation in the face of processor failure
 Scheduler and other portions of the operating system must recognize the loss of a processor and
restructure accordingly
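The synchronization point above can be illustrated with a minimal sketch, using Python's threading.Lock as a stand-in for OS-level mutual exclusion primitives (the thread count and iteration count are illustrative):

```python
import threading

counter = 0              # shared resource
lock = threading.Lock()  # enforces mutual exclusion on the shared resource

def worker(iterations):
    global counter
    for _ in range(iterations):
        with lock:       # only one thread may update the counter at a time
            counter += 1

threads = [threading.Thread(target=worker, args=(100_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# Deterministic because every update is serialized by the lock.
print(counter)
```

Without the lock, the read-modify-write on the shared counter could interleave between threads, which is exactly the kind of invalid operation the OS tables and management structures must be protected against.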
Cache Coherence
Software Solutions

 Attempt to avoid the need for additional hardware


circuitry and logic by relying on the compiler and
operating system to deal with the problem
 Attractive because the overhead of detecting potential
problems is transferred from run time to compile time,
and the design complexity is transferred from hardware
to software
 However, compile-time software approaches generally must
make conservative decisions, leading to inefficient cache
utilization

Cache Coherence
Hardware-Based Solutions
 Generally referred to as cache coherence protocols
 These solutions provide dynamic recognition at run time of
potential inconsistency conditions
 Because the problem is only dealt with when it actually
arises there is more effective use of caches, leading to
improved performance over a software approach
 Approaches are transparent to the programmer and the
compiler, reducing the software development burden
 Can be divided into two categories:
 Directory protocols
 Snoopy protocols

Directory Protocols

 Collect and maintain information about copies of data in cache
 Directory stored in main memory
 Requests are checked against directory
 Appropriate transfers are performed
 Creates central bottleneck
 Effective in large-scale systems with complex interconnection schemes
Snoopy Protocols
 Distribute the responsibility for maintaining cache coherence
among all of the cache controllers in a multiprocessor
 A cache must recognize when a line that it holds is shared with
other caches
 When updates are performed on a shared cache line, it must be
announced to other caches by a broadcast mechanism
 Each cache controller is able to “snoop” on the network to observe
these broadcast notifications and react accordingly

 Suited to bus-based multiprocessors because the shared bus provides a simple means for broadcasting and snooping
 Care must be taken that the increased bus traffic required for
broadcasting and snooping does not cancel out the gains from the
use of local caches

 Two basic approaches have been explored:
  Write invalidate
  Write update (or write broadcast)
Write Invalidate

 Multiple readers, but only one writer at a time


 When a write is required, all other caches of the line
are invalidated
 Writing processor then has exclusive (cheap) access
until line is required by another processor
 Most widely used in commercial multiprocessor
systems such as the x86 architecture
 State of every line is marked as modified, exclusive,
shared or invalid
 For this reason the write-invalidate protocol is called MESI

Write Update

Can be multiple readers and writers

When a processor wishes to update a shared line, the word to be updated is distributed to all others, and caches containing that line can update it
Some systems use an adaptive mixture of
both write-invalidate and write-update
mechanisms

MESI Protocol
To provide cache consistency on an SMP the data
cache supports a protocol known as MESI:

 Modified
 The line in the cache has been modified and is available
only in this cache

 Exclusive
 The line in the cache is the same as that in main memory
and is not present in any other cache

 Shared
 The line in the cache is the same as that in main memory
and may be present in another cache

 Invalid
 The line in the cache does not contain valid data
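The four states and a few representative transitions can be modeled as a small lookup table. This is an illustrative sketch, not the full protocol: the event names are invented for this example, and only a simplified subset of the real MESI transition diagram is shown:

```python
# Simplified sketch of MESI cache-line state transitions.
# States: M(odified), E(xclusive), S(hared), I(nvalid).
# Event names are hypothetical; only a representative subset is modeled.
TRANSITIONS = {
    # (current state, event) -> next state
    ("I", "local_read_miss_no_sharers"): "E",  # fetched; no other cache holds it
    ("I", "local_read_miss_sharers"):    "S",  # fetched; other caches hold it
    ("I", "local_write"):                "M",  # read-for-ownership; others invalidated
    ("E", "local_write"):                "M",  # silent upgrade; no invalidate needed
    ("E", "remote_read"):                "S",  # another cache reads the line
    ("S", "local_write"):                "M",  # invalidate broadcast to other caches
    ("S", "remote_write"):               "I",  # another cache writes; our copy invalid
    ("M", "remote_read"):                "S",  # supply data, write back, become shared
    ("M", "remote_write"):               "I",  # another cache takes ownership
}

def next_state(state, event):
    """Return the next MESI state; unknown (state, event) pairs are no-ops here."""
    return TRANSITIONS.get((state, event), state)

# A line starts Invalid; a local write takes it to Modified,
# and a remote read then demotes it to Shared.
s = "I"
s = next_state(s, "local_write")
s = next_state(s, "remote_read")
print(s)
```

The snooping behavior described earlier corresponds to the "remote_*" events: each cache controller observes broadcasts on the bus and applies the matching transition to its own copy of the line.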
Table 17.1
MESI Cache Line States

Multithreading and Chip Multiprocessors
 Processor performance can be measured by the rate at
which it executes instructions
 MIPS rate = f * IPC
 f = processor clock frequency, in MHz
 IPC = average instructions per cycle

 Increase performance by increasing clock frequency and increasing the number of instructions that complete during a cycle
 Multithreading
 Allows for a high degree of instruction-level parallelism
without increasing circuit complexity or power consumption
 Instruction stream is divided into several smaller streams,
known as threads, that can be executed in parallel
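The MIPS rate formula above can be checked with a quick computation (the frequency and IPC values are illustrative, not from the text):

```python
def mips_rate(f_mhz, ipc):
    """MIPS rate = f * IPC, with f the processor clock frequency in MHz
    and IPC the average number of instructions completed per cycle."""
    return f_mhz * ipc

# A 2 GHz (2000 MHz) processor averaging 1.5 instructions per cycle:
print(mips_rate(2000, 1.5))  # 3000.0 MIPS
```

The two levers in the formula correspond to the two approaches on this slide: raising f (clock frequency) and raising IPC (instruction-level or thread-level parallelism).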
Definitions of Threads and Processes

 Thread in multithreaded processors may or may not be the same as the concept of software threads in a multiprogrammed operating system
 Thread is concerned with scheduling and execution, whereas a process is concerned with both scheduling/execution and resource ownership

 Thread:
 • Dispatchable unit of work within a process
 • Includes processor context (which includes the program counter and stack pointer) and data area for stack
 • Executes sequentially and is interruptible so that the processor can turn to another thread

 Thread switch:
 • The act of switching processor control between threads within the same process
 • Typically less costly than a process switch

 Process:
 • An instance of a program running on a computer
 • Two key characteristics: resource ownership and scheduling/execution

 Process switch:
 • Operation that switches the processor from one process to another by saving all the process control data, registers, and other information for the first and replacing them with the process information for the second
Implicit and Explicit
Multithreading
 All commercial processors and most
experimental ones use explicit
multithreading
 Concurrently execute instructions from
different explicit threads
 Interleave instructions from different threads
on shared pipelines or parallel execution on
parallel pipelines

 Implicit multithreading is concurrent execution of multiple threads extracted from a single sequential program
  Implicit threads defined statically by compiler or dynamically by hardware

Approaches to Explicit Multithreading
 Interleaved
  Fine-grained
  Processor deals with two or more thread contexts at a time
  Switching thread at each clock cycle
  If thread is blocked it is skipped

 Blocked
  Coarse-grained
  Thread executed until event causes delay
  Effective on in-order processor
  Avoids pipeline stall

 Simultaneous (SMT)
  Instructions are simultaneously issued from multiple threads to execution units of superscalar processor

 Chip multiprocessing
  Processor is replicated on a single chip
  Each processor handles separate threads
  Advantage is that the available logic area on a chip is used effectively
Clusters
 Alternative to SMP as an approach to
providing high performance and high
availability
 Particularly attractive for server applications
 Defined as:
 A group of interconnected whole computers
working together as a unified computing resource
that can create the illusion of being one machine
 (The term whole computer means a system that
can run on its own, apart from the cluster)

 Each computer in a cluster is called a node
 Benefits:
 Absolute scalability
 Incremental scalability
 High availability
 Superior price/performance
Table 17.2
Clustering Methods: Benefits and
Limitations

Operating System Design
Issues
 How failures are managed depends on the clustering method
used
 Two approaches:
 Highly available clusters
 Fault tolerant clusters

 Failover
 The function of switching applications and data resources over from a failed
system to an alternative system in the cluster

 Failback
 Restoration of applications and data resources to the original system once
it has been fixed

 Load balancing
 Incremental scalability
 Automatically include new computers in scheduling
 Middleware needs to recognize that processes may switch between
machines
Parallelizing Computation

 Effective use of a cluster requires executing software from a single application in parallel
 Three approaches are:

 Parallelizing compiler
 • Determines at compile time which parts of an application can be executed in parallel
 • These are then split off to be assigned to different computers in the cluster

 Parallelized application
 • Application written from the outset to run on a cluster and uses message passing to move data between cluster nodes

 Parametric computing
 • Can be used if the essence of the application is an algorithm or program that must be executed a large number of times, each time with a different set of starting conditions or parameters

Clusters Compared to SMP
 Both provide a configuration with multiple processors to
support high demand applications
 Both solutions are available commercially

 SMP
  Easier to manage and configure
  Much closer to the original single processor model for which nearly all applications are written
  Less physical space and lower power consumption
  Well established and stable

 Clustering
  Far superior in terms of incremental and absolute scalability
  Superior in terms of availability
  All components of the system can readily be made highly redundant
Nonuniform Memory Access
(NUMA)
 Alternative to SMP and clustering
 Uniform memory access (UMA)
 All processors have access to all parts of main memory using loads and
stores
 Access time to all regions of memory is the same
 Access time to memory for different processors is the same

 Nonuniform memory access (NUMA)


 All processors have access to all parts of main memory using loads and
stores
 Access time of processor differs depending on which region of main
memory is being accessed
 Different processors access different regions of memory at different
speeds

 Cache-coherent NUMA (CC-NUMA)


 A NUMA system in which cache coherence is maintained among the
caches of the various processors
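The cost of nonuniform access can be quantified as a weighted average of local and remote latencies. This is an illustrative model with hypothetical numbers, not figures from the text:

```python
def avg_access_time(local_ns, remote_ns, remote_fraction):
    """Average memory access time when a given fraction of references
    go to remote nodes (illustrative NUMA model; latencies hypothetical)."""
    return (1 - remote_fraction) * local_ns + remote_fraction * remote_ns

# Hypothetical latencies: 100 ns local, 400 ns remote.
print(avg_access_time(100, 400, 0.1))  # mostly local references: ~130 ns
print(avg_access_time(100, 400, 0.6))  # mostly remote references: ~280 ns
```

The second case shows why placement matters in NUMA systems: once many references go to remote nodes, average latency climbs toward the remote figure and performance begins to break down, as the pros-and-cons slide below notes.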
Motivation

 SMP has a practical limit to the number of processors that can be used
 • Bus traffic limits to between 16 and 64 processors

 In clusters each node has its own private main memory
 • Applications do not see a large global memory
 • Coherency is maintained by software rather than hardware

 Objective with NUMA is to maintain a transparent system-wide memory while permitting multiple multiprocessor nodes, each with its own bus or internal interconnect system
 NUMA retains SMP flavor while giving large-scale multiprocessing
NUMA Pros and Cons

 Main advantage of a CC-NUMA system is that it can deliver effective performance at higher levels of parallelism than SMP without requiring major software changes
 Bus traffic on any individual node is limited to a demand that the bus can handle
 If many of the memory accesses are to remote nodes, performance begins to break down

 Does not transparently look like an SMP
 Software changes will be required to move an operating system and applications from an SMP to a CC-NUMA system
 Concern with availability
Deployment Models

 Public cloud
  The cloud infrastructure is made available to the general public or a large industry group and is owned by an organization selling cloud services
  Major advantage is cost

 Private cloud
  A cloud infrastructure implemented within the internal IT environment of the organization
  A key motivation for opting for a private cloud is security

 Community cloud
  Like a private cloud, it is not open to any subscriber
  Like a public cloud, the resources are shared among a number of independent organizations

 Hybrid cloud
  The cloud infrastructure is a composition of two or more clouds that remain unique entities but are bound together by standardized or proprietary technology that enables data and application portability
  Sensitive information can be placed in a private area of the cloud and less sensitive data can take advantage of the cost benefits of the public cloud
Cloud Computing Reference Architecture
 NIST SP 500-292 establishes a reference architecture,
described as:

“The NIST cloud computing reference architecture focuses on the requirements of “what” cloud services provide, not a “how to” design solution and implementation. The reference architecture is intended to facilitate the understanding of the operational intricacies in cloud computing. It does not represent the system architecture of a specific cloud computing system; instead it is a tool for describing, discussing, and developing a system-specific architecture using a common framework of reference.”

Summary
Chapter 17
Parallel Processing

 Multiple processor organizations
  Types of parallel processor systems
  Parallel organizations
 Symmetric multiprocessors
  Organization
  Multiprocessor operating system design considerations
 Cache coherence and the MESI protocol
  Software solutions
  Hardware solutions
  The MESI protocol
 Multithreading and chip multiprocessors
  Implicit and explicit multithreading
  Approaches to explicit multithreading
 Clusters
  Cluster configurations
  Operating system design issues
  Cluster computer architecture
  Blade servers
  Clusters compared to SMP
 Nonuniform memory access
  Motivation
  Organization
  NUMA pros and cons
 Cloud computing
  Cloud computing elements
  Cloud computing reference architecture