1.3 Abstract Machine Models in Parallel Computing

Uploaded by

Sneh Mehra628

Abstract Machine Models

Content

• Abstract Machine Models in Parallel Computing
• The Random Access Machine (RAM)
• The Parallel Random Access Machine (PRAM)
  • Exclusive Read Exclusive Write (EREW) PRAM
  • Concurrent Read Exclusive Write (CREW) PRAM
  • Exclusive Read Concurrent Write (ERCW) PRAM
  • Concurrent Read Concurrent Write (CRCW) PRAM
Abstract Machine Models in Parallel Computing

• Abstract machine models are simplified representations of parallel computers
that capture their key features.
• These models help in the design, analysis, and comparison of parallel algorithms
by making certain assumptions about how a parallel computer operates.
• Even though these assumptions might not be entirely practical, they offer valuable
insights:
1. Understanding Inherent Parallelism: By designing algorithms for these
models, one can understand how much of a problem can be parallelized, which
parts can be executed simultaneously, and which require sequential processing.
2. Comparing Computational Power: These models allow different parallel computing
architectures to be compared, to determine which is more efficient or more suitable
for specific problems.
3. Choosing the Right Architecture: By analyzing how different models handle
various computational tasks, it becomes easier to select the most appropriate parallel
computing architecture for a given problem.

Note: The sections that follow describe several abstract machine models for parallel computers.
The Random Access Machine (RAM)

Fig 3.1 RAM model [1]


The RAM model abstracts the sequential computer. It consists of:
1. A memory unit with M locations. Theoretically speaking, M can be unbounded.
2. A processor that operates under the control of a sequential algorithm. The
processor can read data from a memory location, write to a memory location,
and perform basic arithmetic and logical operations.
3. A Memory Access Unit (MAU), which creates a path from the processor to an
arbitrary location in the memory.
To access memory, the processor provides the MAU with the address of the location
that it wishes to access and the operation (read or write) that it wishes to perform.
The MAU uses this address to establish a direct connection between the processor
and the memory location.
Each step of an algorithm for the RAM model consists of (up to) three basic phases:
1. Read: The processor reads a datum from the memory. This datum is usually
stored in one of its local registers.
2. Execute: The processor performs a basic arithmetic or logic operation on the
contents of one or two of its registers.
3. Write: The processor writes the contents of one register into an arbitrary
memory location.
Note: For the purpose of analysis, we assume that each of these phases takes
constant, i.e., O(1) time.
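
The three phases can be sketched in Python as follows. This is only an illustration under assumptions: the names ram_step and ram_sum, the register names r0 and r1, and the choice to keep the running total in the last memory cell are hypothetical and do not come from the RAM model itself.

```python
def ram_step(memory, registers, read_addr, op, write_addr):
    """Run one RAM step: read, execute, write. Each phase counts as O(1)."""
    # Read: copy one datum from memory into local register r0.
    registers["r0"] = memory[read_addr]
    # Execute: apply a basic operation to the contents of registers r1 and r0.
    registers["r1"] = op(registers["r1"], registers["r0"])
    # Write: store the contents of register r1 into an arbitrary location.
    memory[write_addr] = registers["r1"]


def ram_sum(values):
    """Sum `values` using one RAM step per element; returns (total, steps)."""
    memory = list(values) + [0]      # assumption: last cell holds the total
    out = len(values)
    registers = {"r0": 0, "r1": 0}
    steps = 0
    for addr in range(len(values)):
        ram_step(memory, registers, addr, lambda a, b: a + b, out)
        steps += 1
    return memory[out], steps
```

Since every step costs O(1), summing n values this way takes n steps, for example ram_sum([1, 2, 3, 4, 5, 6, 7, 8]) returns (36, 8).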
The Parallel Random Access Machine (PRAM)

Fig 3.2 PRAM model [2]


The PRAM is one of the most popular models for designing parallel algorithms. It
consists of:
1. A set of N identical processors (P1, P2, ..., PN). In principle, N is unbounded.
2. A memory with M locations that is shared by all N processors. Again, in
principle, M is unbounded.
3. An MAU that allows the processors to access the shared memory.

As in the case of RAM, each step of an algorithm here consists of the following
phases:
1. Read: (Up to) N processors read simultaneously (in parallel) from (up to) N
memory locations (in the common memory) and store the values in their local
registers.
2. Compute: (Up to) N processors perform basic arithmetic or logical operations on
the values in their registers.
3. Write: (Up to) N processors write simultaneously into (up to) N memory
locations from their registers.

• Each of the phases, READ, COMPUTE, WRITE, is assumed to take O(1) time as
in the case of RAM.
• Notice that not all processors need to execute a given step of the algorithm. When
a subset of processors execute a step, the other processors remain idle during that
time.
• The algorithm for a PRAM has to specify which subset of processors should be
active during the execution of a step.
• In the above model, a problem might arise when more than one processor tries to
access the same memory location at the same time.
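
A single synchronous PRAM step can be simulated sequentially, as in the minimal sketch below. The function name pram_step and its parameters are hypothetical, and the sketch assumes that the active processors access distinct locations. It does not detect or resolve the conflicts mentioned above.

```python
def pram_step(shared, active, read_addr, compute, write_addr):
    """Simulate one synchronous PRAM step for the processors in `active`.

    read_addr, write_addr: functions mapping a processor id to an address.
    compute: function mapping (processor id, value read) to the value to write.
    Processors not listed in `active` stay idle during this step.
    """
    # Read phase: every active processor loads one value into a local register.
    registers = {p: shared[read_addr(p)] for p in active}
    # Compute phase: each active processor works only on its own register.
    results = {p: compute(p, registers[p]) for p in active}
    # Write phase: all writes happen only after all reads have finished.
    for p in active:
        shared[write_addr(p)] = results[p]
    return shared
```

For example, pram_step([1, 2, 3, 4], [0, 2], lambda p: p, lambda p, v: v * 2, lambda p: p) doubles cells 0 and 2 and leaves the idle processors' cells unchanged, giving [2, 2, 6, 4].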

The PRAM model can be subdivided into four categories based on the way
simultaneous memory accesses are handled.
Exclusive Read Exclusive Write (EREW) PRAM: In this model, every access to a
memory location (read or write) has to be exclusive. This model provides the least
amount of memory concurrency and is therefore the weakest.
Concurrent Read Exclusive Write (CREW) PRAM: In this model, only write
operations to a memory location are exclusive. Two or more processors can
concurrently read from the same memory location. This is one of the most
commonly used models.

Exclusive Read Concurrent Write (ERCW) PRAM: This model allows multiple
processors to concurrently write into the same memory location. The read operations
are exclusive. This model is not frequently used and is defined here only for the sake
of completeness.
Concurrent Read Concurrent Write (CRCW) PRAM: This model allows both
multiple read and multiple write operations to a memory location. It provides the
maximum amount of concurrency in memory access and is the most powerful of the
four models.
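
To make the four categories concrete, the sketch below inspects the addresses accessed in one step and reports the least permissive variant that allows them. The function name least_permissive_model is hypothetical. The sketch assumes that reads and writes belong to separate phases, so a read and a write to the same address in one step do not count as a conflict.

```python
from collections import Counter


def least_permissive_model(reads, writes):
    """Return the least permissive PRAM variant that allows one step's accesses.

    reads, writes: lists of memory addresses, one entry per accessing processor.
    """
    # An address that appears more than once in a phase is accessed concurrently.
    concurrent_read = any(n > 1 for n in Counter(reads).values())
    concurrent_write = any(n > 1 for n in Counter(writes).values())
    if concurrent_read and concurrent_write:
        return "CRCW"
    if concurrent_read:
        return "CREW"
    if concurrent_write:
        return "ERCW"
    return "EREW"
```

For example, least_permissive_model([0, 0], [1, 2]) returns "CREW", because two processors read address 0 while every write goes to a different address.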
Exclusive Read Exclusive Write (EREW) PRAM

Key Characteristics of EREW PRAM


Exclusive Read: No two processors can read from the same memory location
simultaneously. Each memory read operation must be exclusive to a single
processor at any given time.
Exclusive Write: No two processors can write to the same memory location
simultaneously. Each memory write operation must be exclusive to a single
processor at any given time.
Why Use EREW PRAM?

Synchronization: The EREW model requires the algorithm to coordinate its processors
so that no two of them read or write the same memory location in the same step. This
removes the risk of data inconsistencies that concurrent access can cause.
Simple to Implement: The strict exclusivity rules make it easier to avoid race
conditions, where the outcome depends on the order or timing of accesses, and no
rule for resolving conflicting accesses is needed.
Example of EREW PRAM

Let's consider a simple example to illustrate the EREW PRAM model.


Problem: Sum Elements of an Array
Suppose we have an array of integers, and we want to calculate the sum of its
elements using multiple processors. We will use EREW PRAM to ensure that no
two processors read from or write to the same memory location at the same time.
Array: [1, 2, 3, 4, 5, 6, 7, 8]
Number of Processors: 4
Shared Memory: A variable sum to store the total sum.

Steps:
1. Divide the Array: Divide the array into equal parts for each processor. If the
array has 8 elements and 4 processors, each processor will handle 2 elements.
Processor P1: Handles [1, 2]
Processor P2: Handles [3, 4]
Processor P3: Handles [5, 6]
Processor P4: Handles [7, 8]

2. Each Processor Computes Local Sum: Each processor reads its assigned
elements and computes a local sum.
P1 computes local_sum1 = 1 + 2 = 3
P2 computes local_sum2 = 3 + 4 = 7
P3 computes local_sum3 = 5 + 6 = 11
P4 computes local_sum4 = 7 + 8 = 15
3. Write to Shared Memory: Each processor, one at a time, adds its local sum to the
shared sum variable. The exclusive write rule is enforced, so only one processor can
update sum at a time.
Initially, sum = 0.
P1 writes: sum = sum + local_sum1 = 0 + 3 = 3
P2 writes: sum = sum + local_sum2 = 3 + 7 = 10
P3 writes: sum = sum + local_sum3 = 10 + 11 = 21
P4 writes: sum = sum + local_sum4 = 21 + 15 = 36
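
The steps of this example can be reproduced with a short sequential simulation. This is a sketch only: the function name erew_sum is hypothetical, and it assumes the array length is a multiple of the number of processors.

```python
def erew_sum(array, num_processors):
    """Follow the three steps of the example; returns (total, local_sums)."""
    chunk = len(array) // num_processors   # assumes an even split
    # Steps 1 and 2: processor p reads only its own slice, so no array cell
    # is read by two processors.
    local_sums = [sum(array[p * chunk:(p + 1) * chunk])
                  for p in range(num_processors)]
    # Step 3: processors update the shared cell one at a time, so every read
    # and every write of `total` is exclusive.
    total = 0
    for local in local_sums:
        total = total + local
    return total, local_sums
```

Calling erew_sum([1, 2, 3, 4, 5, 6, 7, 8], 4) returns (36, [3, 7, 11, 15]), matching the values above. Note that step 3 takes one update per processor, because the updates cannot overlap.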
Summary of EREW PRAM model

In the EREW PRAM model:


• Each read and write operation is exclusive to one processor at a time.
• The processors work in a coordinated manner to avoid conflicts over shared
memory access.
• This model simplifies the design of parallel algorithms by ensuring there is no
concurrent access to any memory location, thus eliminating race conditions.
Concurrent Read Exclusive Write (CREW) PRAM

The Concurrent Read Exclusive Write (CREW) PRAM is a parallel computing
model that allows multiple processors to read from the same memory location
simultaneously but restricts the ability to write such that only one processor can
write to any specific memory location at a time. This model balances the need for
parallelism with control over data integrity during write operations.
Key Characteristics of CREW PRAM

Concurrent Read (CR): Multiple processors can read from the same memory
location at the same time. This feature allows shared access to data without
contention, which is useful for parallel tasks that need to access common
information.
Exclusive Write (EW): Only one processor can write to a particular memory
location at any given time. This restriction prevents conflicts that could arise from
simultaneous write operations, ensuring data consistency.
Why Use CREW PRAM?

Increased Parallelism: By allowing concurrent reads, CREW PRAM enables a
higher degree of parallelism, which can lead to faster execution times for algorithms
that rely heavily on read operations.
Data Integrity: The exclusive write constraint ensures that the data remains
consistent and uncorrupted, as no two processors can overwrite the same memory
location simultaneously.
Example of CREW PRAM

Problem: Summing Elements of an Array


Suppose we have an array of integers, and we want to compute the sum of all
elements using multiple processors in parallel. We'll use the CREW PRAM model
to do this.
Array: [3, 6, 2, 8, 4, 5, 9, 1]
Number of Processors: 4
Shared Memory: A variable total_sum to store the current sum.
Step 1:
Initialization: Initialize total_sum to 0. Assume each processor will work on a section of
the array.
Initially, total_sum = 0.

Step 2:
Divide the Array: Split the array into parts for each processor to handle. If there are 8
elements and 4 processors, each processor will handle 2 elements.
Processor P1: [3, 6]
Processor P2: [2, 8]
Processor P3: [4, 5]
Processor P4: [9, 1]
Step 3:
Concurrent Read Phase:
Each processor reads the current total_sum from shared memory. This read operation is
concurrent. All processors read total_sum = 0 initially.
Step 4:
Local Computation:
Each processor computes the sum of its assigned section.
P1 computes local_sum1 = 3 + 6 = 9
P2 computes local_sum2 = 2 + 8 = 10
P3 computes local_sum3 = 4 + 5 = 9
P4 computes local_sum4 = 9 + 1 = 10
Step 5:
Exclusive Write Phase:
Each processor adds its local sum to total_sum in shared memory. This write
operation is exclusive, meaning only one processor can write to total_sum at any given
time. The updates are therefore applied one after another, and each processor adds its
local sum to the value left by the previous writer, not to the 0 read in Step 3.
Processor P1 writes first, total_sum = 0 + 9 = 9.
Processor P2 writes next, total_sum = 9 + 10 = 19.
Processor P3 writes next, total_sum = 19 + 9 = 28.
Processor P4 writes last, total_sum = 28 + 10 = 38.
Step 6:
Final Result: After all processors have had a chance to write, the total_sum holds the sum
of all elements in the array, which is 38.
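
The six steps can be traced with the sequential sketch below. The function name crew_sum and the returned trace are hypothetical, and the sketch assumes the array splits evenly among the processors. It also assumes each processor uses the current value of total_sum when its turn to write comes, which is what the arithmetic in Step 5 implies.

```python
def crew_sum(array, num_processors):
    """Trace the CREW example; returns (total_sum, write_trace, snapshots)."""
    chunk = len(array) // num_processors   # assumes an even split
    shared = {"total_sum": 0}              # Step 1: initialization
    # Step 3: all processors read total_sum in the same step (concurrent read).
    snapshots = [shared["total_sum"] for _ in range(num_processors)]
    # Step 4: each processor sums its own section of the array (Step 2 split).
    local_sums = [sum(array[p * chunk:(p + 1) * chunk])
                  for p in range(num_processors)]
    # Step 5: writes are exclusive, so the updates are applied one at a time.
    write_trace = []
    for local in local_sums:
        shared["total_sum"] += local
        write_trace.append(shared["total_sum"])
    return shared["total_sum"], write_trace, snapshots
```

For the array above, crew_sum([3, 6, 2, 8, 4, 5, 9, 1], 4) returns (38, [9, 19, 28, 38], [0, 0, 0, 0]): every processor sees 0 in the concurrent read, and the exclusive writes build up the total in four separate updates.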
Summary

Allows Concurrent Reads: Multiple processors can read from the same memory
location simultaneously, facilitating efficient data access in parallel algorithms.
Restricts Write Operations: Only one processor can write to a memory location at
a time, ensuring data consistency and preventing race conditions.
Practical Application: The CREW PRAM model is commonly used in scenarios
where data needs to be read by multiple processors for computation, but writes must
be controlled to avoid conflicts.
Exclusive Read Concurrent Write (ERCW) PRAM

It is a theoretical model of parallel computation that allows multiple processors to
write to the same memory location simultaneously, while read operations from any
memory location must be exclusive.
In this model, concurrent writes are permitted, but only one processor can read from
a given memory location at any given time. This model is useful in scenarios where
write operations are frequent and need to be performed concurrently to improve
efficiency.
Key Characteristics of ERCW PRAM

Exclusive Read (ER): Only one processor can read from a specific memory
location at a time. If multiple processors attempt to read from the same memory
location simultaneously, only one read operation is allowed.
Concurrent Write (CW): Multiple processors are allowed to write to the same
memory location at the same time. However, to resolve conflicts in concurrent write
operations, a predefined rule or protocol must be in place (e.g., sum the values, pick
the maximum, overwrite, etc.).
Why Use ERCW PRAM?

Efficiency in Writing: The ability to write concurrently allows the system to handle
multiple updates at once, which can be useful in aggregating results or updating
shared counters.
Scalability: In scenarios where multiple processors need to update a shared value
(like a sum or count), ERCW allows this to happen concurrently, improving
scalability and performance.
Example of ERCW PRAM

Problem: Aggregating Votes in an Election


Imagine we have a voting system where votes are counted in parallel using multiple
processors. Each processor counts the votes from a subset of ballots, and we want to
aggregate these counts into a shared memory location representing the total votes for
each candidate.

Scenario:
We have 4 processors (P1, P2, P3, P4).
Each processor is assigned to count votes for a specific candidate from different
ballot boxes.
The shared memory has an array where each index corresponds to a candidate, and
the value at each index represents the total votes for that candidate.
We will focus on aggregating votes for Candidate A.

Steps:
Initialization: Initialize the shared memory entry for Candidate A's votes to 0. Let's
denote this memory location as totalVotesForA.
Exclusive Read Phase: Each processor reads its portion of the ballots to count
votes for Candidate A. Since the read operations are exclusive, each processor reads
different ballot subsets. There is no contention for read operations.
P1 reads 10 votes for Candidate A.
P2 reads 15 votes for Candidate A.
P3 reads 20 votes for Candidate A.
P4 reads 25 votes for Candidate A.

Concurrent Write Phase: Now, all processors write their counts to the shared
memory location totalVotesForA simultaneously. ERCW allows this concurrent
write. To handle this, we define a rule to aggregate these votes (e.g., sum the
values).
All processors write to totalVotesForA at the same time:
P1 writes 10
P2 writes 15
P3 writes 20
P4 writes 25
Conflict Resolution: Since multiple writes are happening concurrently, we use a
rule (e.g., summing) to handle the writes:
totalVotesForA = 10 + 15 + 20 + 25
totalVotesForA = 70
Final Result: The shared memory location totalVotesForA now holds the total
number of votes for Candidate A, which is 70.
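
The concurrent write with a summing rule can be sketched as follows. The names combining_write and count_votes are hypothetical, and the sketch models the simultaneous writes as a list of pending values that a single rule combines.

```python
def combining_write(shared, address, pending, combine):
    """Resolve all values written to one address in the same step.

    pending: the values the processors try to write simultaneously.
    combine: the conflict-resolution rule, e.g. the built-in sum.
    """
    shared[address] = combine(pending)
    return shared


def count_votes(counts_per_processor):
    """Aggregate per-processor counts for Candidate A with the summing rule."""
    shared = {"totalVotesForA": 0}          # initialization
    # Concurrent write phase: all counts arrive in the same step.
    combining_write(shared, "totalVotesForA", counts_per_processor, sum)
    return shared["totalVotesForA"]
```

With the counts from the example, count_votes([10, 15, 20, 25]) returns 70. Passing max or min instead of sum to combining_write would model a different conflict-resolution rule.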
Summary

The ERCW PRAM model:


Allows Concurrent Writes: Multiple processors can write to the same memory
location at the same time, which helps in tasks that require aggregation of data.
Restricts Read Operations: Only one processor can read from a specific memory
location at a time, which helps in avoiding inconsistencies during read operations.
Practical Applications: ERCW is useful in scenarios like vote counting, real-time
data aggregation, or updating shared counters where multiple processors need to
update a common value efficiently.
Conclusion

The ERCW PRAM model is an abstraction that facilitates understanding of how
multiple processors can work together to perform tasks that require frequent updates
to shared data.
By allowing concurrent writes but restricting reads to be exclusive, ERCW finds a
balance that can be beneficial in specific parallel processing scenarios.
Understanding this model helps in designing efficient parallel algorithms for data
aggregation and other similar tasks.
Concurrent Read Concurrent Write (CRCW) PRAM
It is a theoretical model of parallel computation that allows multiple processors to
read from and write to the same memory location simultaneously.
This model provides the highest level of concurrency among PRAM models,
making it the most powerful in terms of parallel processing capabilities.
However, it also introduces challenges in handling simultaneous write operations,
which must be managed using predefined rules.
Key Characteristics of CRCW PRAM
Concurrent Read (CR): Multiple processors can read from the same memory location at
the same time without any restrictions. This feature allows for efficient sharing of data
among processors.
Concurrent Write (CW): Multiple processors can write to the same memory location
simultaneously. To handle conflicts that arise from concurrent write operations, various
strategies or rules are applied, such as:
Priority: Each processor has a fixed priority, and the write from the highest-priority
processor succeeds.
Common: All processors must write the same value; if they don't, the behavior is undefined.
Arbitrary: One of the processors writing to the location is chosen arbitrarily to succeed,
so the algorithm must be correct whichever one wins.
Sum: The values from all processors are summed and stored in the memory location.
Minimum/Maximum: The minimum or maximum value from all processors is written to
the memory location.
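
The write rules listed above can be compared side by side with a small sketch. The function name resolve_write and the rule strings are hypothetical. Two further assumptions are made for illustration: under the priority rule the processor with the lowest id wins, and under the arbitrary rule the winner is picked with random.choice, which is only one possible way of choosing.

```python
import random


def resolve_write(requests, rule):
    """Return the value stored after simultaneous writes to one location.

    requests: non-empty list of (processor_id, value) pairs for one step.
    rule: "priority", "common", "arbitrary", "sum", "min" or "max".
    """
    values = [value for _, value in requests]
    if rule == "priority":
        # Assumption: the lowest processor id has the highest priority.
        return min(requests, key=lambda request: request[0])[1]
    if rule == "common":
        # All writers must agree; the model leaves other cases undefined,
        # so this sketch rejects them.
        if len(set(values)) != 1:
            raise ValueError("common rule requires identical values")
        return values[0]
    if rule == "arbitrary":
        # Any single writer may succeed.
        return random.choice(values)
    if rule == "sum":
        return sum(values)
    if rule == "min":
        return min(values)
    if rule == "max":
        return max(values)
    raise ValueError(f"unknown rule: {rule}")
```

For the requests [(2, 15), (1, 23), (3, 12), (4, 19)], the priority rule stores 23 (written by processor 1), the sum rule stores 69, the min rule stores 12, and the max rule stores 23.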
Example of CRCW PRAM

Problem: Finding the Maximum Element in an Array
Imagine we have an array of integers, and we want to find the maximum element using
multiple processors in parallel. We will use the CRCW PRAM model to efficiently find
the maximum value.
Scenario:
We have an array A with 8 elements: [3, 15, 8, 23, 7, 12, 19, 5].
We use 4 processors (P1, P2, P3, and P4) to find the maximum value.
Each processor will have access to a shared memory location called max where the
maximum value will be stored.

Steps:
Initialization: Set the shared memory location max to a very low value (e.g., -∞).
This ensures that any element in the array will be greater than this initial value.
Concurrent Read Phase: Divide the array among the processors. Each processor
reads a segment of the array to find the local maximum.
P1 reads [3, 15] and finds the local maximum as 15.
P2 reads [8, 23] and finds the local maximum as 23.
P3 reads [7, 12] and finds the local maximum as 12.
P4 reads [19, 5] and finds the local maximum as 19.

Concurrent Write Phase: All processors try to write their local maximum to the
shared memory location max simultaneously. A predefined rule is used to resolve
the concurrent write:
a) Suppose we use the maximum rule: the highest value among those being
written is stored in max.
b) P1 tries to write 15, P2 tries to write 23, P3 tries to write 12, and P4 tries to write
19.
c) According to the maximum rule, max will be set to 23, as it is the highest value
among the ones being written.
Final Result: The shared memory location max now holds the value 23, which is
the maximum element in the array.
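
The example can be traced with the sketch below. The function name crcw_max is hypothetical, and the sketch assumes the array splits evenly among the processors and that simultaneous writes are resolved with the maximum rule.

```python
def crcw_max(array, num_processors):
    """Trace the CRCW example; returns (shared_max, local_maxima)."""
    chunk = len(array) // num_processors    # assumes an even split
    shared = {"max": float("-inf")}         # initialization
    # Read phase: each processor finds the maximum of its own segment.
    local_maxima = [max(array[p * chunk:(p + 1) * chunk])
                    for p in range(num_processors)]
    # Concurrent write phase: all processors write in the same step, and the
    # maximum rule keeps the highest value among those being written.
    shared["max"] = max(local_maxima)
    return shared["max"], local_maxima
```

Calling crcw_max([3, 15, 8, 23, 7, 12, 19, 5], 4) returns (23, [15, 23, 12, 19]). Unlike the exclusive-write examples, the four writes here are resolved in a single step.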
Summary

The CRCW PRAM model:


Allows Concurrent Reads and Writes: This feature provides maximum
concurrency, enabling efficient parallel processing.
Requires Conflict Resolution for Writes: Since multiple processors can write
simultaneously, predefined rules are necessary to handle conflicts and ensure
consistent outcomes.
Practical Applications: CRCW PRAM is suitable for applications that require
maximum parallelism, such as finding the maximum or minimum values, summing
elements, or other reduction operations in large data sets.
Conclusion

CRCW PRAM represents the most powerful model in the PRAM family due to its
ability to handle both concurrent reads and writes.
It is useful for designing highly parallel algorithms, especially in scenarios where
maximum concurrency is required.
Understanding this model helps in implementing efficient parallel algorithms for
problems that involve shared data access, such as maximum finding, sorting, or
other computational tasks that can benefit from high levels of parallelism.
References
1. V. Rajaraman and C. S. R. Murthy, Parallel Computers: Architecture and Programming,
Prentice Hall of India.
