Par Seq Algorithms
• Theoretical:
– challenging problems
Efficient and optimal parallel algorithms
ASSUMPTION: in each time unit, each processor Pi can read one memory cell, perform
an internal computation, and write to another memory cell.
• PRAM machine
– time: time taken by the longest running processor
– hardware: maximum number of active processors
Two Technical Issues for PRAM
PRAM CREW
Membership problem
• p processors PRAM with n numbers (p ≤ n)
• Does x exist within the n numbers?
• Initially only P0 holds x, and at the end P0 must know the answer
Algorithm
step1: Inform everyone what x is
step2: Every processor checks at most ⌈n/p⌉ numbers and sets a flag
step3: Check if any of the flags are set to 1
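The three steps above can be sketched as a sequential simulation. This is only an illustration under stated assumptions: the function name pram_membership is hypothetical, and the slides do not say how step 3 is carried out, so the sketch assumes a tree-style OR, which needs no concurrent writes and takes about log2(p) rounds.

```python
import math

def pram_membership(numbers, x, p):
    """Sequentially simulate the three-step membership test with p processors (1 <= p <= n)."""
    n = len(numbers)
    block = math.ceil(n / p)              # each processor checks at most ceil(n/p) numbers

    # Step 1: every processor learns x (a single concurrent read on a CREW PRAM).
    local_x = [x] * p

    # Step 2: processor k scans its own block and sets its flag.
    flags = [0] * p
    for k in range(p):
        chunk = numbers[k * block:(k + 1) * block]
        flags[k] = 1 if local_x[k] in chunk else 0

    # Step 3 (assumed): OR the flags pairwise in a tree, ending in flags[0], owned by P0.
    rounds = 0
    stride = 1
    while stride < p:
        for k in range(0, p - stride, 2 * stride):
            flags[k] = flags[k] | flags[k + stride]
        stride *= 2
        rounds += 1
    return bool(flags[0]), rounds
```

With these assumptions the running time is about ⌈n/p⌉ steps for the scan plus log2(p) rounds for combining the flags.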
One more time about the PRAM model
• N synchronized processors
• Shared memory
– EREW, ERCW,
– CREW, CRCW
• Constant time
– access to the memory
– standard multiplication/addition
– Communication
(implemented via access to shared memory)
Two problems for PRAM
Problem 1. Min of n numbers
Problem 2. Computing the position of the first 1
in a sequence of 0’s and 1’s.
How fast can we compute with many processors, and how can we reduce
the number of processors?
Min of n numbers
Sequential algorithm
…
Cost = 1 × n
?
Sequential vs. Parallel
At least n − 1 comparisons must
be performed!
• NOWADAYS…
Give me a parallel machine with
enough processors and I will find the
smallest number in any giant set in
constant time!
Parallel solution 1
Min of n numbers
• Comparisons between numbers can be done independently
• The second part is to find the result using concurrent write mode
• For n numbers we have ~ n² pairs
[Figure: array M[1..n], initially all zeros, with positions i and j marked]
Algorithm A1
for each 1 ≤ i ≤ n do in parallel
    M[i]:=0
for each 1 ≤ i,j ≤ n do in parallel
    if i ≠ j and C[i] < C[j] then M[j]:=1
for each 1 ≤ i ≤ n do in parallel
    if M[i]=0 then output:=i
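A minimal sequential simulation of A1 may help to check the logic. It assumes a strict comparison C[i] < C[j], so that repeated minima all stay unmarked, and it uses 0-based indices; the name min_index_a1 is hypothetical. Each pass over a loop stands for one parallel step on a CRCW PRAM.

```python
def min_index_a1(c):
    """Simulate algorithm A1: return the (0-based) index of a minimum of c."""
    n = len(c)

    # Parallel step 1: n processors clear the marker array M.
    m = [0] * n

    # Parallel step 2: one virtual processor per ordered pair (i, j).
    # All writers to M[j] store the same value 1, so concurrent writes do not conflict.
    for i in range(n):
        for j in range(n):
            if i != j and c[i] < c[j]:
                m[j] = 1

    # Parallel step 3: every unmarked position holds a minimum.
    # If the minimum is repeated, several processors write; any one of them may win.
    output = None
    for i in range(n):
        if m[i] == 0:
            output = i
    return output
```

For example, for c = [7, 3, 9, 3, 5] positions 1 and 3 stay unmarked, and this sequential simulation returns the last of them.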
From n² processors to n^(1+1/2)
[Figure: A1 is applied to every block of the input in parallel, then once more to the block results; the combined algorithm A2 is then applied to blocks in the same way.]
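One way to read the processor reduction is the usual block decomposition, sketched below as an assumption rather than as the slides' exact construction: split the input into blocks of length about √n, run the all-pairs minimum on every block at once (√n blocks × n processors each, about n^(1+1/2) in total), then run it once more on the √n block minima. The names a1_min and two_level_min are hypothetical, and the input is assumed to be non-empty.

```python
import math

def a1_min(values):
    """All-pairs minimum returning the value; uses len(values)**2 virtual processors."""
    n = len(values)
    marked = [0] * n
    for i in range(n):
        for j in range(n):
            if values[i] < values[j]:
                marked[j] = 1
    return next(values[i] for i in range(n) if marked[i] == 0)

def two_level_min(values):
    """Return (minimum, processors needed) for the two-round block scheme."""
    n = len(values)
    block = max(1, math.isqrt(n))                      # block length about sqrt(n)
    blocks = [values[s:s + block] for s in range(0, n, block)]

    # Round 1: all blocks are processed in parallel, each with (block length)**2 processors.
    processors_round1 = sum(len(b) ** 2 for b in blocks)
    block_minima = [a1_min(b) for b in blocks]

    # Round 2: one more all-pairs minimum over the block minima.
    processors_round2 = len(block_minima) ** 2
    return a1_min(block_minima), max(processors_round1, processors_round2)
```

For n = 16 the first round needs 4 blocks × 16 processors = 64 = 16^(1+1/2) processors, and the second round needs only 16.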
Problem 2.
Computing the position of the first 1 in a sequence of 0’s and 1’s.
Algorithm A
(2 parallel steps and n² processors)
for each 1 ≤ i < j ≤ n do in parallel
    if C[i] = 1 and C[j] = 1 then C[j]:=0
for each 1 ≤ i ≤ n do in parallel
    if C[i] = 1 then FIRST-ONE-POSITION:=i
FIRST-ONE-POSITION(C)=4 for the input array
C=[0,0,0,1,0,0,0,1,1,1,0,0,0,1]
After the first parallel step C will contain a single element 1.
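The two parallel steps of Algorithm A can be checked with a short sequential simulation. This is a sketch: the name first_one_position_a is hypothetical, positions are reported 1-based as on the slide, and a snapshot of C models the fact that all processors read before any of them writes.

```python
def first_one_position_a(c):
    """Simulate Algorithm A: 1-based position of the first 1 in c, or None if there is none."""
    c = list(c)                      # work on a copy of the input
    n = len(c)

    # Parallel step 1: one virtual processor per pair i < j.
    # Every 1 that has a 1 somewhere to its left is erased.
    snapshot = list(c)               # synchronous PRAM step: reads happen before writes
    for i in range(n):
        for j in range(i + 1, n):
            if snapshot[i] == 1 and snapshot[j] == 1:
                c[j] = 0

    # Parallel step 2: at most one 1 is left, so at most one processor writes.
    position = None
    for i in range(n):
        if c[i] == 1:
            position = i + 1
    return position
```

On the slide's example C=[0,0,0,1,0,0,0,1,1,1,0,0,0,1] the simulation returns 4.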
Reducing number of processors
Algorithm B: reports whether there is any 1 in the table.
There-is-one:=0
for each 1 ≤ i ≤ n do in parallel
    if C[i] = 1 then There-is-one:=1
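Algorithm B is short enough to simulate directly. In the sketch below (the name there_is_one is hypothetical) the loop stands for a single parallel step with one processor per cell; all writers store the same value 1, so a concurrent-write PRAM handles it without conflict.

```python
def there_is_one(c):
    """Simulate Algorithm B: return 1 if c contains a 1, else 0."""
    flag = 0                      # There-is-one := 0
    for bit in c:                 # one virtual processor per cell, all in the same step
        if bit == 1:
            flag = 1              # concurrent writes of the same value
    return flag
```

The step uses n processors and constant time, compared with n² processors for Algorithm A.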
Now we can merge the two algorithms A and B
[Figure: B is applied to every block of the array in parallel; A is then applied to the block results.]
Complexity
• We apply algorithm A twice, each time
to an array of length √n,
which needs only (√n)² = n processors
• The time is O(1) and the number of processors is n.
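The merged algorithm can be sketched as follows, assuming blocks of length ⌈√n⌉: B summarises every block, A finds the first block containing a 1, and A is applied again inside that block. All function names are hypothetical and the code is a sequential simulation, not a parallel implementation.

```python
import math

def any_one(block):
    """Algorithm B on one block: 1 if the block contains a 1, else 0."""
    return 1 if any(bit == 1 for bit in block) else 0

def first_one_all_pairs(bits):
    """Algorithm A on a short array: 0-based index of the first 1, or None."""
    bits = list(bits)
    snapshot = list(bits)                     # reads happen before writes
    for i in range(len(bits)):
        for j in range(i + 1, len(bits)):
            if snapshot[i] == 1 and snapshot[j] == 1:
                bits[j] = 0
    for i, bit in enumerate(bits):
        if bit == 1:
            return i
    return None

def first_one_position(c):
    """1-based position of the first 1 in c using about n processors, or None."""
    n = len(c)
    if n == 0:
        return None
    block = math.isqrt(n - 1) + 1             # ceil(sqrt(n)) for n >= 1
    blocks = [c[s:s + block] for s in range(0, n, block)]
    summary = [any_one(b) for b in blocks]    # B on all blocks at once: n processors
    k = first_one_all_pairs(summary)          # A on about sqrt(n) flags: about n processors
    if k is None:
        return None
    offset = first_one_all_pairs(blocks[k])   # A inside one block: about n processors
    return k * block + offset + 1
```

Each of the three stages takes a constant number of parallel steps, which matches the O(1) time and n processors stated above.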
Tractable and intractable problems
for parallel computers
P (complexity)
• In computational complexity theory, P is the complexity class
containing decision problems which can be solved by a
deterministic Turing machine using a polynomial amount of
computation time, or polynomial time.