
Lecture Notes on Recursion

Hoa T. Vu*

1 Introduction
The main ideas behind recursive/divide-and-conquer algorithms are as follows.

• Suppose we have a problem of size n.

• We try to break it into one or more sub-problems of size much smaller than n and solve them recursively.

• Combine the solutions of the sub-problems to solve the original problem.

• When n is small enough (e.g., a small constant), the problem can be solved directly. This is called the base case.

2 A puzzle
Consider a 2^n by 2^n (n ≥ 1) chessboard with one missing square. The task is to tile this board with L-shaped dominoes (2 by 2 with one square removed) such that no two dominoes overlap and no domino extends beyond the board's boundary. See the figure below for an example.

Figure 1: An example of tiling an 8 by 8 board with 1 square missing using L-shaped dominoes.

How do we go about solving this problem? One approach is to start with the simplest case, which is 2 by 2. This case is trivial since we can orient the domino to avoid the missing square.
* San Diego State University, [email protected]

Consider the 4-by-4 case. Pause for a while and think about how you would reduce this to the
trivial 2-by-2 case.
You can divide the board into four quadrants of size 2-by-2. One of these quadrants has a missing
square and you know how to tile it. How about the other three? You can put an L-shaped tile in
the center such that it overlaps the other three quadrants. Now, you are left with tiling four 2-by-2
quadrants, each of which has one missing square. Thus, we know how to do this for all 4-by-4 cases.
Can you generalize this to 8-by-8, 16-by-16, and so on?
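To make the recursion concrete, here is a short Python sketch of this tiling strategy. The representation is my own: the board is a list of lists, the hole is given by its coordinates, and each L-shaped piece is stamped with a distinct integer label.

from itertools import count

def tile(board, top, left, size, hole_r, hole_c, labels=None):
    # Tile a size-by-size sub-board (size a power of 2) whose single
    # uncoverable square is at (hole_r, hole_c).
    if labels is None:
        labels = count(1)
    if size == 1:
        return                                   # base case: nothing to tile
    half = size // 2
    piece = next(labels)                         # label of the center L-piece
    for dr, dc in ((0, 0), (0, 1), (1, 0), (1, 1)):
        r0, c0 = top + dr * half, left + dc * half
        if r0 <= hole_r < r0 + half and c0 <= hole_c < c0 + half:
            r, c = hole_r, hole_c                # this quadrant has the hole
        else:
            # Cover this quadrant's center-facing corner with the L-piece;
            # that corner becomes the quadrant's own "missing" square.
            r, c = top + half - 1 + dr, left + half - 1 + dc
            board[r][c] = piece
        tile(board, r0, c0, half, r, c, labels)

board = [[0] * 8 for _ in range(8)]              # an 8-by-8 board
tile(board, 0, 0, 8, 0, 3)                       # square (0, 3) is missing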

3 Some easy examples


3.1 Binary Search
Let us consider the binary search algorithm. The input is a sorted array of numbers A[1 . . . n] where
A[1] ≤ A[2] ≤ . . . ≤ A[n] and a number x. We want to output YES if x is in A and NO otherwise.
The idea is simple: Look at the middle entry A[n/2]. If x ≤ A[n/2], then if x ∈ A, it must be in
the left half A[1 . . . n/2]. Otherwise, if it is in A, it must be in the right half A[n/2 + 1 . . . n]. So we
can recursively search either the left half or the right half accordingly. Note that the size of the new
sub-problem is now half of the size of the original problem, i.e., n/2 vs n.
Algorithm 1: Binary search
Function BinarySearch(i, j):    /* check if x is in A[i . . . j] */
    if i = j then
        /* Base case */
        Return YES if A[i] = x. Otherwise, return NO.
    else
        Let m = ⌊(i + j)/2⌋.
        if x ≤ A[m] then
            Return BinarySearch(i, m).
        else
            Return BinarySearch(m + 1, j).
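For concreteness, here is a direct Python rendering of Algorithm 1 (a sketch; the function and variable names are mine):

def binary_search(A, x):
    # Return True iff x occurs in the sorted list A, following Algorithm 1.
    def rec(i, j):                    # does x occur in A[i..j]?
        if i == j:                    # base case: one candidate left
            return A[i] == x
        m = (i + j) // 2
        if x <= A[m]:
            return rec(i, m)          # if x is in A, it is in the left half
        return rec(m + 1, j)          # otherwise, the right half
    return len(A) > 0 and rec(0, len(A) - 1)

print(binary_search([1, 3, 5, 7, 9], 7))   # True
print(binary_search([1, 3, 5, 7, 9], 4))   # False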
What is the running time? An informal analysis is the following.

• We start with a search range of size n. The search range shrinks by half at each recursion level.

• In each recursion level, there is constant (i.e., O(1)) non-recursive work.

• How many levels until we reach the base case (search range is 1)? Note that the search range at recursion level i is n/2^i. Solving n/2^i = 1 gives i = log n.

• So the running time is (number of recursion levels) × (amount of non-recursive work per level) = O(log n) × O(1) = O(log n).

3.2 Merge Sort


We now look at merge sort, which runs in O(n log n) time. First, let us look at the merge procedure that merges two sorted arrays A and B.

Algorithm 2: Merge two sorted arrays
Function Merge(A[1 . . . n], B[1 . . . m]):
    j = j1 = j2 = 1.
    A[n + 1] = B[m + 1] = ∞.    /* sentinels */
    while j1 ≤ n or j2 ≤ m do
        if A[j1] ≤ B[j2] then
            C[j] = A[j1]
            j1 = j1 + 1; j = j + 1.
        else
            C[j] = B[j2]
            j2 = j2 + 1; j = j + 1.
    end while
    Return C.
The merge procedure above is correct because in each iteration, the smallest element among
A[j1 . . . n] and B[j2 . . . m] will be appended to the end of C.
The running time is O(n + m) because each iteration increases j1 or j2 by 1, and we stop once j1 > n and j2 > m. The running time per iteration is O(1). Thus, the total running time is O(n + m).
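A Python version of the merge procedure might look as follows (a sketch; math.inf plays the role of the ∞ sentinels):

import math

def merge(A, B):
    # Merge two sorted lists into one sorted list in O(n + m) time.
    A, B = A + [math.inf], B + [math.inf]   # sentinels, as in Algorithm 2
    j1 = j2 = 0
    C = []
    while j1 < len(A) - 1 or j2 < len(B) - 1:
        if A[j1] <= B[j2]:                  # smallest remaining element
            C.append(A[j1]); j1 += 1
        else:
            C.append(B[j2]); j2 += 1
    return C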
Now, we are ready to describe the divide-and-conquer merge sort. Suppose we want to sort
A[1 . . . n]. The idea is that we recursively sort A[1 . . . ⌊n/2⌋] and A[⌊n/2⌋ + 1 . . . n] separately and
then merge them.
Algorithm 3: Merge sort
Function MergeSort(A[1 . . . n]):
    if n ≤ 1 then
        Return A.
    else
        L = MergeSort(A[1 . . . ⌊n/2⌋]).
        R = MergeSort(A[⌊n/2⌋ + 1 . . . n]).
        C = Merge(L, R).
        Return C.
The correctness is pretty clear. We know that L and R are the left and right halves sorted.
Hence, we can merge them together to form the sorted version of A.
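Using the merge function from the previous sketch, merge sort is a few lines of Python:

def merge_sort(A):
    # Sort A in O(n log n) time, following Algorithm 3.
    if len(A) <= 1:
        return A                       # base case
    mid = len(A) // 2
    L = merge_sort(A[:mid])            # sort the left half
    R = merge_sort(A[mid:])            # sort the right half
    return merge(L, R)                 # combine in O(n) time

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]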
Let us analyze the running time. Each call on an input of size n does O(n) non-recursive work and makes two recursive calls on sub-problems of half the size. Hence,

T (n) = 2T (n/2) + O(n).

We will show that this is O(n log n).

4 Solving recurrences
4.1 Recursion tree
In a recursion tree, each node corresponds to one call of the algorithm. The running time is the total amount of non-recursive work over all nodes in the tree.

Example 1: Each level does the same amount of non-recursive work. For example,
consider the formula for merge sort: T (n) = 2T (n/2) + O(n) ≤ 2T (n/2) + cn for some constant c.
The non-recursive work for a node of size n is O(n).

From the tree, it is easy to see that at each recursion level, the amount of non-recursive work is
cn. There are at most log n levels until we reach the base case. Hence, the total running time is
O(n log n).
This is an example where each level contributes the same amount of work.

Example 2: The non-recursive work done at each level decreases exponentially. For example, T(n) = T(n/10) + T(n/5) + O(n²) ≤ T(n/10) + T(n/5) + cn² for some constant c.

It is easy to check that the non-recursive work done at level i is c(1/20)^i n²: a node of size s does cs² non-recursive work, and the sum of the squared sub-problem sizes shrinks by a factor of (1/10)² + (1/5)² = 1/20 from one level to the next. Hence, the total running time is at most

    Σ_{i=0}^{log₅ n} c(1/20)^i n² ≤ Σ_{i=0}^{∞} c(1/20)^i n² = cn² · 1/(1 − 1/20) = O(n²).

Here we recall the fact that for r < 1, we have Σ_{i=0}^{∞} r^i = 1/(1 − r).

Example 3: The non-recursive work done at each level increases exponentially. This is a slightly trickier case, but the rule of thumb is that the work done at the deepest level dominates the work done at all previous levels. For example, consider

T (n) = 7T (n/2) + n2 .

Let’s draw the recursion tree.

The work done at the i-th level is (7/4)^i n². There are at most log₂ n levels. Note that the work at the last level dominates the rest, so set i = log₂ n to get the running time:

    O(n² · (7/4)^(log₂ n)) = O(n² · n^(log₂(7/4))) = O(n^(log₂ 7)) ≈ O(n^2.807).

4.2 Master theorem


Theorem 1. Suppose the recurrence has the form

    T(n) = aT(n/b) + O(n^d)

for constants a > 0, b > 1, and d ≥ 0. Then

    T(n) = O(n^d)           if d > log_b a,
    T(n) = O(n^d · log n)   if d = log_b a,
    T(n) = O(n^(log_b a))   if d < log_b a.

The proof is a formalization of the above examples. Read section 2.2 in the book for a proof.

Exercise: Solve some of the previous recurrences using the Master theorem:

• Binary search: T(n) = T(n/2) + O(1),

• Merge sort: T(n) = 2T(n/2) + O(n),

• Strassen's algorithm (which we will cover later): T(n) = 7T(n/2) + n².

However, the Master theorem cannot be used to solve a recurrence like T(n) = T(n/10) + T(n/5) + O(n²), since it is not of the form required by the theorem. In such cases we rely on the recursion tree method.
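As a quick sanity check (a throwaway helper, not part of the notes), a few lines of Python can evaluate which case of Theorem 1 applies:

import math

def master(a, b, d):
    # Asymptotic solution of T(n) = a*T(n/b) + O(n^d), per Theorem 1.
    crit = math.log(a, b)                 # the critical exponent log_b a
    if math.isclose(d, crit):
        return f"O(n^{d} log n)"
    if d > crit:
        return f"O(n^{d})"
    return f"O(n^{crit:.3f})"

print(master(1, 2, 0))   # binary search: O(n^0 log n) = O(log n)
print(master(2, 2, 1))   # merge sort:    O(n^1 log n) = O(n log n)
print(master(7, 2, 2))   # Strassen:      O(n^2.807)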

5 Finding the majority


Consider an array A[1 . . . n] of elements that are not necessarily numbers, but any two elements can be compared to tell whether they are the same. An element is the majority if it occurs more than n/2 times in A. We want to return the majority element if there is one.
A naive algorithm would be for each A[i], scan through A to count its occurrences. Obviously,
we will find the majority if there is one. The running time would be O(n2 ).
Consider the following divide-and-conquer algorithm. If A has an odd number of elements, we check whether A[1] is the majority; if so, we return A[1]. If not, we remove A[1]: since it is not the majority, this does not change the answer. So now A has an even number of elements.
Let L = A[1 . . . n/2] and R = A[n/2 + 1 . . . n]. If A has a majority, then that element must be the majority of either L or R. Proof: suppose that A has a majority element z but z is neither the majority of L nor of R. Then z occurs at most n/4 times in L and at most n/4 times in R. Hence, z occurs at most n/2 times in A, which means z is not the majority of A, a contradiction. So we can recursively find the majority candidates of L and R, called x and y, and then check whether x or y is the majority of A.
Algorithm 4: Find majority
Function Majority(A[1 . . . n]):
    /* take care of the case where A has an odd number of elements */
    if |A| is odd then
        Count the occurrences of A[1] in A. This takes O(n) time.
        if count > n/2 then
            return A[1]
        else
            A = A[2 . . . n]
    L = A[1 . . . n/2]
    R = A[n/2 + 1 . . . n]
    x = Majority(L); y = Majority(R)
    Count x in A and count y in A (takes O(n) time). If x (or y) is the majority of A, return x (or y). Otherwise, return "no majority".
The running time is described by the recurrence T (n) = 2T (n/2) + O(n) which is O(n log n) by
the Master theorem.
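A Python sketch of Algorithm 4 (names are mine; None plays the role of "no majority"):

def majority(A):
    # Return the majority element of A, or None if there is none.
    if len(A) % 2 == 1:                  # odd length: test and drop A[1]
        if A.count(A[0]) > len(A) / 2:
            return A[0]
        A = A[1:]
    if not A:
        return None
    mid = len(A) // 2
    x, y = majority(A[:mid]), majority(A[mid:])
    for cand in (x, y):                  # the majority of A, if any, is x or y
        if cand is not None and A.count(cand) > len(A) / 2:
            return cand
    return None

print(majority([3, 1, 3, 3, 2, 3, 3]))   # 3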

An even faster algorithm. We modify function Majority(A[1 . . . n]).

• Suppose |A| is odd: check whether A[1] is the majority; if so, return A[1]. Otherwise, throw away A[1] as before. This also covers the base case n = 1.

• Pair up (A[1], A[2]), (A[3], A[4]), . . . , (A[n − 1], A[n]). For each pair, if the two elements are the same, keep one copy; otherwise, discard both. Let B be the resulting list.

• Recurse: x ← Majority(B). Then count the number of occurrences of x in A and return x if the number of occurrences is more than n/2. Otherwise, return "no majority".

Why is this correct? Let m be the majority element if there is one. Let x be the number of pairs with two m's, y the number of pairs with exactly one m, z the number of pairs with two different elements, neither of which is m, and w the number of pairs with two equal elements that are not m.

The total number of elements is n. Hence,

    2x + 2y + 2z + 2w = n.

We also know that 2x + y > n/2, since m must occur more than n/2 times to be the majority element. Suppose w ≥ x, so that 2w + y ≥ 2x + y > n/2. Then

    2x + 2y + 2z + 2w ≥ (2x + y) + (2w + y) + 2z > n/2 + n/2 + 2z = n + 2z ≥ n,

which is a contradiction. Hence, w < x. The new list B has w + x elements, and m appears x times in B with x > w. Hence, m is still the majority element of B. Therefore, the algorithm is correct.

This suggests the following algorithm:


Algorithm 5: Find majority
Function Majority(A[1 . . . n]):
    /* take care of the case where A has an odd number of elements */
    if |A| is odd then
        Count the occurrences of A[1] in A. This takes O(n) time.
        if count > n/2 then
            return A[1]
        else
            A = A[2 . . . n]
    B = empty list
    for i = 1, 2, 3, . . . , n/2 do
        if A[2i] = A[2i − 1] then add a copy of A[2i] to B
    x = Majority(B)
    Count x in A. If x is the majority of A, return x. Otherwise, return "no majority".
The running time is given by the recurrence T(n) = T(n/2) + O(n), which is O(n) by the Master theorem.
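In Python, the pairing version might look like this (again a sketch, with None for "no majority"):

def majority_fast(A):
    # O(n) majority via pairing, following Algorithm 5.
    if len(A) % 2 == 1:                  # odd length: test and drop A[1]
        if A.count(A[0]) > len(A) / 2:
            return A[0]
        A = A[1:]
    if not A:
        return None
    # Keep one copy of each agreeing pair; discard disagreeing pairs.
    B = [A[2 * i] for i in range(len(A) // 2) if A[2 * i] == A[2 * i + 1]]
    x = majority_fast(B)
    if x is not None and A.count(x) > len(A) / 2:
        return x
    return None

print(majority_fast([1, 2, 1, 2, 1, 1]))   # 1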

6 Strassen’s algorithm
Given two n-by-n matrices X and Y, the (i, j) entry (where i, j ∈ {1, . . . , n}) of the product matrix Z = XY is defined as

    Z_ij = Σ_{k=1}^{n} X_ik · Y_kj.

More generally, we can multiply an a-by-b matrix with a b-by-c matrix to get an a-by-c matrix.

Naive algorithm. What is the running time of the naive algorithm, which computes each entry of Z using the above formula? There are n² entries to compute, each of which takes O(n) time. Hence, the running time is O(n³).
We now describe a more clever algorithm by Strassen. The first observation (which is not hard to prove, but we will just assume) is that matrix multiplication can be done blockwise:

     
    [ A  B ]   [ E  F ]   [ AE + BG   AF + BH ]
    [ C  D ] · [ G  H ] = [ CE + DG   CF + DH ]

Check that this is true for the following example, where we use blocks of size 2:

   
    [ 1 2 3 2 ]   [ 1 0 0 1 ]
    [ 5 2 1 0 ]   [ 1 1 2 1 ]
    [ 1 1 1 0 ] · [ 1 2 3 4 ] = . . .
    [ 2 3 5 7 ]   [ 5 6 7 8 ]
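One can also check blockwise multiplication numerically, e.g. with NumPy (a throwaway verification, not part of the notes):

import numpy as np

X = np.array([[1, 2, 3, 2], [5, 2, 1, 0], [1, 1, 1, 0], [2, 3, 5, 7]])
Y = np.array([[1, 0, 0, 1], [1, 1, 2, 1], [1, 2, 3, 4], [5, 6, 7, 8]])
A, B, C, D = X[:2, :2], X[:2, 2:], X[2:, :2], X[2:, 2:]   # blocks of X
E, F, G, H = Y[:2, :2], Y[:2, 2:], Y[2:, :2], Y[2:, 2:]   # blocks of Y
blockwise = np.block([[A @ E + B @ G, A @ F + B @ H],
                      [C @ E + D @ G, C @ F + D @ H]])
assert (blockwise == X @ Y).all()        # matches the direct product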

Let T(n) be the time to multiply two n-by-n matrices. Note that T(n) = Ω(n²) because just reading the input takes Ω(n²) time. The above approach yields the recurrence T(n) = 8T(n/2) + O(n²). This is because:

• We need to compute 8 different n/2-by-n/2 matrix multiplications.

• Adding them up (e.g., AE + BG) takes O(n²) time.

Applying the Master theorem, this gives us O(n³), which is no better than the trivial algorithm.
However, consider the following 7 n/2-by-n/2 matrix multiplications (step 1):

• P1 = A(F − H). Note that F − H can be computed in O(n²) time since we need to do (n/2)² = O(n²) subtractions. So P1 = A(F − H) can be computed in T(n/2) + O(n²) time. Similarly,

• P2 = (A + B)H.

• P3 = (C + D)E.

• P4 = D(G − E).

• P5 = (A + D)(E + H).

• P6 = (B − D)(G + H).

• P7 = (A − C)(E + F).
Recall that the output matrix looks like this:

    [ AE + BG   AF + BH ]
    [ CE + DG   CF + DH ]

Now, compute the following (step 2):

• AE + BG = P5 + P4 − P2 + P6. Note that computing P5 + P4 − P2 + P6 takes O(n²) time given that we have computed P1, . . . , P7.

• AF + BH = P1 + P2.

• CE + DG = P3 + P4.

• CF + DH = P1 + P5 − P3 − P7.

Alright, the new recurrence is

    T(n) = 7T(n/2) + O(n²),

where the 7T(n/2) term comes from step 1 and the O(n²) term from step 2. By the Master theorem, this leads to T(n) = O(n^(log₂ 7)) ≈ O(n^2.81).
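A compact NumPy sketch of Strassen's algorithm for n a power of 2 follows; the cutoff below which it falls back to the library product is my own practical tweak, not part of the notes.

import numpy as np

def strassen(X, Y, cutoff=64):
    # Multiply two n-by-n matrices (n a power of 2) in O(n^log2(7)) time.
    n = X.shape[0]
    if n <= cutoff:
        return X @ Y                       # small case: naive product
    h = n // 2
    A, B, C, D = X[:h, :h], X[:h, h:], X[h:, :h], X[h:, h:]
    E, F, G, H = Y[:h, :h], Y[:h, h:], Y[h:, :h], Y[h:, h:]
    P1 = strassen(A, F - H, cutoff)        # step 1: 7 recursive products
    P2 = strassen(A + B, H, cutoff)
    P3 = strassen(C + D, E, cutoff)
    P4 = strassen(D, G - E, cutoff)
    P5 = strassen(A + D, E + H, cutoff)
    P6 = strassen(B - D, G + H, cutoff)
    P7 = strassen(A - C, E + F, cutoff)
    # Step 2: O(n^2) additions assemble the four output blocks.
    return np.block([[P5 + P4 - P2 + P6, P1 + P2],
                     [P3 + P4, P5 + P1 - P3 - P7]])

X = np.random.randint(0, 10, (128, 128))
Y = np.random.randint(0, 10, (128, 128))
assert (strassen(X, Y) == X @ Y).all()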

7 Linear time selection


Suppose we want to find the k-th smallest element (or k-th rank element for short) in an array A[1 . . . n]. Let us denote the algorithm by Select(A[1 . . . n], k).
A reasonable approach is to sort A and output A[k], which takes O(n log n) time. Here we aim for an O(n) time algorithm. The idea is as follows:

• Pick some pivot x and partition A into two parts: L, which contains all entries that are at most x, and R, which contains all entries that are larger than x.

  Exercise: Show that Partition(A, x) can be done in O(n) time.

• If k ≤ size(L), we recursively find the k-th smallest in L: Select(L, k).

• If k > size(L), we recursively find the (k − size(L))-th smallest element in R: Select(R, k − size(L)).

• Base case: if n = 1, return A[1].

Now, the main difficulty is to pick a good pivot such that L and R are balanced in size (i.e., neither of them is too large). We do the following:

• Divide A into ⌈n/5⌉ blocks of 5: A[1 . . . 5], A[6 . . . 10], . . . , where block i is A[5i − 4 . . . 5i]. If the last block is not full, just throw in some ∞'s.

• For each block i, i.e., A[5i − 4 . . . 5i], find its median m_i. Let M = [m_1, m_2, . . . , m_⌈n/5⌉]. Finding the median of each block takes O(1) time and there are ⌈n/5⌉ blocks, so the running time to compute M is O(n).

• We use the median of M as the pivot. But how do we find the median of M? We recursively call Select(M, ⌊|M|/2⌋).

The overall algorithm is as follows:

Algorithm 6: Quick Select
1  Function Select(A[1 . . . n], k):
2      if n = 1 then
3          Return A[1]
4      Divide A into ⌈n/5⌉ blocks of 5: A[1 . . . 5], A[6 . . . 10], . . . , where block i is A[5i − 4 . . . 5i]. If the last block is not full, just throw in some ∞'s. This takes O(n) time.
5      For each block i, i.e., A[5i − 4 . . . 5i], find its median m_i. Let M = [m_1, m_2, . . . , m_⌈n/5⌉]. This takes O(n) time.
6      x = Select(M, ⌊|M|/2⌋)
7      L, R = Partition(A, x)
8      if k ≤ size(L) then
9          Return Select(L, k)
10     else
11         Return Select(R, k − size(L))
We argued correctness above. But why does this algorithm run in O(n) time? Let T(n) be the running time of Select on an input array of size n. Let us compute T(n).

• There are ⌈n/5⌉ block medians. Hence, Select(M, ⌊|M|/2⌋) takes T(n/5) time.

• Each block median is larger than 2 other elements in its block, and the pivot x is greater than or equal to about half of the block medians. Hence, x is larger than at least 3⌊⌈n/5⌉/2⌋ ≈ 3n/10 elements in A. Thus, size(L) ≥ 3n/10, so size(R) = n − size(L) ≤ 7n/10. Symmetrically, size(L) ≤ 7n/10.

We have the following recurrence:

    T(n) = T(7n/10) + T(n/5) + O(n),

where T(7n/10) accounts for the recursive call on line 9 or line 11, T(n/5) for the call on line 6, and O(n) for the non-recursive work.

Exercise: Use the recursion tree method to show that at level i of the recursion tree, the non-recursive work done is (9/10)^i cn, where c is some constant. Hence, the running time is at most

    cn · Σ_{i=0}^{∞} (9/10)^i = cn/(1 − 9/10) = O(n).
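Finally, a Python sketch of the whole selection algorithm. Two small deviations from the pseudocode, for safety: the base case sorts arrays of size at most 5 directly, and the partition is three-way so that repeated elements cannot cause infinite recursion.

import math

def select(A, k):
    # Return the k-th smallest element of A (1-indexed) in O(n) time.
    if len(A) <= 5:
        return sorted(A)[k - 1]            # small base case: sort directly
    padded = A + [math.inf] * (-len(A) % 5)          # pad the last block
    M = [sorted(padded[i:i + 5])[2] for i in range(0, len(padded), 5)]
    x = select(M, (len(M) + 1) // 2)       # pivot: median of block medians
    L = [a for a in A if a < x]            # three-way Partition(A, x)
    E = [a for a in A if a == x]
    R = [a for a in A if a > x]
    if k <= len(L):
        return select(L, k)                # the answer lies in L
    if k <= len(L) + len(E):
        return x                           # the answer is the pivot itself
    return select(R, k - len(L) - len(E))  # the answer lies in R

print(select([9, 1, 7, 3, 5, 8, 2, 6, 4, 0], 4))   # 3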
