
Lecture 2 - Sorting and Asymptotic Analysis

The document discusses algorithm analysis, focusing on the order of growth, running time, and the use of asymptotic notations (O, Ω, Θ) to describe algorithm efficiency. It covers methods for analyzing code, including conditional statements and nested loops, and provides examples of typical running time functions. Additionally, it explains mathematical induction and common summation formulas used in algorithm analysis.


Lecture 2

Revision: Order of Growth + Analysis and Correctness of Algorithms
Algorithm Analysis
• The amount of resources used by the algorithm
• Space
• Computational time
• Running time:
• The number of primitive operations (steps) executed
before termination
• Order of growth
• The leading term of a formula
• Expresses the behavior of a function toward infinity

2
Runtime Analysis
• In this course, we will typically use T(n) to refer to
worst-case running time, unless otherwise noted
• It is difficult to determine T(n) experimentally
• Too many possible inputs
• Don’t know a priori which input leads to worst-case
behavior
• We will instead determine T(n) theoretically, by analyzing/counting the number of primitive operations in the algorithm's pseudocode

3
Order of growth
• Alg.: MIN(a[1], …, a[n])
    m ← a[1]
    for i ← 2 to n
      if a[i] < m
        then m ← a[i]
• Running time:
  T(n) = 1 [first step] + n [for loop] + (n − 1) [if condition] + (n − 1) [the assignment in then] = 3n − 1
• Order (rate) of growth:
• The leading term of the formula
• Gives a simple characterization of the algorithm’s efficiency
• Expresses the asymptotic behavior of the algorithm
• T(n) grows like n

4
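A quick sanity check of the 3n − 1 count (an illustrative Python sketch, not part of the original slides; the step charges mirror the slide's accounting):

def min_with_count(a):
    steps = 1                      # m <- a[1]
    m = a[0]
    for x in a[1:]:
        steps += 2                 # one loop test + one comparison a[i] < m
        if x < m:
            steps += 1             # the assignment in the then-branch
            m = x
    steps += 1                     # final loop test that exits the for loop
    return m, steps

a = list(range(10, 0, -1))         # strictly decreasing: worst case, every if fires
m, steps = min_with_count(a)
print(m, steps, 3 * len(a) - 1)    # 1 29 29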
Analyzing Code
• Conditional statements
if C then S1 else S2

time ≤ time(C) + max(time(S1), time(S2))

• Loops
for i ← a1 to a2 do S

time ≤ Σ time(Sᵢ), summed over all iterations i

5
Analyzing Code
• Nested loops
for i = 1 to n do
  for j = 1 to n do
    sum = sum + 1

T(n) = Σ_{i=1}^{n} Σ_{j=1}^{n} 1 = n · n = n²

(number of iterations × time per iteration)

6
Analyzing Code
• Nested loops
for i = 1 to n do
  for j = i to n do
    sum = sum + 1

T(n) = Σ_{i=1}^{n} Σ_{j=i}^{n} 1 = Σ_{i=1}^{n} (n − i + 1) = Σ_{i=1}^{n} (n + 1) − Σ_{i=1}^{n} i
     = n(n + 1) − n(n + 1)/2 = n(n + 1)/2 = Θ(n²)
7
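Both counts are easy to confirm empirically; a small Python sketch (not from the slides):

def count_full(n):           # for i = 1..n: for j = 1..n
    return sum(1 for i in range(1, n + 1) for j in range(1, n + 1))

def count_triangular(n):     # for i = 1..n: for j = i..n
    return sum(1 for i in range(1, n + 1) for j in range(i, n + 1))

n = 100
assert count_full(n) == n * n                      # n^2 iterations
assert count_triangular(n) == n * (n + 1) // 2     # n(n+1)/2 iterations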
Typical Running Time
Functions
• 1 (constant running time):
• Instructions are executed once or a few times

• logN (logarithmic)
• A big problem is solved by cutting the original problem in smaller sizes, by a
constant fraction at each step

• N (linear)
• A small amount of processing is done on each input element

• N log N (log-linear)
• A problem is solved by dividing it into smaller problems, solving them
independently and combining the solution
8
Typical Running Time
Functions
• N^(1+c), where c is a constant satisfying 0 < c < 1 (super-linear)

• N2 (quadratic)
• Typical for algorithms that process all pairs of data items (double nested loops)

• N3 (cubic)
• Processing of triples of data (triple nested loops)

• N^k (polynomial)

• k^N (exponential)
• Few exponential algorithms are appropriate for practical use

9
Logarithms
• In algorithm analysis we often use the notation “log n”
without specifying the base

Binary logarithm:    lg n = log₂ n
Natural logarithm:   ln n = logₑ n
Exponent notation:   lg^k n = (lg n)^k,  lg lg n = lg(lg n)

Useful identities:
  log x^y = y log x
  log(xy) = log x + log y
  log(x/y) = log x − log y
  logₐ x = logₐ b · log_b x
  a^(log_b x) = x^(log_b a)

10
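The identities are easy to spot-check numerically; an illustrative Python sketch (not from the slides), with arbitrary test values:

import math

x, y, a, b = 3.7, 2.5, 2.0, 10.0                   # arbitrary test values
assert math.isclose(math.log(x ** y), y * math.log(x))
assert math.isclose(math.log(x * y), math.log(x) + math.log(y))
assert math.isclose(math.log(x / y), math.log(x) - math.log(y))
assert math.isclose(math.log(x, a), math.log(b, a) * math.log(x, b))
assert math.isclose(a ** math.log(x, b), x ** math.log(a, b))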
Some Common Functions

• Floors and ceilings

  – ⌊x⌋: the greatest integer less than or equal to x

  – ⌈x⌉: the least integer greater than or equal to x

  – x − 1 < ⌊x⌋ ≤ x ≤ ⌈x⌉ < x + 1

11
Some Simple Summation
Formulas
• Arithmetic series: Σ_{k=1}^{n} k = 1 + 2 + … + n = n(n + 1)/2

• Geometric series: Σ_{k=0}^{n} x^k = 1 + x + x² + … + x^n = (x^{n+1} − 1)/(x − 1)   (x ≠ 1)

  • Special case, |x| < 1: Σ_{k=0}^{∞} x^k = 1/(1 − x)

• Harmonic series: Σ_{k=1}^{n} 1/k = 1 + 1/2 + … + 1/n ≈ ln n

• Other important formulae:
  Σ_{k=1}^{n} lg k ≈ n lg n
  Σ_{k=1}^{n} k^p = 1^p + 2^p + … + n^p ≈ n^{p+1}/(p + 1)

12
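A numeric spot-check of these formulas (an illustrative Python sketch, not from the slides; the last three are asymptotic approximations, so only rough agreement is expected):

import math

n, x, p = 1000, 0.5, 2
assert sum(range(1, n + 1)) == n * (n + 1) // 2                  # arithmetic
geo = sum(x ** k for k in range(n + 1))
assert math.isclose(geo, (x ** (n + 1) - 1) / (x - 1))           # geometric
assert math.isclose(geo, 1 / (1 - x))                            # |x| < 1, large n
print(sum(1 / k for k in range(1, n + 1)), math.log(n))          # harmonic vs ln n
print(sum(math.log2(k) for k in range(1, n + 1)), n * math.log2(n))
print(sum(k ** p for k in range(1, n + 1)), n ** (p + 1) / (p + 1))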
Mathematical Induction
• Used to prove a sequence of statements (S(1), S(2), … S(n))

indexed by positive integers

• Proof:

• Basis step: prove that the statement is true for n = 1

• Inductive step: assume that S(n) is true and prove that S(n+1) is true

for all n ≥ 1

• The key to mathematical induction is to find case n “within” case

n+1
13
Example
• Prove that: 2n + 1 ≤ 2^n for all n ≥ 3
• Basis step:
  • n = 3: 2·3 + 1 ≤ 2³ ⟺ 7 ≤ 8 TRUE
• Inductive step:
  • Assume the inequality is true for n, and prove it for (n + 1):
    Given 2n + 1 ≤ 2^n, must prove: 2(n + 1) + 1 ≤ 2^{n+1}
    2(n + 1) + 1 = (2n + 1) + 2 ≤ 2^n + 2   (by the induction hypothesis)
                 ≤ 2^n + 2^n = 2^{n+1}      (since 2 ≤ 2^n for n ≥ 1)
14
Another Example
• Prove that: Σ_{i=1}^{n} i = n(n + 1)/2 for all n ≥ 1

• Basis step:
  • n = 1: Σ_{i=1}^{1} i = 1 = 1(1 + 1)/2

• Inductive step:
  • Assume the equality is true for n, and prove it is true for (n + 1):
    Assume Σ_{i=1}^{n} i = n(n + 1)/2 and prove Σ_{i=1}^{n+1} i = (n + 1)(n + 2)/2

    Σ_{i=1}^{n+1} i = Σ_{i=1}^{n} i + (n + 1) = n(n + 1)/2 + (n + 1) = (n + 1)(n + 2)/2
15
Asymptotic Notations

• A way to describe behavior of functions in the limit


• How we indicate running times of algorithms

• Describe the running time of an algorithm as n grows to ∞

• O notation: asymptotic "less than": f(n) "≤" g(n)

• Ω notation: asymptotic "greater than": f(n) "≥" g(n)

• Θ notation: asymptotic "equality": f(n) "=" g(n)


16
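For reference, the formal definitions behind these intuitions (standard definitions, added here since the slides give only the intuitive reading; they are used implicitly in the examples that follow):

f(n) = O(g(n)) ⟺ ∃ constants c > 0, n₀ > 0 such that 0 ≤ f(n) ≤ c·g(n) for all n ≥ n₀
f(n) = Ω(g(n)) ⟺ ∃ constants c > 0, n₀ > 0 such that 0 ≤ c·g(n) ≤ f(n) for all n ≥ n₀
f(n) = Θ(g(n)) ⟺ ∃ constants c₁, c₂ > 0, n₀ > 0 such that 0 ≤ c₁·g(n) ≤ f(n) ≤ c₂·g(n) for all n ≥ n₀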
Asymptotic notations
• O-notation

• Intuitively: O(g(n)) = the set of functions with a smaller or

same order of growth as g(n)

• An implicit assumption in order notation is that f(n) is an asymptotically nonnegative function, i.e.,

  f(n) ≥ 0 for all sufficiently large n

17
Examples
• 2n² = O(n³): 2n² ≤ cn³ ⟹ 2 ≤ cn ⟹ possibly c = 1 and n₀ = 2

• n² = O(n²): n² ≤ cn² ⟹ c ≥ 1 ⟹ possibly c = 1 and n₀ = 1

• 1000n² + 1000n = O(n²): 1000n² + 1000n ≤ 1000n² + n · n = 1001n² for n ≥ 1000 ⟹ possibly c = 1001 and n₀ = 1000

• n = O(n²): n ≤ cn² ⟹ cn ≥ 1 ⟹ possibly c = 1 and n₀ = 1
18
Examples

• E.g.: prove that n² ≠ O(n)

• Assume ∃ c and n₀ such that for all n ≥ n₀: n² ≤ cn

• Choose n = max(n₀, c) + 1, so that n > c

• Then n² = n · n > n · c = cn ⟹ n² > cn

  contradiction!!!
19
Comments
• Our textbook uses “asymptotic notation” and
“order notation” interchangeably
• Order notation can be interpreted in terms of set
membership, e.g.,
• T(n) = O(f(n)) is equivalent to T(n)  O(f(n))
• O(f(n)) is said to be “the set of all functions for which
f(n) is an upper bound”

20
Asymptotic notations (cont.)
• Ω-notation

• Intuitively: Ω(g(n)) = the set of functions with a larger or same order of growth as g(n)

21
Examples
• 5n² = Ω(n)
  ∃ c, n₀ such that: 0 ≤ cn ≤ 5n² ⟹ possibly c = 1 and n₀ = 1

• 100n + 5 ≠ Ω(n²)
  Suppose ∃ c, n₀ such that: 0 ≤ cn² ≤ 100n + 5
  100n + 5 ≤ 100n + 5n (for n ≥ 1) = 105n
  ⟹ cn² ≤ 105n ⟹ n(cn − 105) ≤ 0
  Since n is positive ⟹ cn − 105 ≤ 0 ⟹ n ≤ 105/c
  ⟹ contradiction: n cannot be smaller than a constant

• n = Ω(2n), n³ = Ω(n²), n = Ω(log n)


22
Asymptotic notations
(cont.)
• Θ-notation

• Intuitively: Θ(g(n)) = the set of functions with the same order of growth as g(n)

23
Examples
• n²/2 − n/2 = Θ(n²)

  • ½n² − ½n ≤ ½n² for all n ≥ 0 ⟹ c₂ = ½

  • ½n² − ½n ≥ ½n² − ½n · ½n (for n ≥ 2) = ¼n² ⟹ c₁ = ¼

• n ≠ Θ(n²): c₁n² ≤ n ≤ c₂n² only holds for n ≤ 1/c₁

• 6n³ ≠ Θ(n²): c₁n² ≤ 6n³ ≤ c₂n² only holds for n ≤ c₂/6

• n ≠ Θ(log n): c₁ log n ≤ n ≤ c₂ log n cannot hold for arbitrarily large n
24
More on Asymptotic
Notations
• There is no unique set of values for n₀ and c in proving the asymptotic bounds

• Prove that 100n + 5 = O(n²)

  • 100n + 5 ≤ 100n + n = 101n ≤ 101n² for all n ≥ 5
    ⟹ n₀ = 5 and c = 101 is a solution

  • 100n + 5 ≤ 100n + 5n = 105n ≤ 105n² for all n ≥ 1
    ⟹ n₀ = 1 and c = 105 is also a solution

Must find SOME constants c and n₀ that satisfy the asymptotic notation relation
25
Asymptotic Notations -
Examples
• Θ notation
  • n²/2 − n/2 = Θ(n²)
  • (6n³ + 1) lg n / (n + 1) = Θ(n² lg n)
  • n vs. n²: n ≠ Θ(n²)

• Ω notation
  • n vs. 2n: n = Ω(2n)
  • n³ vs. n²: n³ = Ω(n²)
  • n vs. log n: n = Ω(log n)
  • n vs. n²: n ≠ Ω(n²)

• O notation
  • 2n² vs. n³: 2n² = O(n³)
  • n² vs. n²: n² = O(n²)
  • n³ vs. n log n: n³ ≠ O(n lg n)
26
Comparisons of Functions
• Theorem:
  f(n) = Θ(g(n)) ⟺ f(n) = O(g(n)) and f(n) = Ω(g(n))
• Transitivity:
  • f(n) = Θ(g(n)) and g(n) = Θ(h(n)) ⟹ f(n) = Θ(h(n))
  • Same for O and Ω
• Reflexivity:
  • f(n) = Θ(f(n))
  • Same for O and Ω
• Symmetry:
  • f(n) = Θ(g(n)) if and only if g(n) = Θ(f(n))
• Transpose symmetry:
  • f(n) = O(g(n)) if and only if g(n) = Ω(f(n))
27
Asymptotic Notations -
Examples
• For each of the following pairs of functions, either f(n) is
O(g(n)), f(n) is Ω(g(n)), or f(n) = Θ(g(n)). Determine which
relationship is correct.
• f(n) = log n²; g(n) = log n + 5          f(n) = Θ(g(n))

• f(n) = n; g(n) = log n²                  f(n) = Ω(g(n))

• f(n) = log log n; g(n) = log n           f(n) = O(g(n))

• f(n) = n; g(n) = log² n                  f(n) = Ω(g(n))

• f(n) = n log n + n; g(n) = log n         f(n) = Ω(g(n))

• f(n) = 10; g(n) = log 10                 f(n) = Θ(g(n))

• f(n) = 2n; g(n) = 10n²                   f(n) = O(g(n))

• f(n) = 2^n; g(n) = 3^n                   f(n) = O(g(n))
28
Asymptotic Notations in
Equations
• On the right-hand side
• Θ(n²) stands for some anonymous function in the set Θ(n²)
  2n² + 3n + 1 = 2n² + Θ(n) means:
  There exists a function f(n) ∈ Θ(n) with the desired properties such that
  2n² + 3n + 1 = 2n² + f(n)
• On the left-hand side
  2n² + Θ(n) = Θ(n²)
No matter how the anonymous function is chosen on the
left-hand side, there is a way to choose the anonymous
function on the right-hand side to make the equation valid.

29
Limits and Comparisons of
Functions
Using limits for comparing orders of growth:

If lim_{n→∞} t(n)/g(n) =
  0:      t(n) has a smaller order of growth than g(n): t(n) ∈ O(g(n))
  c > 0:  t(n) has the same order of growth as g(n): t(n) ∈ Θ(g(n))
  ∞:      t(n) has a larger order of growth than g(n): t(n) ∈ Ω(g(n))

• Compare ½ n(n − 1) and n²:

  lim_{n→∞} (½ n(n − 1)) / n² = ½ lim_{n→∞} (n² − n)/n² = ½ lim_{n→∞} (1 − 1/n) = ½

  ⟹ same order of growth: ½ n(n − 1) ∈ Θ(n²)

30
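If SymPy is available, the limit test can be run mechanically; an illustrative sketch (not from the slides):

from sympy import symbols, limit, log, oo

n = symbols('n', positive=True)
print(limit((n * (n - 1) / 2) / n**2, n, oo))   # 1/2 -> same order: Theta(n^2)
print(limit(log(n) / n, n, oo))                 # 0   -> log n is O(n)
print(limit(n**3 / n**2, n, oo))                # oo  -> n^3 is Omega(n^2)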
Limits and Comparisons of
Functions
L'Hôpital's rule:

  lim_{n→∞} t(n)/g(n) = lim_{n→∞} t′(n)/g′(n)

• Compare lg n and √n:

  lim_{n→∞} lg n / √n = lim_{n→∞} (lg n)′ / (√n)′ = lim_{n→∞} ((lg e)/n) / (1/(2√n)) = lim_{n→∞} (2 lg e)/√n = 0

  ⟹ lg n has a smaller order of growth than √n

31
Some Common Functions
• Factorials

  n! = ∏_{k=1}^{n} k

  n! = √(2πn) · (n/e)^n · (1 + Θ(1/n))   (Stirling's approximation)

  lg(n!) = Θ(n lg n)

32
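A quick numeric look at how tight Stirling's approximation is (an illustrative Python sketch, not from the slides):

import math

n = 20
stirling = math.sqrt(2 * math.pi * n) * (n / math.e) ** n
print(math.factorial(n), stirling, stirling / math.factorial(n))  # ratio ~ 0.996
print(math.lgamma(n + 1) / math.log(2), n * math.log2(n))         # lg(n!) vs n lg n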
Exercises
• Is this true: f(n) = O(g(n)) ⟹ f(n) + g(n) = Ω(f(n))?
  If so, give a proof that uses the formal definitions of O, Ω, and Θ

• Order these functions asymptotically from slowest to fastest:

  n lg² n,   20 lg(5n⁴),   n^2.999,   5^(1/lg n),   lg(n!)/lg n,
  n² + n^0.6,   3 lg n + n,   n³/lg n,   (n + 100)!,   n^(lg lg n)

33
The Sorting Problem

• Input:

• A sequence of n numbers a1, a2, . . . , an

• Output:

• A permutation (reordering) a1′, a2′, . . . , an′ of the input sequence such that a1′ ≤ a2′ ≤ · · · ≤ an′ (if, for example, increasing order is sought)

• The order relation can be defined in various ways (not only the usual magnitude order on integers)
34
Why Study Sorting
Algorithms?
• There are a variety of situations that we can
encounter
• Do we have randomly ordered keys?
• Are all keys distinct?
• How large is the set of keys to be ordered?
• Need guaranteed performance?

• Various algorithms are better suited to some of


these situations

35
Insertion Sort
• Idea: like sorting a hand of playing cards
• Start with an empty left hand and the cards facing down
on the table.
• Remove one card at a time from the table, and insert it
into the correct position in the left hand
• compare it with each of the cards already in the hand, from
right to left
• The cards held in the left hand are sorted
• these cards were originally the top cards of the pile on the
table

36
INSERTION-SORT
Alg.: INSERTION-SORT(A)
  for j ← 2 to n
    do key ← A[ j ]
       ▷ Insert A[ j ] into the sorted sequence A[1 . . j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key

• Insertion sort sorts the elements in place
37
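The same algorithm in Python (an illustrative sketch, not part of the original slides); lists are 0-based, so the pseudocode's j = 2 . . n becomes range(1, len(A)):

def insertion_sort(A):
    for j in range(1, len(A)):
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:   # shift larger elements one slot right
            A[i + 1] = A[i]
            i -= 1
        A[i + 1] = key                 # drop key into its position
    return A                           # sorted in place

print(insertion_sort([5, 2, 4, 6, 1, 3]))   # [1, 2, 3, 4, 5, 6]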
Example

38
Loop Invariant for Insertion
Sort

Alg.: INSERTION-SORT(A)
  for j ← 2 to n
    do key ← A[ j ]
       ▷ Insert A[ j ] into the sorted sequence A[1 . . j − 1]
       i ← j − 1
       while i > 0 and A[i] > key
         do A[i + 1] ← A[i]
            i ← i − 1
       A[i + 1] ← key

Loop invariant (a property that holds at the start of every iteration): at the start of each for-loop iteration, the elements in A[1 . . j − 1] are in sorted order

39
Proving Loop Invariants
• Initialization (base case):
• It is true prior to the first iteration of the loop

• Maintenance (inductive step):


• If it is true before an iteration of the loop, it remains true before the
next iteration

• Termination:
• When the loop terminates, the invariant - usually along with the
reason that the loop terminated - gives us a useful property that helps
show that the algorithm is correct
• Stop the induction when the loop terminates

• Proving loop invariants works like induction


40
Loop Invariant for Insertion
Sort
• Initialization:
• Just before the first iteration, j = 2: the subarray A[1 . . j − 1] = A[1] (the element originally in A[1]) is trivially sorted

41
Loop Invariant for Insertion
Sort
• Maintenance:
• Assume the list A[1,…,j -1] is sorted. Show that at
the end of one loop iteration, A[1,…,j] is also
sorted.
• the while inner loop moves A[j -1], A[j -2], A[j -3], and
so on, by one position to the right until the proper position
for key (which has the value that started out in A[j]) is
found.

42
Loop Invariant for Insertion
Sort
• Maintenance:
• At that point, the value of key is placed into this position.

• Since the elements moved one position to the right are sorted among themselves, and key is placed so that every element to its left is no greater than it, the entire list A[1,…,j] is sorted as well.

43
Loop Invariant for Insertion
Sort
• Termination:
• The outer for loop ends when j > n (i.e., j = n + 1) ⟹ j − 1 = n
• Substituting j − 1 = n into the loop invariant:
  • the subarray A[1 . . n] consists of the elements originally in A[1 . . n], but in sorted order

• The entire array is sorted!

44
Worst case run-time of Insertion
Sort
INSERTION-SORT(A)
1  for j ← 2 to length[A]
2    do key ← A[j]
3       ▷ Insert A[j] into sorted sequence A[1 . . j − 1]
4       i ← j − 1
5       while i > 0 and A[i] > key
6         do A[i + 1] ← A[i]
7            i ← i − 1
8       A[i + 1] ← key

Costs: lines 1–4 contribute c1 per iteration of the for loop, lines 5–7 contribute Σ_{i=1}^{j−1} c2 (the while loop), and line 8 contributes c3.

(Section 2.2 in textbook)

45
Worst Case Analysis
• The worst case occurs when the array is reverse-sorted
  • A[i] > key always holds in the while loop test
  • key must be compared with all elements to the left of the j-th position ⟹ at worst, compare with j − 1 elements

T(n) = Σ_{j=2}^{n} ( c1 + Σ_{i=1}^{j−1} c2 + c3 ) = Σ_{j=2}^{n} c1 + Σ_{j=2}^{n} c2 (j − 1) + Σ_{j=2}^{n} c3

T(n) = (c1 + c3)(n − 1) + c2 ( n(n + 1)/2 − n )

T(n) = Θ(n²)
46
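The worst and best cases are easy to see experimentally by counting inner-loop shifts (an illustrative Python sketch, not from the slides):

def insertion_sort_shifts(A):
    shifts = 0
    for j in range(1, len(A)):
        key, i = A[j], j - 1
        while i >= 0 and A[i] > key:
            A[i + 1] = A[i]
            i -= 1
            shifts += 1
        A[i + 1] = key
    return shifts

n = 1000
print(insertion_sort_shifts(list(range(n, 0, -1))))   # reversed: n(n-1)/2 = 499500
print(insertion_sort_shifts(list(range(n))))          # already sorted: 0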
Best Case Analysis
• The best case occurs when the array is already sorted
  • A[i] ≤ key the first time the while loop test is run (when i = j − 1)

T(n) = Σ_{j=2}^{n} ( c1 + c2 + c3 ) = (c1 + c2 + c3)(n − 1)

T(n) = Θ(n)

47
Average Case Analysis
• Average case: all input permutations are equally likely, so on average key is compared with half of the elements to its left

T(n) = Σ_{j=2}^{n} ( c1 + ½ Σ_{i=1}^{j−1} c2 + c3 )

T(n) = (c1 + c3)(n − 1) + (c2/2) ( n(n + 1)/2 − n )

T(n) = Θ(n²)

48
Insertion Sort
• Advantages
  • Good running time for "almost sorted" arrays: Θ(n)
• Disadvantages
  • Θ(n²) running time in the worst and average cases

49
Bubble Sort
Alg.: BUBBLESORT(A)
  for i ← length[A] downto 1        ▷ counting down
    for j ← 1 to i − 1              ▷ bubbling up
      if A[j] > A[j + 1]            ▷ if out of order...
        exchange A[j] ↔ A[j + 1]

52
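In Python (an illustrative sketch, not part of the original slides), with the same index directions as the pseudocode:

def bubble_sort(A):
    for i in range(len(A) - 1, 0, -1):    # counting down
        for j in range(i):                # bubbling up
            if A[j] > A[j + 1]:           # if out of order...
                A[j], A[j + 1] = A[j + 1], A[j]
    return A

print(bubble_sort([7, 2, 8, 5, 4]))       # [2, 4, 5, 7, 8]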
Example of Bubble Sort
Pass 1 (i = 5): 7 2 8 5 4 → 2 7 8 5 4 → 2 7 8 5 4 → 2 7 5 8 4 → 2 7 5 4 8
Pass 2 (i = 4): 2 7 5 4 8 → 2 7 5 4 8 → 2 5 7 4 8 → 2 5 4 7 8
Pass 3 (i = 3): 2 5 4 7 8 → 2 5 4 7 8 → 2 4 5 7 8
Pass 4 (i = 2): 2 4 5 7 8 → 2 4 5 7 8 (done)

53
Proof of Correctness
• The largest elements are placed at the end, and
once there, they are never moved.
• The variable i starts at the last index in the array and
decreases to 1
• Loop invariant: Every element to the right of i is in the
right place
for all k,j > i, if k < j, then a[k] <= a[j]
• Upon exit, i = 0, so that all elements of the array are in the correct place.
• Homework: Prove correctness of the above invariant.

55

Selection Sort
• Idea:
• Find the smallest element in the array
• Exchange it with the element in the first position
• Find the second smallest element and exchange it with the
element in the second position
• Continue until the array is sorted
• Invariant (prove it):
• All elements to the left of the current index are in sorted
order and never changed again
• Disadvantage:
• Running time depends only slightly on the amount of order
in the file

56
Selection Sort
Alg.: SELECTION-SORT(A)            Example array: 8 4 6 9 2 3 1
  n ← length[A]
  for j ← 1 to n − 1
    do smallest ← j
       for i ← j + 1 to n
         do if A[i] < A[smallest]
              then smallest ← i
       exchange A[j] ↔ A[smallest]

57
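In Python (an illustrative sketch, not part of the original slides):

def selection_sort(A):
    n = len(A)
    for j in range(n - 1):
        smallest = j
        for i in range(j + 1, n):               # scan for the minimum of A[j..n-1]
            if A[i] < A[smallest]:
                smallest = i
        A[j], A[smallest] = A[smallest], A[j]   # one exchange per pass
    return A

print(selection_sort([8, 4, 6, 9, 2, 3, 1]))    # [1, 2, 3, 4, 6, 8, 9]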
Example
8 4 6 9 2 3 1
1 4 6 9 2 3 8
1 2 6 9 4 3 8
1 2 3 9 4 6 8
1 2 3 4 9 6 8
1 2 3 4 6 9 8
1 2 3 4 6 8 9
1 2 3 4 6 8 9   (sorted)

58
Analysis of Selection Sort
Alg.: SELECTION-SORT(A)                    cost   times
  n ← length[A]                            c1     1
  for j ← 1 to n − 1                       c2     n − 1
    do smallest ← j                        c3     n − 1
       for i ← j + 1 to n                  c4     Σ_{j=1}^{n−1} (n − j)
         do if A[i] < A[smallest]          c5     Σ_{j=1}^{n−1} (n − j)
              then smallest ← i            c6     Σ_{j=1}^{n−1} (n − j)
       exchange A[j] ↔ A[smallest]         c7     n − 1

≈ n²/2 comparisons and ≈ n exchanges
59
Divide-and-Conquer

• Divide the problem into a number of subproblems


• Similar sub-problems of smaller size

• Conquer the sub-problems


• Solve the sub-problems recursively

• Sub-problem size small enough ⟹ solve the problems in a straightforward manner

• Combine the solutions to the sub-problems


• Obtain the solution for the original problem
60
Analyzing Divide-and Conquer
Algorithms
• The recurrence is based on the three steps of the
paradigm:
• T(n) – running time on a problem of size n
• Divide the problem originally of size n into a
subproblems: takes D(n)
• Conquer (solve) the a subproblems each of which has size
n/b: takes aT(n/b)
• Combine the solutions back C(n)
T(n) = Θ(1)                        if n ≤ c
T(n) = aT(n/b) + D(n) + C(n)       otherwise

61
Merge Sort Approach
• To sort an array A[p . . r]:
• Divide
• Divide the n-element sequence to be sorted into two
subsequences of n/2 elements each
• Conquer
• Sort the subsequences recursively using merge sort
• When the size of the sequences is 1 there is nothing more
to do
• Combine
• Merge the two sorted subsequences
62
Merge Sort

Example array (positions 1–8): 5 2 4 7 1 3 2 6

Alg.: MERGE-SORT(A, p, r)
  if p < r                          ▷ check for base case
    then q ← ⌊(p + r)/2⌋            ▷ divide
         MERGE-SORT(A, p, q)        ▷ conquer
         MERGE-SORT(A, q + 1, r)    ▷ conquer
         MERGE(A, p, q, r)          ▷ combine

• Initial call: MERGE-SORT(A, 1, n)


63
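A compact Python version of the whole scheme (an illustrative sketch, not part of the original slides); it merges with an explicit index scan rather than the ∞ sentinels used in the MERGE pseudocode later:

def merge_sort(A):
    if len(A) <= 1:                 # base case: nothing to sort
        return A
    q = len(A) // 2                 # divide
    left = merge_sort(A[:q])        # conquer
    right = merge_sort(A[q:])       # conquer
    out, i, j = [], 0, 0            # combine: merge two sorted halves
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    return out + left[i:] + right[j:]

print(merge_sort([5, 2, 4, 7, 1, 3, 2, 6]))   # [1, 2, 2, 3, 4, 5, 6, 7]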
Example – n Power of 2
Example: divide (q = 4)
5 2 4 7 1 3 2 6
5 2 4 7 | 1 3 2 6
5 2 | 4 7 | 1 3 | 2 6
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6

64
Example – n Power of 2
1 2 3 4 5 6 7 8

Example: conquer and combine (bottom-up)
5 | 2 | 4 | 7 | 1 | 3 | 2 | 6
2 5 | 4 7 | 1 3 | 2 6
2 4 5 7 | 1 2 3 6
1 2 2 3 4 5 6 7

65
Example – n Not a Power of
2
Divide (n = 11):
4 7 2 6 1 4 7 3 5 2 6                  (q = 6)
4 7 2 6 1 4 | 7 3 5 2 6                (q = 3, q = 9)
4 7 2 | 6 1 4 | 7 3 5 | 2 6
4 7 | 2 | 6 1 | 4 | 7 3 | 5 | 2 | 6
4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6

66
Example – n Not a Power of
2
Combine (bottom-up):
4 | 7 | 2 | 6 | 1 | 4 | 7 | 3 | 5 | 2 | 6
4 7 | 2 | 1 6 | 4 | 3 7 | 5 | 2 6
2 4 7 | 1 4 6 | 3 5 7 | 2 6
1 2 4 4 6 7 | 2 3 5 6 7
1 2 2 3 4 4 5 6 6 7 7

67
Merging
Example (positions p . . r = 1 . . 8): A[p . . q] = 2 4 5 7,  A[q + 1 . . r] = 1 2 3 6

• Input: Array A and indices p, q, r such that p ≤ q < r

  • Subarrays A[p . . q] and A[q + 1 . . r] are sorted

• Output: One single sorted subarray A[p . . r]

68
Merging
• Idea for merging:
• Two piles of sorted cards
• Choose the smaller of the two top cards
• Remove it and place it in the output pile

• Repeat the process until one pile is empty


• Take the remaining input pile and place it face-down onto
the output pile

69
Merge – Pseudocode

Alg.: MERGE(A, p, q, r)
Example: A[p . . q] = 2 4 5 7,  A[q + 1 . . r] = 1 2 3 6
1. Compute n1 = q − p + 1 and n2 = r − q
2. Copy the first n1 elements into L[1 . . n1 + 1] and the next n2 elements into R[1 . . n2 + 1]
3. L[n1 + 1] ← ∞; R[n2 + 1] ← ∞
4. i ← 1; j ← 1
5. for k ← p to r
6.   do if L[ i ] ≤ R[ j ]
7.        then A[k] ← L[ i ]
8.             i ← i + 1
9.        else A[k] ← R[ j ]
10.            j ← j + 1

L = 2 4 5 7 ∞,  R = 1 2 3 6 ∞
70
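The sentinel version above, transcribed to Python (an illustrative sketch, not part of the original slides), merging A[p . . r] in place with 0-based inclusive indices:

import math

def merge(A, p, q, r):
    L = A[p:q + 1] + [math.inf]      # sentinels remove the emptiness tests
    R = A[q + 1:r + 1] + [math.inf]
    i = j = 0
    for k in range(p, r + 1):
        if L[i] <= R[j]:
            A[k] = L[i]; i += 1
        else:
            A[k] = R[j]; j += 1

A = [2, 4, 5, 7, 1, 2, 3, 6]
merge(A, 0, 3, 7)
print(A)                             # [1, 2, 2, 3, 4, 5, 6, 7]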
Example: MERGE(A, 9, 12, 16)

71
Example: MERGE(A, 9, 12, 16)

72
Example (cont.)

73
Example (cont.)

74
Example (cont.)

Done!

75
Running Time of Merge
• Initialization (copying into temporary arrays):
  • Θ(n1 + n2) = Θ(n)

• Adding the elements to the final array (the last for loop):
  • n iterations, each taking constant time ⟹ Θ(n)

• Total time for Merge:
  • Θ(n)

76
Recall: Analyzing Divide-and Conquer
Algorithms
• The recurrence is based on the three steps of the
paradigm:
• T(n) – running time on a problem of size n
• Divide the problem originally of size n into a
subproblems: takes D(n)
• Conquer (solve) the a subproblems each of which has size
n/b: takes aT(n/b)
• Combine the solutions back C(n)
T(n) = Θ(1)                        if n ≤ c
T(n) = aT(n/b) + D(n) + C(n)       otherwise

77
MERGE-SORT Running Time
• Divide:
  • compute q as the average of p and r: D(n) = Θ(1)
• Conquer:
  • recursively solve 2 subproblems, each of size n/2 ⟹ 2T(n/2)
• Combine:
  • MERGE on an n-element subarray takes Θ(n) time ⟹ C(n) = Θ(n)

T(n) = Θ(1)              if n = 1
T(n) = 2T(n/2) + Θ(n)    if n > 1

78
Solve the Recurrence
T(n) = c                 if n = 1
T(n) = 2T(n/2) + cn      if n > 1

Using the master theorem:

T(n) = Θ(n lg n)

79
Master Method
• A “cook-book” type method
• Applicable only to recurrences of the form:
T(n) = a T(n/b) + f(n)
where f(n) is any asymptotically positive function and a ≥ 1 and b > 1 are constants

80
Simplified Master’s Theorem
Method
• When f(n) is a polynomial, f(n) = c n^k, three versions of the solution exist:

1) If a > b^k, then T(n) = Θ( n^(log_b a) )

2) If a = b^k, then T(n) = Θ( n^k lg n )
3) If a < b^k, then T(n) = Θ( n^k )

81
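The three cases can be packaged as a small decision helper (an illustrative Python sketch, not from the slides; simplified_master is a hypothetical helper for recurrences of the form T(n) = a·T(n/b) + c·n^k):

import math

def simplified_master(a, b, k):
    if a > b ** k:
        return f"Theta(n^{math.log(a, b):.4g})"   # n^(log_b a)
    if a == b ** k:
        return f"Theta(n^{k} lg n)"
    return f"Theta(n^{k})"

print(simplified_master(2, 2, 1))   # merge sort: Theta(n^1 lg n)
print(simplified_master(8, 2, 2))   # Theta(n^3)
print(simplified_master(1, 2, 1))   # Theta(n^1)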
Run-time revisited: solve the
Recurrence
T(n) = c                 if n = 1
T(n) = 2T(n/2) + cn      if n > 1

Using the master theorem:

a = 2 ≥ 1, b = 2 > 1
f(n) = cn = cn^k for k = 1
a = 2 = b^k ⟹ Case 2: T(n) = Θ(n lg n)

82
Run-time revisited: solve the
Recurrence
T(n) = c                 if n = 1
T(n) = 2T(n/2) + cn      if n > 1

Using Recursion tree method:

83
Recursion tree

Level 0:  n
Level 1:  n/2   n/2
Level 2:  n/4   n/4   n/4   n/4
…
Level t:  2^t subproblems, each of size n/2^t
…
Bottom:   subproblems of size 1

(A node of size k does O(k) operations at that node, then passes two subproblems of size k/2 to the next level.)

Total runtime…

• O(n) steps per level, at every level

• log(n) + 1 levels

• O( n log(n) ) total!

That was the claim!
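The claim can also be checked by evaluating the recurrence directly (an illustrative Python sketch, not from the slides), for powers of two with T(1) = 1 and c = 1; it matches n·lg(n) + n exactly:

import math

def T(n):
    return 1 if n == 1 else 2 * T(n // 2) + n

for n in [2, 8, 64, 1024]:
    print(n, T(n), int(n * math.log2(n)) + n)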


Algorithm Correctness
• Why do we need it?
• Algorithm correctness versus program correctness
• A program is correct if it compiles, runs, terminates, and produces the desired output for all possible inputs.

86
It works
• Inductive hypothesis:
  "Every recursive call on an array of length at most i returns a sorted array."
• Base case (i = 1): a 1-element array is always sorted.
• Inductive step: need to show that if the inductive hypothesis holds for k < i, then it holds for k = i.
  • That is, we need to show that if L and R are sorted, then MERGE(L, R) is sorted.
• Conclusion: in the top recursive call, MERGESORT returns a sorted array.
Correctness of Merge
• Loop invariant (at the start of the for loop):

  • A[p . . k − 1] contains the k − p smallest elements of L[1 . . n1 + 1] and R[1 . . n2 + 1], in sorted order

  • L[i] and R[j] are the smallest elements of their arrays not yet copied back to A

88
Proof of the Loop Invariant for
MERGE
• Initialization
  • Prior to the first iteration: k = p ⟹ the subarray A[p . . k − 1] is empty
  • A[p . . k − 1] contains the k − p = 0 smallest elements of L and R
  • L and R are sorted arrays (i = j = 1) ⟹ L[1] and R[1] are the smallest elements in L and R

89
Proof of the Loop Invariant for
MERGE
• Maintenance
• Assume L[i] ≤ R[j] ⟹ L[i] is the smallest element not yet copied back to A
• After copying L[i] into A[k], A[p . . k] contains the k − p + 1 smallest elements of L and R
• Incrementing k (for loop) and i re-establishes the loop invariant

90
Proof of the Loop Invariant for
MERGE
• Termination
  • At termination, k = r + 1
  • By the loop invariant: A[p . . k − 1] = A[p . . r] contains the k − p = r − p + 1 smallest elements of L and R, in sorted order
  • This is exactly the number of elements to be sorted
    ⟹ MERGE(A, p, q, r) is correct

91
Merge Sort - Discussion
• Running time is insensitive to the input
• Advantages:
  • Guaranteed to run in Θ(n lg n)

• Disadvantage
  • Requires Θ(n) extra space

92
Next time
• A more systematic approach to analyzing the
runtime of recursive algorithms.

Before next time


• Pre-Lecture Exercise:
• A few recurrence relations (see website)
