04 Sorting
& Algorithms
Week 4
Sorting
The sorting problem (w/ numbers)
Input: a sequence of n numbers ‹a_1, a_2, …, a_n›
Output: a permutation ‹a_{i_1}, a_{i_2}, …, a_{i_n}› of the input such that a_{i_1} ≤ a_{i_2} ≤ … ≤ a_{i_n}
8 1 6 4 0 3 9 5 ➨ 0 1 3 4 5 6 8 9
Numbers ≈ Keys
Example (each step swaps the smallest remaining element into place):
12 9 3 7 14 11
3 9 12 7 14 11
3 7 12 9 14 11
3 7 9 12 14 11
3 7 9 11 14 12
3 7 9 11 12 14
Selection Sort
Selection-Sort(A, n)
Input: an array A and the number n of elements in A to sort
Output: the elements of A sorted into non-decreasing order
1. For i = 0 to n-2:
A. Set smallest to i
B. For j = i + 1 to n-1
i. If A[j] < A[smallest], then set smallest to j
C. Swap A[i] with A[smallest]
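The Selection-Sort pseudocode above can be sketched in Python as follows. This is a minimal illustration, assuming a 0-indexed list; the function name `selection_sort` is chosen here and does not come from the slides.

```python
def selection_sort(a):
    """Sort the list a in place into non-decreasing order and return it."""
    n = len(a)
    for i in range(n - 1):              # i = 0 .. n-2
        smallest = i
        for j in range(i + 1, n):       # j = i+1 .. n-1
            if a[j] < a[smallest]:
                smallest = j            # index of the smallest element seen so far
        a[i], a[smallest] = a[smallest], a[i]
    return a
```

For example, `selection_sort([12, 9, 3, 7, 14, 11])` goes through the same intermediate arrays as the example trace on the slide.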
Correctness? Loop invariant! (Prove one for the inner loop first, then use it for the outer loop.)
Running time?
Quiz
What is the running time of Selection Sort?
A: O(log n)
B: O(n)
C: O(n log n)
D: O(n²)
Insertion Sort
Like sorting a hand of playing cards:
◼ start with empty left hand, cards on table
◼ pick cards one by one, insert into
correct position
◼ to find position, compare to cards in hand
from right to left (i.e., largest, 2nd largest, etc.)
◼ cards in hand are always sorted
Insertion Sort is
◼ a good algorithm to sort a small number of elements
◼ an incremental algorithm
Incremental algorithms process the input elements one-by-one and maintain the solution for
the elements processed so far.
Incremental algorithms
In pseudocode:
IncrementalAlgo(A)
// incremental algorithm which computes the solution of a
problem with input A = {x0,…,xn-1}
[Figure: array 1 3 14 17 28 6 … with index markers j and n]
Correctness proof
Loop invariant
At the start of each iteration of the “outer” for loop
(indexed by j) the subarray A[0:j] consists of the
elements originally in A[0:j] but in sorted order.
InsertionSort(A)
1. for j = 1 to A.length-1
2.    do key = A[j]
3.       i = j - 1
4.       while i > -1 and A[i] > key
5.          do A[i+1] = A[i]   # we shift to the right
6.             i = i - 1
7.       A[i+1] = key
Initialization
Just before the first iteration, j = 1 ➨ A[0:j] = A[0], which is the element originally in A[0],
and it is trivially sorted.
Maintenance
Strictly speaking, we need to prove a loop invariant for the “inner” while loop. Instead, let's simply
assume that the body of the while loop moves A[j-1], A[j-2], A[j-3], and so on, one position to the
right until the position for key (which holds the value of A[j]) is found. Line 7 then places key in
that position, so A[0:j+1] consists of the elements originally in A[0:j+1] in sorted order.
➨ invariant maintained.
Termination
The outer for loop ends when j > n-1; this is when j = n.
Plug n for j in the loop invariant
➨ the subarray A[0:n] (=A) consists of the elements originally in
A[0:n] (=A) in sorted order.
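The InsertionSort pseudocode and its loop invariant can be sketched in Python as follows. This is an illustration only: the function name `insertion_sort` and the `assert` that checks the invariant are additions made here, and the check uses Python's built-in `sorted`, which makes the sketch slower than the algorithm itself.

```python
def insertion_sort(a):
    """Sort the list a in place and return it, checking the loop invariant."""
    original = list(a)                  # copy, used only to check the invariant
    for j in range(1, len(a)):
        # Loop invariant: a[0:j] holds the elements originally in a[0:j], sorted.
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i > -1 and a[i] > key:
            a[i + 1] = a[i]             # shift one position to the right
            i -= 1
        a[i + 1] = key
    return a
```

Note that the while condition must allow i = 0; otherwise A[0] is never compared with key and an input such as [2, 1] stays unsorted.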
Quiz
Insertion Sort is in-place. Is Selection Sort also in-place?
A: yes
B: no
Analysis T(n) of Insertion Sort
Insertion-Sort(A)
1. initialize: sort A[0]
2. for j = 1 to A.length-1
3. do key = A[j]
4. i = j -1
5. while i > -1 and A[i] > key
6. do A[i+1] = A[i]
7. i = i -1
8. A[i +1] = key
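Upper bound: Let T(n) be the worst case running time of InsertionSort
on an array of length n. Iteration j of the for loop executes the body of the
while loop at most j times, so we have
T(n) ≤ Σ_{j=1}^{n−1} (O(1) + j ∙ O(1)) = Σ_{j=1}^{n−1} O(j) = O(n²)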
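To make the quadratic worst case concrete, the following sketch counts how often the body of the while loop runs. It assumes the same 0-indexed Insertion Sort as in the pseudocode above; the helper name `count_shifts` is hypothetical. On an input in decreasing order, iteration j shifts exactly j elements, giving 1 + 2 + … + (n−1) = n(n−1)/2 shifts in total.

```python
def count_shifts(values):
    """Run Insertion Sort on a copy of values; return the number of shifts."""
    a = list(values)
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i > -1 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1                 # one execution of the while-loop body
        a[i + 1] = key
    return shifts


n = 10
worst = count_shifts(range(n, 0, -1))   # decreasing input: worst case
best = count_shifts(range(n))           # already sorted input: no shifts
```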
Divide the problem into a number of subproblems that are smaller instances of the same problem.
Conquer the subproblems by solving them recursively. If they are small enough, solve the
subproblems as base cases.
Combine / Merge the solutions to the subproblems into the solution for the original problem.
Divide-and-conquer
D&CAlg(A)
# divide-and-conquer algorithm that computes the solution of a
problem with input A = {x0,…,xn-1}
1. if # elements of A is small enough (for example 1)
2. then compute Sol (the solution for A) brute-force
3. else
4. split A in, for example, 2 non-empty subsets A1 and A2
5. Sol1 = D&CAlg(A1)
6. Sol2 = D&CAlg(A2)
7. compute Sol (the solution for A) from Sol1 and Sol2
8. return Sol
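As a small instance of the D&CAlg template, the sketch below finds the maximum of a non-empty list. The problem and the name `dc_max` are chosen here purely for illustration and are not from the slides.

```python
def dc_max(xs):
    """Return the maximum of the non-empty list xs by divide-and-conquer."""
    if len(xs) == 1:                    # small enough: solve brute-force
        return xs[0]
    mid = len(xs) // 2                  # split into two non-empty parts
    sol1 = dc_max(xs[:mid])
    sol2 = dc_max(xs[mid:])
    return max(sol1, sol2)              # combine the two partial solutions
```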
Merge Sort
Merge-Sort(A)
// divide-and-conquer algorithm that sorts array A[0:n]
1. if A.length = 1
2. then compute Sol (the solution for A) brute-force
3. else
4. split A in 2 non-empty subsets A1 and A2
5. Sol1 = Merge-Sort(A1)
6. Sol2 = Merge-Sort(A2)
7. compute Sol (the solution for A) from Sol1 and Sol2
Merge Sort
Merge-Sort(A)
// divide-and-conquer algorithm that sorts array A[0:n]
1. if A.length = 1
2. then skip
3. else
4. n = A.length ; n1 = ⌊n/2⌋ ; n2 = ⌈n/2⌉ ;
copy A[0:n1] to auxiliary array A1[0:n1]
copy A[n1:n] to auxiliary array A2[0:n2]
5. Merge-Sort(A1)
6. Merge-Sort(A2)
7. Merge(A, A1, A2)
Reminder: ⌊x⌋ is called the floor of x and ⌈x⌉ is called the ceiling of x. E.g., ⌊3.1⌋ = 3 and ⌈3.1⌉ = 4
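The Merge-Sort pseudocode above can be sketched in Python as follows. The slides do not spell out Merge, so the `merge` function below is one straightforward way to do it (an assumption made here), and both function names are chosen for this sketch.

```python
def merge(a, a1, a2):
    """Merge the sorted lists a1 and a2 into a, where len(a) == len(a1) + len(a2)."""
    i = j = 0
    for k in range(len(a)):
        # Take from a1 if a2 is exhausted or a1's next element is not larger.
        if j >= len(a2) or (i < len(a1) and a1[i] <= a2[j]):
            a[k] = a1[i]
            i += 1
        else:
            a[k] = a2[j]
            j += 1


def merge_sort(a):
    """Sort the list a in place, using auxiliary lists for the two halves."""
    if len(a) <= 1:
        return
    n1 = len(a) // 2                    # floor(n/2); the second half gets the rest
    a1, a2 = a[:n1], a[n1:]             # copies, so this is not in-place
    merge_sort(a1)
    merge_sort(a2)
    merge(a, a1, a2)
```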
Merge Sort
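Example:
input:         3 14 1 28 17 8 21 7 4 35
split:         3 14 1 28 17 | 8 21 7 4 35
sorted halves: 1 3 14 17 28 | 4 7 8 21 35
merged:        1 3 4 7 8 14 17 21 28 35
Recursion on the left half 3 14 1 28 17:
split:         3 14 | 1 28 17
sorted halves: 3 14 | 1 17 28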
Merge Sort
Merging
A1 = 1 3 14 17 28    A2 = 4 7 8 21 35
A  = 1 3 4 7 8 14 17 21 28 35
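Proof by induction
1. The Base Case. Prove that the proposition holds when n=1
2. The Inductive Step. Prove that if the proposition holds for n (the induction hypothesis, IH), then it also holds for n+1
3. Conclusion. Conclude that the proposition holds for every natural number n
Because the proposition holds when n=1, the inductive step implies that it must hold when n=2, and
therefore it must hold when n=3, and when n=4, and so forth, for every natural number n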
Proof by Strong induction
Related to proof by (simple) induction:
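1. The Base Case. Prove that the proposition holds when n=1
2. The Inductive Step. Prove that if the proposition holds for all k with 1 ≤ k ≤ n (the induction hypothesis, IH), then it also holds for n+1
3. Conclusion. Conclude that the proposition holds for every natural number n
Because the proposition holds when n=1, the inductive step implies that it must hold when n=2, and
therefore it must hold when n=3, and when n=4, and so forth, for every natural number n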
Quiz
Insertion Sort is in-place. Is Merge Sort also in-place?
A: yes
B: no
Analysis of Merge Sort
Merge-Sort(A)
// divide-and-conquer algorithm that sorts array A[0:n]
1. if A.length = 1 O(1)
2. then skip
3. else
4. n = A.length ; n1 = ⌊n/2⌋ ; n2 = ⌈n/2⌉ ; O(1)
copy A[0:n1] to auxiliary array A1[0:n1] O(n)
copy A[n1:n] to auxiliary array A2[0:n2] O(n)
5. Merge-Sort(A1) T(⌊n/2⌋)
6. Merge-Sort(A2) T(⌈n/2⌉)
7. Merge(A, A1, A2) O(n)
T(n) = O(1) if n = 1
T(n) = T(⌊n/2⌋) + T(⌈n/2⌉) + Θ(n) if n > 1
An in-place algorithm uses only O(1) memory, in addition to the original input
Quiz
What is the running time of the following Mixed-Sort(A) algorithm?
A: Θ(n log n)
B: Θ(n²)
Solving recurrences
Alternatively:
2. analyze recursion tree
3. substitution method: guess solution and use induction to prove that guess is correct
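Master theorem: let 𝑎 ≥ 1 and 𝑏 > 1 be constants, let 𝑓(𝑛) be a function, and let 𝑇(𝑛) = 𝑎𝑇(𝑛/𝑏) + 𝑓(𝑛).
Then we have:
1. If 𝑓(𝑛) = 𝑂(𝑛^(log_𝑏 𝑎 − 𝜀)) for some constant 𝜀 > 0, then 𝑇(𝑛) = Θ(𝑛^(log_𝑏 𝑎)).
2. If 𝑓(𝑛) = Θ(𝑛^(log_𝑏 𝑎)), then 𝑇(𝑛) = Θ(𝑛^(log_𝑏 𝑎) log 𝑛).
3. If 𝑓(𝑛) = Ω(𝑛^(log_𝑏 𝑎 + 𝜀)) for some constant 𝜀 > 0, and 𝑎𝑓(𝑛/𝑏) ≤ 𝑐𝑓(𝑛) for some constant 𝑐 < 1 and all sufficiently large 𝑛 (regularity condition), then 𝑇(𝑛) = Θ(𝑓(𝑛)).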
Case 3 of the master theorem gives 𝑇(𝑛) = Θ(𝑛³), if the regularity condition holds.
Choose 𝑐 = 1/2 and 𝑛0 = 1
➨ 𝑇(𝑛) = Θ(𝑛³)
Quiz
Recurrence                     case   solution
1. 𝑇(𝑛) = 2𝑇(𝑛/2) + Θ(𝑛²)      3      𝑇(𝑛) = Θ(𝑛²)
2. 𝑇(𝑛) = 4𝑇(𝑛/2) + Θ(𝑛)       1      𝑇(𝑛) = Θ(𝑛²)
3. 𝑇(𝑛) = 9𝑇(𝑛/3) + Θ(𝑛²)      2      𝑇(𝑛) = Θ(𝑛² log 𝑛)
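The quiz answers can be checked with a small sketch that applies the master theorem to recurrences of the restricted form T(n) = a·T(n/b) + Θ(n^d). The function name `master_case` and the restriction to polynomial f(n) are assumptions made here; for such f(n) the regularity condition of case 3 holds automatically, since a/b^d < 1 whenever d > log_b a.

```python
import math


def master_case(a, b, d):
    """Classify T(n) = a*T(n/b) + Theta(n^d); return (case, solution)."""
    crit = math.log(a, b)               # critical exponent log_b a
    if math.isclose(d, crit):           # tolerate floating-point rounding
        return 2, f"Theta(n^{d:g} log n)"
    if d < crit:
        return 1, f"Theta(n^{crit:g})"
    return 3, f"Theta(n^{d:g})"
```

For instance, `master_case(9, 3, 2)` reports case 2, matching the third quiz line.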
Expanding a recurrence
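Solve 𝑇(𝑛) = 𝑇(𝑛−1) + Θ(1):
Expand: 𝑇(𝑛) = 𝑇(𝑛−1) + Θ(1), with 𝑇(𝑛−1) = 𝑇(𝑛−2) + Θ(1), 𝑇(𝑛−2) = 𝑇(𝑛−3) + Θ(1), …
Then use a (meaningful) base case and derive the general relation:
𝑇(𝑛) = 𝑇(1) + Θ(1) + ⋯ + Θ(1) (with 𝑛 − 1 terms Θ(1)) = Θ(1) + (𝑛 − 1) Θ(1) = 𝑛 Θ(1) = Θ(𝑛)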
Expanding a recurrence
Solve 𝑇(𝑛) = 𝑇(𝑛/2) + Θ(1)? (e.g., Binary Search)
This gets a bit messy, better way to expand this: a recursion tree.
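To see where the recurrence T(n) = T(n/2) + Θ(1) comes from, here is a minimal sketch of Binary Search that also counts its loop iterations. The name `binary_search` and the step counter are additions made here for illustration. Each iteration does constant work and at least halves the search range, so the number of iterations is at most ⌊log₂ n⌋ + 1.

```python
import math


def binary_search(a, target):
    """Search the sorted list a; return (index or -1, number of iterations)."""
    lo, hi = 0, len(a)                  # search range is a[lo:hi]
    steps = 0
    while lo < hi:
        steps += 1
        mid = (lo + hi) // 2
        if a[mid] == target:
            return mid, steps
        if a[mid] < target:
            lo = mid + 1                # continue in the right half
        else:
            hi = mid                    # continue in the left half
    return -1, steps


a = list(range(1024))
index, steps = binary_search(a, 1023)
```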
Quiz
What does 𝑇(𝑛) = 𝑇(𝑛/2) + 𝑛 solve to?
A: Θ(log 𝑛)
B: Θ(𝑛)
C: Θ(𝑛 log 𝑛)
Recursion Trees
𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑛
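[Figure: recursion trees. For 𝑇(𝑛) = 2𝑇(𝑛/2) + 𝑛: root cost 𝑛 with two children of cost 𝑛/2 each. For a recurrence with cost 𝑛² per call: root cost 𝑛² with two children of cost (𝑛/2)² each.]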
1. If 𝑓(𝑛) = 𝑂(𝑛^(log_𝑏 𝑎 − 𝜀)) for some constant 𝜀 > 0, then 𝑇(𝑛) = Θ(𝑛^(log_𝑏 𝑎)).
bottom level of recursion tree dominates
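A recursion tree can be summarized by the total cost on each level. The sketch below computes these level sums for T(n) = a·T(n/b) + f(n), assuming n is a power of b and that a subproblem of size 1 costs f(1); the helper name `level_costs` is hypothetical.

```python
def level_costs(n, a, b, f):
    """Return the total cost of each level of the recursion tree, top to bottom."""
    costs = []
    size, count = n, 1                  # subproblem size and number of nodes
    while size >= 1:
        costs.append(count * f(size))
        if size == 1:
            break
        size //= b                      # every child is smaller by a factor b
        count *= a                      # every node has a children
    return costs


linear = level_costs(8, 2, 2, lambda m: m)          # T(n) = 2T(n/2) + n
quadratic = level_costs(8, 2, 2, lambda m: m * m)   # T(n) = 2T(n/2) + n^2
```

With f(n) = n every level costs the same, so the total is the level cost times the number of levels. With f(n) = n² the level costs shrink geometrically and the top level dominates.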
How to guess a solution for the substitution method: expand the recursion (backward substitution)
The substitution method
Claim: 𝑇(𝑛) = 𝑂(𝑛 log 𝑛) for the recurrence 𝑇(𝑛) = 2𝑇(⌊𝑛/2⌋) + 𝑛 with 𝑇(1) = 2
Proof: by induction on 𝑛
to show: there are constants 𝑐 and 𝑛0 such that
𝑇(𝑛) ≤ 𝑐 𝑛 log 𝑛 for all 𝑛 ≥ 𝑛0
𝑛 = 1: 𝑇(1) = 2 > 0 = 𝑐 ∙ 1 ∙ log 1 for every 𝑐, so 𝑛 = 1 cannot be a base case ➨ 𝑛0 = 2
𝑛 = 𝑛0 = 2 is a base case
Need more base cases? ⌊3/2⌋ = 1, ⌊4/2⌋ = 2 ➨ 3 must also be base case
Base cases:
𝑛 = 2: 𝑇(2) = 2𝑇(1) + 2 = 2 ∙ 2 + 2 = 6 = 𝑐 2 log 2 for 𝑐 = 3
𝑛 = 3: 𝑇(3) = 2𝑇(1) + 3 = 2 ∙ 2 + 3 = 7 ≤ 𝑐 3 log 3
choose 𝑐 = 3 and 𝑛0 = 2
Inductive step: 𝑛 > 3
𝑇(𝑛) = 2𝑇(⌊𝑛/2⌋) + 𝑛
≤ 2 𝑐 ⌊𝑛/2⌋ log ⌊𝑛/2⌋ + 𝑛 (induction hypothesis)
≤ 𝑐 𝑛 ((log 𝑛) − 1) + 𝑛 (since ⌊𝑛/2⌋ ≤ 𝑛/2 and log(𝑛/2) = (log 𝑛) − 1)
= 𝑐 𝑛 log 𝑛 − 𝑐 𝑛 + 𝑛
≤ 𝑐 𝑛 log 𝑛 (since 𝑐 ≥ 1) ■
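The bound from the substitution proof can be sanity-checked numerically. The sketch below assumes the recurrence T(n) = 2T(⌊n/2⌋) + n with T(1) = 2, as used in the base cases above, and tests T(n) ≤ 3 n log₂ n for a finite range of n. Such a check does not replace the induction proof; it only helps catch a wrong guess early.

```python
import math
from functools import lru_cache


@lru_cache(maxsize=None)
def T(n):
    """Evaluate T(n) = 2*T(floor(n/2)) + n with T(1) = 2."""
    if n == 1:
        return 2
    return 2 * T(n // 2) + n


c, n0 = 3, 2
bound_holds = all(T(n) <= c * n * math.log2(n) for n in range(n0, 10_000))
```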
Tips for analyzing running time
Analysis of recursive algorithms:
find the recursion and solve with master theorem if possible
Σ_{𝑖=1}^{𝑛} 𝑖 = (1/2) 𝑛(𝑛 + 1) = Θ(𝑛²)
Σ_{𝑖=1}^{𝑛} 𝑖² = (1/6) 𝑛(𝑛 + 1)(2𝑛 + 1) = Θ(𝑛³)