SWEN 5012
ADVANCED ALGORITHMS
AND PROBLEM SOLVING
LECTURE 2 SORTING
Beakal Gizachew
Based on slides of Serdar Taşıran, Shafi Goldwasser, Erik Demaine, and Alptekin Küpçü
SORTING
• Sorting
• Input: a sequence of n numbers <a1,a2,…,an>
• Output: a re-ordering <a′1, a′2, …, a′n> of the sequence such that
a′1 ≤ a′2 ≤ … ≤ a′n
• Example:
• Input: 8 2 4 9 3 6
• Output: 2 3 4 6 8 9
• Called an instance of the problem
• Check these demos:
• http://www.iti.fh-flensburg.de/lang/algorithmen/sortieren/sortcontest/sortcontest.htm
• http://www.cs.oswego.edu/~mohammad/classes/csc241/samples/sort/Sort2-E.html
• http://www.cs.ubc.ca/~harrison/Java/sorting-demo.html
• http://www.cs.pitt.edu/~kirk/cs1501/animations/Sort3.html
2
INSERTION SORT
• Takes array A[1..n] containing a sequence of length n to be sorted
• Sorts the array in place
• Numbers re-arranged inside A with at most a constant number of
them stored outside.
• O(1) extra space
3
INSERTION SORT
INSERTION-SORT (A, n)  ⊳ A[1 . . n]
    for j ← 2 to n
        do key ← A[j]
           i ← j − 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i − 1
           A[i+1] ← key
[Diagram: A[1 . . j−1] is the sorted prefix; key = A[j] is inserted into it]
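The pseudocode translates almost line for line into Python; a minimal in-place sketch (0-indexed, so j runs from 1):

```python
def insertion_sort(a):
    """Sort list a in place with O(1) extra space, as in the pseudocode."""
    for j in range(1, len(a)):          # pseudocode: for j <- 2 to n
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:    # shift larger elements one slot right
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key                  # drop key into its place in the sorted prefix
    return a

print(insertion_sort([8, 2, 4, 9, 3, 6]))  # [2, 3, 4, 6, 8, 9]
```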
4
EXAMPLE OF INSERTION SORT
8 2 4 9 3 6
2 8 4 9 3 6
2 4 8 9 3 6
2 4 8 9 3 6
2 3 4 8 9 6
2 3 4 6 8 9 done
5
CORRECTNESS OF INSERTION SORT
INSERTION-SORT (A, n)  ⊳ A[1 . . n]
    for j ← 2 to n
        do key ← A[j]
           i ← j − 1
           while i > 0 and A[i] > key
               do A[i+1] ← A[i]
                  i ← i − 1
           A[i+1] ← key
• Loop invariant:
• At the start of each iteration, the subarray A[1..j-1] contains the
elements originally in A[1..j-1] but in sorted (increasing) order
• Prove initialization, maintenance, termination
• The invariant must imply an interesting property about the algorithm
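The invariant can also be checked mechanically. A sketch (the function name and instrumentation are mine) that asserts the invariant at the start of every iteration:

```python
def insertion_sort_checked(a):
    """Insertion sort that asserts the loop invariant at the start of each
    iteration j: a[0..j-1] is a sorted permutation of the original a[0..j-1]."""
    original = list(a)
    for j in range(1, len(a)):
        # Invariant: the prefix is sorted AND holds exactly the original
        # first j elements (sorted() of the original prefix captures both).
        assert a[:j] == sorted(original[:j])
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
        a[i + 1] = key
    return a

print(insertion_sort_checked([8, 2, 4, 9, 3, 6]))  # [2, 3, 4, 6, 8, 9]
```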
6
RUNNING TIME
• The running time of insertion sort depends on the input: e.g., an
already sorted sequence is easier to sort.
• Major Simplifying Convention: Parameterize the running time by
the size of the input, since short sequences are easier to sort than
long ones.
➢ TA(n) = time of algorithm A when run on inputs of length n
➢ n: input size (in general, the number of bits required to encode the
input; for sorting, the length of the sequence)
➢ Ignore machine-dependent constants
➢ Look at growth of T(n) as n → ∞
➢ For small inputs, the algorithm will run fast anyway. What matters is
how large an input the algorithm can still handle in a reasonable
amount of time.
7
KINDS OF ANALYSES
• Worst-case: (mostly-used)
• T(n) = maximum time of algorithm on any input of size n.
• Generally, we seek upper bounds on the running time, to have a
guarantee of performance.
• Average-case: (sometimes used)
• T(n) = expected time of algorithm over all inputs of size n.
• Need assumption of statistical distribution of real inputs.
• Best-case: (almost never used)
• T(n) = best possible time of algorithm over any input of size n.
• Cheat with a slow algorithm that works fast on some input.
• May be useful when proving negative results
• e.g., Even the best case of algorithm X is slower than the worst
case of algorithm Y.
8
RUNNING-TIME OF INSERTION SORT
INSERTION-SORT (A, n)  ⊳ A[1 . . n]          COST   TIMES
    for j ← 2 to n                            c1     n
        do key ← A[j]                         c2     n − 1
           i ← j − 1                          c3     n − 1
           while i > 0 and A[i] > key         c4     Σ_{j=2}^{n} t_j
               do A[i+1] ← A[i]               c5     Σ_{j=2}^{n} (t_j − 1)
                  i ← i − 1                   c6     Σ_{j=2}^{n} (t_j − 1)
           A[i+1] ← key                       c7     n − 1
Let T(n) = running time of INSERTION-SORT, where t_j is the number of
times the while-loop test executes for that value of j.
T(n) = c1·n + c2(n − 1) + c3(n − 1) + c4 Σ_{j=2}^{n} t_j + c5 Σ_{j=2}^{n} (t_j − 1) + c6 Σ_{j=2}^{n} (t_j − 1) + c7(n − 1)
The running time depends on the values of t_j. These vary according to the input.
9
INSERTION SORT ANALYSIS
Worst case: input reverse sorted.
T(n) = Σ_{j=2}^{n} Θ(j) = Θ(n²)   [arithmetic series]
Average case: all permutations equally likely.
T(n) = Σ_{j=2}^{n} Θ(j/2) = Θ(n²)
Best case: input already sorted. T(n) = Θ(n)
Is insertion sort a fast sorting algorithm?
• Moderately so, for small n.
• Not at all, for large n.
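The worst and best cases can be seen concretely by counting element shifts; a sketch (the counter and function name are mine):

```python
def insertion_sort_shifts(a):
    """Run insertion sort on a and return the number of element shifts."""
    shifts = 0
    for j in range(1, len(a)):
        key = a[j]
        i = j - 1
        while i >= 0 and a[i] > key:
            a[i + 1] = a[i]
            i -= 1
            shifts += 1
        a[i + 1] = key
    return shifts

n = 100
# Reverse-sorted input: every pair is out of order, so n(n-1)/2 shifts.
assert insertion_sort_shifts(list(range(n, 0, -1))) == n * (n - 1) // 2
# Already-sorted input: the while loop never fires.
assert insertion_sort_shifts(list(range(n))) == 0
```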
10
HOW TO DESIGN ALGORITHMS?
• We’ll see many example styles throughout the semester
• Insertion sort was an example of an “incremental algorithm”
• Another common paradigm: Divide and Conquer
• Divide into simpler/smaller sub-problems
• Solve (conquer) sub-problems recursively
• Combine results of sub-problems
11
MERGE SORT
MERGE-SORT A[1 . . n]
If n = 1, return
Recursively sort
A[ 1 . . n/2 ] and A[ n/2+1 . . n ]
Merge the two sorted lists
Key subroutine: MERGE
12
MERGE SORT EXAMPLE
[Figure: merge sort example, omitted in this transcript]
13
MERGING TWO SORTED ARRAYS
[Animation frames: merging the sorted arrays 2 7 13 20 and 1 9 11 12,
shown smallest-first. At each step, compare the two smallest remaining
(head) elements and output the smaller, producing 1 2 7 9 11 12.]
Time = O(n) to merge a total of n
elements (linear time).
26
ANALYZING MERGE SORT
T(n) MERGE-SORT A[1 . . n]
O(1) If n = 1, return
2T(n/2) Recursively sort
A[ 1 . . n/2 ] and A[ n/2+1 . . n ]
O(n) Merge the two sorted lists
Sloppiness: Should be T(⌈n/2⌉) + T(⌊n/2⌋), but it
turns out this does not matter asymptotically.
27
RECURRENCE FOR MERGE SORT
T(n) = O(1)             if n = 1
T(n) = 2T(n/2) + O(n)   if n > 1
Note: Usually the base case T(1) = O(1) is omitted, since constant-size
inputs take constant time and it does not affect the asymptotic solution.
28
RECURSION TREE
Solve T(n) = 2T(n/2) + cn, where c > 0 is constant.
[Animation: T(n) expands into a binary tree. The root costs cn, its two
children cost cn/2 each, the next level cn/4 each, and so on down to
n leaves of cost O(1).]
Each level of the tree sums to cn.
h = log n levels
#leaves = n, contributing O(n)
Total = O(n log n)
33
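The recursion-tree result can be sanity-checked numerically: with the base case T(1) = 0, the recurrence T(n) = 2T(n/2) + n solves exactly to n·log₂n for powers of two. A sketch:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def T(n):
    """Recurrence T(n) = 2*T(n/2) + n with T(1) = 0, for n a power of 2."""
    if n == 1:
        return 0
    return 2 * T(n // 2) + n

# Every power of two n = 2**k matches n * log2(n) = n * k exactly.
for k in range(1, 11):
    n = 2 ** k
    assert T(n) == n * k

print(T(1024))  # 10240, i.e. 1024 * log2(1024)
```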
CONCLUSIONS
• O(n log n) grows more slowly than O(n2).
• Therefore, merge sort asymptotically beats insertion sort in the
worst case.
• In practice, merge sort beats insertion sort for n > 30 or so.
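The practical comparison can be probed by counting key comparisons instead of wall-clock time; a sketch (the function names and the choice n = 1000 are mine). On a random permutation, insertion sort performs on the order of n²/4 comparisons while merge sort performs on the order of n·log₂n:

```python
import random

def insertion_comparisons(a):
    """Sort a copy of a by insertion sort; return the number of key comparisons."""
    a, count = list(a), 0
    for j in range(1, len(a)):
        key, i = a[j], j - 1
        while i >= 0:
            count += 1                  # one evaluation of a[i] > key
            if a[i] > key:
                a[i + 1] = a[i]
                i -= 1
            else:
                break
        a[i + 1] = key
    return count

def merge_comparisons(a):
    """Return (comparison count, sorted list) for merge sort."""
    if len(a) <= 1:
        return 0, list(a)
    mid = len(a) // 2
    cl, left = merge_comparisons(a[:mid])
    cr, right = merge_comparisons(a[mid:])
    out, i, j, c = [], 0, 0, 0
    while i < len(left) and j < len(right):
        c += 1
        if left[i] <= right[j]:
            out.append(left[i]); i += 1
        else:
            out.append(right[j]); j += 1
    out.extend(left[i:]); out.extend(right[j:])
    return cl + cr + c, out

random.seed(0)
data = [random.random() for _ in range(1000)]
ins = insertion_comparisons(data)
mrg, _ = merge_comparisons(data)
print(ins, mrg)  # merge sort uses far fewer comparisons at this size
```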
34
MASTER THEOREM
• Let T(n) = a T(n/b) + f(n) where
• a ≥ 1, b > 1 are constants
• f(n) is an asymptotically positive function
• Then, T(n) can be bounded asymptotically as follows
• If f(n) = O(n^(log_b a − ε)) for some ε > 0, then T(n) = Θ(n^(log_b a))
• If f(n) = Θ(n^(log_b a)), then T(n) = Θ(n^(log_b a) · log n)
• If f(n) = Ω(n^(log_b a + ε)) for some ε > 0, and if, for all sufficiently large n
and some constant c < 1, we have a·f(n/b) ≤ c·f(n), then T(n) = Θ(f(n))
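As a check, applying the theorem to the merge-sort recurrence from earlier in the lecture:

```latex
% Merge sort: T(n) = 2T(n/2) + \Theta(n), so a = 2, b = 2, f(n) = \Theta(n).
% Here n^{\log_b a} = n^{\log_2 2} = n, and f(n) = \Theta(n) matches case 2.
\[
  T(n) = 2\,T(n/2) + \Theta(n)
  \;\Longrightarrow\;
  f(n) = \Theta\!\left(n^{\log_2 2}\right)
  \;\Longrightarrow\;
  T(n) = \Theta\!\left(n^{\log_2 2}\log n\right) = \Theta(n \log n),
\]

which agrees with the recursion-tree analysis.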
35