CSCE 3110 Data Structures & Algorithm Analysis: Rada Mihalcea Sorting (II) Reading: Chap.7, Weiss
Today
Quick Review
Divide and Conquer paradigm Merge Sort Quick Sort
Divide-and-Conquer
Divide and Conquer is a method of algorithm design. This method has three distinct steps: Divide: If the input size is too large to deal with in a straightforward manner, divide the data into two or more disjoint subsets. Recur: Use divide and conquer to solve the subproblems associated with the data subsets. Conquer: Take the solutions to the subproblems and merge these solutions into a solution for the original problem.
Merge-Sort
Algorithm:
Divide: If S has at least two elements (nothing needs to be done if S has zero or one elements), remove all the elements from S and put them into two sequences, S1 and S2, each containing about half of the elements of S (i.e., S1 contains the first n/2 elements and S2 contains the remaining n/2 elements). Recur: Recursively sort sequences S1 and S2. Conquer: Put the elements back into S by merging the sorted sequences S1 and S2 into a single sorted sequence.
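The three steps above can be sketched in Python (a minimal illustration, not the textbook's code):

```python
def merge_sort(s):
    """Sort list s with merge sort; returns a new sorted list."""
    if len(s) <= 1:               # base case: zero or one element
        return s
    mid = len(s) // 2
    s1 = merge_sort(s[:mid])      # divide and recur on first half
    s2 = merge_sort(s[mid:])      # divide and recur on second half
    # conquer: merge the two sorted halves
    merged, i, j = [], 0, 0
    while i < len(s1) and j < len(s2):
        if s1[i] <= s2[j]:
            merged.append(s1[i]); i += 1
        else:
            merged.append(s2[j]); j += 1
    merged.extend(s1[i:])         # one of these is already empty
    merged.extend(s2[j:])
    return merged
```

Each level of the recursion does O(n) merging work across O(log n) levels, giving the O(n log n) bound.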
Merge-Sort Example
Quick-Sort
1) Divide: If the sequence S has 2 or more elements, select an element x from S to be your pivot. Any arbitrary element, like the last, will do. Remove all the elements of S and divide them into 3 sequences:
L, holds S's elements less than x E, holds S's elements equal to x G, holds S's elements greater than x
2) Recurse: Recursively sort L and G 3) Conquer: Finally, to put the elements back into S in order, first insert the elements of L, then those of E, and finally those of G.
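The L/E/G scheme above translates directly into a short Python sketch (illustrative only; the in-place version below avoids the extra sequences):

```python
def quick_sort(s):
    """Three-way quicksort; returns a new sorted list."""
    if len(s) < 2:                       # 0 or 1 elements: already sorted
        return s
    x = s[-1]                            # pivot: the last element
    L = [e for e in s if e < x]          # elements less than the pivot
    E = [e for e in s if e == x]         # elements equal to the pivot
    G = [e for e in s if e > x]          # elements greater than the pivot
    return quick_sort(L) + E + quick_sort(G)
```

Note that E needs no recursive call: all of its elements are equal, so it is already sorted.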
2) Divide: rearrange the elements so that the pivot x goes to its final position 3) Recurse and Conquer: recursively sort the subsequences to the left and right of x
Quick-Sort Tree
In-Place Quick-Sort
Divide step: index l scans the sequence from the left, and index r scans from the right.
A swap is performed when l is at an element larger than the pivot and r is at one smaller than the pivot.
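The two-index scan can be sketched as follows (a common formulation with the last element as pivot; an assumption, since the slides give no code):

```python
def inplace_quick_sort(s, a=0, b=None):
    """Sort s[a..b] in place using two-index partitioning."""
    if b is None:
        b = len(s) - 1
    if a >= b:                               # 0 or 1 elements: done
        return
    pivot = s[b]                             # pivot: last element
    l, r = a, b - 1
    while l <= r:
        while l <= r and s[l] < pivot:       # l scans right for an element >= pivot
            l += 1
        while l <= r and s[r] > pivot:       # r scans left for an element <= pivot
            r -= 1
        if l <= r:                           # swap the out-of-place pair
            s[l], s[r] = s[r], s[l]
            l += 1
            r -= 1
    s[l], s[b] = s[b], s[l]                  # move pivot to its final position
    inplace_quick_sort(s, a, l - 1)          # recurse on the two sides
    inplace_quick_sort(s, l + 1, b)
```

The scans stop when l and r cross; swapping s[l] with the pivot then puts the pivot between the smaller and larger elements.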
General case:
Time spent at level i of the tree is O(n) Running time: O(n) * O(height) Worst case: height is O(n), giving O(n^2)
Average case:
O(n log n)
Bucket Sort
Bucket sort
Assumption: the keys are in the range [0, N) Basic idea:
1. Create N linked lists (buckets), dividing the interval [0,N) into subintervals of size 1 2. Add each input element to the appropriate bucket 3. Concatenate the buckets
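The three steps above can be sketched in Python (Python lists standing in for the linked lists):

```python
def bucket_sort(a, N):
    """Sort a list of integer keys in the range [0, N)."""
    buckets = [[] for _ in range(N)]     # step 1: one bucket per key value
    for x in a:                          # step 2: distribute into buckets
        buckets[x].append(x)
    result = []
    for b in buckets:                    # step 3: concatenate the buckets
        result.extend(b)
    return result
```

Distribution and concatenation each touch every element or bucket once, giving the O(n + N) running time.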
Bucket Sort
Each element of the array is put in one of the N buckets
Bucket Sort
Now, pull the elements from the buckets into the array
Radix Sort
How did IBM get rich originally? Answer: punched card readers for census tabulation in early 1900s.
In particular, a card sorter that could sort cards into different bins
Each column can be punched in 12 places (Decimal digits use only 10 places!)
Radix Sort
Intuitively, you might sort on the most significant digit, then the second most significant, etc. Problem: lots of intermediate piles of cards to keep track of Key idea: sort the least significant digit first
RadixSort(A, d)
  for i = 1 to d
    StableSort(A) on digit i
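A minimal Python version of this pseudocode, using a bucket sort pass per digit as the stable sort (appending to buckets in input order keeps each pass stable):

```python
def radix_sort(a, d, base=10):
    """LSD radix sort for nonnegative integers with at most d digits."""
    for i in range(d):                         # least significant digit first
        buckets = [[] for _ in range(base)]    # one bucket per digit value
        for x in a:
            digit = (x // base ** i) % base    # extract digit i of x
            buckets[digit].append(x)           # stable: input order preserved
        a = [x for b in buckets for x in b]    # concatenate the buckets
    return a
```

After pass i the numbers are sorted by their i+1 low-order digits, which is exactly the inductive invariant argued below.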
Radix Sort
Can we prove it will work? Inductive argument:
Assume the lower-order digits {j : j < i} are sorted Show that sorting on the next digit i leaves the array correctly sorted
If two numbers differ in digit i, ordering them by that digit is correct (the lower-order digits are irrelevant) If they have the same digit i, they are already sorted on the lower-order digits; since we use a stable sort, the numbers stay in the right order
Radix Sort
What sort will we use to sort on digits? Bucket sort is a good choice:
Sort n numbers on digits that range over [0, N) Time: O(n + N)
Each pass over the n numbers handles one digit and takes time O(n + N), so the total time is O(dn + dN)
When d is constant and N = O(n), radix sort takes O(n) time
Radix Sort
In general, radix sort based on bucket sort is
Asymptotically fast (i.e., O(n)) Simple to code A good choice