0% found this document useful (0 votes)
21 views

Sorting Handout

Uploaded by

Vector Tran
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
21 views

Sorting Handout

Uploaded by

Vector Tran
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 43

Internal Sorting

Department of Computer Science

HCMC University of Technology, Viet Nam

04, 2014
Outline
Introduction

Sorting is one of the most important concepts and


common applications
Classifying
• Internal Sort: all data in primary memory
• External Sort: big data (not fitted in primary memory)

Properties
• Stability: data with equal key maintain their relative
input order in the output
• Eficiency: number of comparisons and number of
moves
Insertion Sort

• Divide the list into two parts: sorted and unsorted


• For each step, insert the first element in the unsorted
part into suitable position the sorted part

Sorted part Unsorted part


Insertion Sort

• Divide the list into two parts: sorted and unsorted


• For each step, insert the first element in the unsorted
part into suitable position the sorted part

Sorted part Unsorted part


Insertion Sort

• Divide the list into two parts: sorted and unsorted


• For each step, insert the first element in the unsorted
part into suitable position the sorted part

Sorted part Unsorted part


Example

9 9 8 6 6 6 6 6
12 12 9 8 8 8 8 8
8 8 12 9 9 9 9 9
6 6 6 12 12 12 12 12
22 22 22 22 22 15 15 15
15 15 15 15 15 22 22 22
30 30 30 30 30 30 30 25
25 25 25 25 25 25 25 30
Insertion Sorting Implementation

Input: Unsorted array arr


Output: Sorted array arr
1 for (i = 1; i < n ; i++) {
2 tmp = arr[i];
3 for (j = i-1;
4 j >= 0 && tmp < arr[j];
5 j--)
6 arr[j+1]=arr[j];
7 arr[j+1]=tmp;
8 }
Selection Sort

• Divide the list into two parts: sorted and unsorted


• For each step, select the smallest/largest element in
the unsorted part and place it to the sorted part

Sorted part Unsorted part


Selection Sort

• Divide the list into two parts: sorted and unsorted


• For each step, select the smallest/largest element in
the unsorted part and place it to the sorted part

smallest

Sorted part Unsorted part


Selection Sort

• Divide the list into two parts: sorted and unsorted


• For each step, select the smallest/largest element in
the unsorted part and place it to the sorted part

Sorted part Unsorted part


Example

9 6 6 6 6 6 6 6
12 12 8 8 8 8 8 8
8 8 12 9 9 9 9 9
6 9 9 12 12 12 12 12
22 22 22 22 22 15 15 15
15 15 15 15 15 22 22 22
30 30 30 30 30 30 30 25
25 25 25 25 25 25 25 30
Selection Sorting Implementation

Input: Unsorted array arr


Output: Sorted array arr
1 for (i=0 ; i<n-1 ; i++) {
2 lowindex = i;
3 for (j=i+1 ; j<n ; j++)
4 if (arr[j] < arr[lowindex])
5 lowindex = j;
6 swap(arr,i,lowindex);
7 }
Bubble Sort

• Divide the list into two parts: sorted and unsorted


• For each step, the smallest/largest in the unsorted
part is bubbled toward the sorted part

Sorted part Unsorted part


Bubble Sort

• Divide the list into two parts: sorted and unsorted


• For each step, the smallest/largest in the unsorted
part is bubbled toward the sorted part

Sorted part Unsorted part


Bubble Sort

• Divide the list into two parts: sorted and unsorted


• For each step, the smallest/largest in the unsorted
part is bubbled toward the sorted part

Sorted part Unsorted part


Example

9 6 6 6
12 9 8 8
8 12 9 9
6 8 12 12
22 15 15 15
15 22 22 22
30 25 25 25
25 30 30 30
Shell Sort

• Invented by Donald L.Shell (1995)


• Also called diminishing-increment sort
• Divide data into K segments (or increment)
• These segments are dispersed through out the data

0 1 2 3 4 5 6 7 8 9

k=3

0 0+k 0+2k 0+3k


Segment 1

1 1+k 1+2k
Segment 2

2 2+k 2+2k
Segment 3
Shell Sorting Algorithm
• In each iteration, sort K segments using insertion sort
• Reduce K after each iteration until K == 1
Step 0 k=5 Sorted Seg. Combine

24 24 6 6
12 12 5 5
14 14 2 2
7 7 1 1
22 22 9 9
6 6 8 8
5 5 12 12
16 16 14 14
34 34 7 7
9 9 22 22
8 8 24 24
62 62 62 62
2 2 16 16
1 1 34 34

Step 1 k=3 Sorted Seg. Combine


Shell Sorting Implementation

1 for (k=first_incremental_value;
2 k>=1;
3 k=next_incremental_value)
4 for (segment=0;
5 segment<k;
6 segment++)
7 segmentSort(segment,k);
segmentSort(int segment,int k)

1 for (current=segment+k;
2 current<size;
3 current=current+k) {
4 tmp = data[current];
5 for (walker=current-k;
6 walker>=0
7 && tmp < data[walker];
8 walker=walker-k)
9 data[walker+k]=data[walker];
10 data[walker+k]=tmp;
11 }
Shell Sort Discussion

• incremental values should not be multiple of each


other
• (2k+1): 1, 3, 7, 15, 31,. . .
• (3k+1): 1, 4, 13, 40, . . .
• Time complexity through experiments: O(n1.25 )
Heap Sort

• Build a heap for data


• For each step, take the root of the heap and put it in
the sorted part

8 4

12 10 7 6
Heap Sort

• Build a heap for data


• For each step, take the root of the heap and put it in
the sorted part

8 7 2 4

12 10
Heap Sort

• Build a heap for data


• For each step, take the root of the heap and put it in
the sorted part

8 10 2 4 6

12
Example

heap swap reheap

12 6 10 7 8 4 7 2 6
10 10 6 6 6 6 6 6 2
8 8 8 8 7 7 4 4 4
2 2 2 2 2 2 2 7 7
4 4 4 4 4 8 8 8 8
7 7 7 10 10 10 10 10 10
6 12 12 12 12 12 12 12 12
Divide-And-Conquer Meta Algorithm

1 Partition data into many parts


2 for (each part)
3 Divide-And-Conquer on each part
4 Combine many resulted parts
Merge Sort

8 4 12 6 33 42 16 7

• Partition the array into two parts

8 4 12 6 33 42 16 7

• Sort two parts (using rescursive or iterative)

4 6 8 12 7 16 33 42

• Merge two ordered parts

4 6 7 8 12 16 33 42
Quick Sort
8 4 12 6 33 42 16 7 5

• Based on the pivot, partition the array into three


parts: less than, pivot, and greater than or equal to.

4 6 7 5 8 12 33 42 16

• Sort the first and the last parts (using rescursive or


iterative)

4 5 6 7 8 12 16 33 42

• Append three ordered parts

4 5 6 7 8 12 16 33 42
Pivot Selection

• C. A. Hoare (1962): the first element


• Simple
• Unbalanced parts
• R. C. Singleton (1969): the median of the first, last
and the middle elements
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }

left right
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }

< >= < >=

left right
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }

< >= < >=

left right
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }

< < >= >=

left right
Partition

1 i n t p a r t i t i o n ( i n t key [ ] , i n t l e f t , i n t r i g h t , i n t p i v o t ) {
2 do {
3 while ( key [++ l e f t ] < p i v o t ) ;
4 while ( ( l e f t < r i g h t ) && key[−− r i g h t ] >= p i v o t ) ;
5 myswap ( key , l e f t , r i g h t ) ;
6 } while ( l e f t < r i g h t ) ;
7 return l e f t ;
8 }

< >= >=

left right
Radix-Exchange Sort

7 4 1 6 3 2 5 0
1 11
111 100
1 00 011
0 11 110
1 10 011
0 11 010
0 10 101
1 01 000
0 00

• Based on the bit representation of the keys


• For each step, based on the corresponding bit,
partition the array into two parts: bit == 1 and bit == 0.
1 3 2 0 7 4 6 5
0 01 0 11 0 10 0 00 1 11 1 00 1 10 1 01

• Sort these two parts (using rescursive on the next


bits or iterative)
0 1 2 3 4 5 6 7

• Append two ordered parts

0 1 2 3 4 5 6 7
Example

7 111 011 001 000 0


4 100 000 000 001 1
2 010 010 010 010 2
6 110 001 011 011 3
5 101 101 101 100 4
1 001 110 100 101 5
0 000 100 110 110 6
3 011 111 111 111 7
Radix-Exchange Sorting Implementation

1 void radixSort(int[] keys,int l, int r,


int mask)
2 int i = l,j = r;
3 if (r<=l mask==0) return;
4 while (j!=i) {
5 while ((a[i] & mask==0) && (i<j)) i++;
6 while ((a[j] & mask==1) && (j>i)) j--;
7 swap(a,i,j);
8 }
9 radixSort(a,l,j-1,mask>>1);
10 radixSort(a,j,r,mask>>1);
11 }
Quick Sort vs. Radix-Exchange Sort

Similarities
• partition array
• sort subarray recursively

Differences
• Partitioning Method
• RE partitions array based on the bit at corresponding
position
• Q partitions array based on the pivot value
• Time complexity
• RE: O(bn)
• Q: O(nlog2 n)
Empirical Comparison

Sort 10 100 1K 10K 100K 1M Up Down


Insertion .00023 .007 0.66 64.98 7381.0 674420 0.04 129.05
Bubble .00035 .020 2.25 277.94 27691.0 2820680 70.64 108.69
Selection .00039 .012 0.69 72.47 7356.0 780000 69.76 69.58
Shell .00034 .008 0.14 1.99 30.2 554 0.44 0.79
Shell/O .00034 .008 0.12 1.91 29.0 530 0.36 0.64
Merge .00050 .010 0.12 1.61 19.3 219 0.83 0.79
Merge/O .00024 .007 0.10 1.31 17.2 197 0.47 0.66
Quick .00048 .008 0.11 1.37 15.7 162 0.37 0.40
Quick/O .00031 .006 0.09 1.14 13.6 143 0.32 0.36
Heap .00050 .011 0.16 2.08 26.7 391 1.57 1.56
Heap/O .00033 .007 0.11 1.61 20.8 334 1.01 1.04
Radix/4 .00838 .081 0.79 7.99 79.9 808 7.97 7.97
Radix/8 .00799 .044 0.40 3.99 40.0 404 4.00 3.99

Table: Running time in miliseconds


Summary

• Internal Sort requires all elements available on the


memory
• Insertion, Selection and Bubble Sort are simple but
bad performance
• Shell, Merge, Quick and Heap Sort are more complex
but good performance
• Radix Sort is not based on the value of keys but on
their radix

You might also like