CNG213 Lecture 3 - Sorting-Searching

The document discusses sorting algorithms, including selection sort, insertion sort, bubble sort, merge sort, and quicksort, detailing their processes, implementations, and complexities. It emphasizes the importance of sorting in data organization and algorithm performance, highlighting that most sorting algorithms have a time complexity of O(n^2) except for merge sort and quicksort, which are O(n log n). Additionally, it covers the analysis of each algorithm's efficiency based on best, worst, and average cases.


Sorting and Searching

CNG 213 - Data Structures


Lecture - 3
Dr. Meryem Erbilek

CNG 213 - Lecture 3 1/53


Introduction to Sorting
• Sorting is a process that organizes a collection of data into
either ascending or descending order.
– An internal sort requires that the collection of data fit entirely in the
computer’s main memory.
– We can use an external sort when the collection of data cannot fit
in the computer’s main memory all at once but must reside in
secondary storage such as on a disk.

• Any significant amount of computer output is generally arranged in some sorted order so that it can be interpreted.
• Sorting also has indirect uses: an initial sort of the data can significantly enhance the performance of an algorithm.
• The majority of programming projects use a sort somewhere, and in many cases the sorting cost determines the running time.



Sorting Algorithms

• There are many sorting algorithms, such as:


– Selection Sort
– Insertion Sort
– Bubble Sort
– Merge Sort
– Quick Sort
– Radix Sort

• The first three are the foundations for faster and more efficient
algorithms.



Selection Sort
• The list is divided into two sub-lists, sorted and unsorted, which
are divided by an imaginary wall.
• We find the smallest element from the unsorted sub-list and
swap it with the element at the beginning of the unsorted data.
• After each selection and swap, the imaginary wall between the two sub-lists moves one element ahead, increasing the number of sorted elements and decreasing the number of unsorted ones.
• Each time we move one element from the unsorted sub-list to
the sorted sub-list, we say that we have completed a sort pass.
• A list of n elements requires n-1 passes to completely rearrange
the data.



[Figure: the imaginary wall dividing the sorted and unsorted sub-lists]


[Figure: selection sort example: the list after each of the five passes; the wall between the sorted and unsorted parts advances one element per pass]


Selection Sort
void swap(int *lhs, int *rhs);   // forward declaration

void selectionSort(int a[], int n)
{
    for (int i = 0; i < n-1; i++)
    {
        int min = i;                  // index of the smallest unsorted item
        for (int j = i+1; j < n; j++)
            if (a[j] < a[min])
                min = j;
        swap(&a[i], &a[min]);         // move it to the front of the unsorted part
    }
}

void swap(int *lhs, int *rhs)
{
    int tmp = *lhs;
    *lhs = *rhs;
    *rhs = tmp;
}



Selection Sort - Analysis
• In general, a sorting algorithm that uses key comparisons compares keys and moves (or exchanges) items.
– So, to analyze such a sorting algorithm, we count the number of key comparisons and the number of moves.
– Ignoring other operations does not affect our final result.

• In the selectionSort function, the outer for loop executes n-1 times.

• We invoke the swap function once in each iteration.
– Total swaps: n-1
– Total moves: 3*(n-1) (each swap performs three moves)



Selection Sort – Analysis
• The inner for loop executes (size of the unsorted part - 1) times, and in each iteration we make one key comparison.
– # of key comparisons = 1 + 2 + ... + (n-1) = n*(n-1)/2
– So, selection sort is O(n²).

• The best case, the worst case, and the average case of the selection sort algorithm are the same ⇒ all of them are O(n²).
– This means that the behavior of the selection sort algorithm does not depend on the initial organization of the data.
– Since O(n²) grows so rapidly, the selection sort algorithm is appropriate only for small n.
– Although the selection sort algorithm requires O(n²) key comparisons, it only requires O(n) moves.
– Selection sort can be a good choice if data moves are costly but key comparisons are not (short keys, long records).

Insertion Sort
• Insertion sort is a simple sorting algorithm that is appropriate for
small inputs.
– Most common sorting technique used by card players.

• The list is divided into two parts: sorted and unsorted.

• In each pass, the first element of the unsorted part is picked up,
transferred to the sorted sub-list, and inserted at the appropriate
place.

• A list of n elements will take at most n-1 passes to sort the data.



[Figure: the sorted and unsorted parts of the list in insertion sort]


[Figure: insertion sort example: the list after each of the five passes]


Insertion Sort
void insertionSort(int a[], int n)
{
    int i, j;
    for (i = 1; i < n; i++)
    {
        int tmp = a[i];               // element to insert into the sorted part

        for (j = i; j > 0 && tmp < a[j-1]; j--)
            a[j] = a[j-1];            // shift larger elements to the right

        a[j] = tmp;
    }
}



Insertion Sort – Analysis
• Running time depends not only on the size of the array but also on the contents of the array.
• Best case: O(n)
– Array is already sorted in ascending order.
– The inner loop is not executed.
– Number of moves: 2*(n-1) ⇒ O(n)
– Number of key comparisons: n-1 ⇒ O(n)
• Worst case: O(n²)
– Array is in reverse order.
– The inner loop is executed i-1 times, for i = 2, 3, ..., n.
– Number of moves: 2*(n-1) + (1+2+...+(n-1)) = 2*(n-1) + n*(n-1)/2 ⇒ O(n²)
– Number of key comparisons: 1+2+...+(n-1) = n*(n-1)/2 ⇒ O(n²)
• Average case: O(n²)
– We have to consider all possible initial data organizations.
• So, insertion sort is O(n²).



Insertion Sort – Analysis
• Which running time will be used to characterize this algorithm?
– Best, worst or average?

• Worst:
– Longest running time (this is the upper limit for the algorithm)
– It is guaranteed that the algorithm will not be worse than this.

• Sometimes we are interested in the average case, but there are some problems with it:
– It is difficult to figure out the average case, i.e. what is an average input?
– Are we going to assume all possible inputs are equally likely?
– In fact, for most algorithms the average case is the same as the worst case.



Bubble Sort
• The list is divided into two sub-lists: sorted and unsorted.

• The smallest element is bubbled from the unsorted list and moved
to the sorted sub-list.

• After that, the wall moves one element ahead, increasing the
number of sorted elements and decreasing the number of
unsorted ones.

• Each time an element moves from the unsorted part to the sorted
part one sort pass is completed.

• Given a list of n elements, bubble sort requires up to n-1 passes to sort the data.



[Figure: the sorted and unsorted parts of the list in bubble sort]


[Figure: bubble sort example: the list after each of the four passes]


Bubble Sort
void bubbleSort(int a[], int n)
{
    int sorted = 0;
    int last = n - 1;
    for (int i = 0; (i < last) && !sorted; i++)
    {
        sorted = 1;                   // assume sorted until an exchange occurs
        for (int j = last; j > i; j--)
            if (a[j-1] > a[j])
            {
                swap(&a[j], &a[j-1]);
                sorted = 0;           // signal exchange
            }
    }
}



Bubble Sort – Analysis
• Best case: O(n)
– Array is already sorted in ascending order.
– Number of moves: 0 ⇒ O(1)
– Number of key comparisons: n-1 ⇒ O(n)
• Worst case: O(n²)
– Array is in reverse order.
– The outer loop is executed n-1 times.
– Number of moves: 3*(1+2+...+(n-1)) = 3*n*(n-1)/2 ⇒ O(n²)
– Number of key comparisons: 1+2+...+(n-1) = n*(n-1)/2 ⇒ O(n²)
• Average case: O(n²)
– We have to consider all possible initial data organizations.
• So, bubble sort is O(n²).



Merge Sort
• The merge sort algorithm is one of the two important divide-and-conquer sorting algorithms (the other one is quicksort).

• It is a recursive algorithm:
– Divide the list into halves,
– Sort each half separately, and
– Then merge the sorted halves into one sorted array.



Merge Sort - Example

[Figure: merge sort example: the list is recursively divided into halves and the sorted halves are merged]


Merge Sort – Example 2

[Figure: a second merge sort example]


Merge

[Figure: merging two sorted subarrays into one sorted array]


Merge
const int MAX_SIZE = maximum-number-of-items-in-array;

void merge(int theArray[], int first, int mid, int last)
{
    int tempArray[MAX_SIZE];     // temporary array

    int first1 = first;          // beginning of first subarray
    int last1 = mid;             // end of first subarray
    int first2 = mid + 1;        // beginning of second subarray
    int last2 = last;            // end of second subarray
    int index = first1;          // next available location in tempArray

    // copy the smaller front item until one subarray is exhausted
    for ( ; (first1 <= last1) && (first2 <= last2); ++index)
    {
        if (theArray[first1] < theArray[first2])
            { tempArray[index] = theArray[first1]; ++first1; }
        else
            { tempArray[index] = theArray[first2]; ++first2; }
    }

    // finish off the first subarray, if necessary
    for ( ; first1 <= last1; ++first1, ++index)
        tempArray[index] = theArray[first1];

    // finish off the second subarray, if necessary
    for ( ; first2 <= last2; ++first2, ++index)
        tempArray[index] = theArray[first2];

    // copy the result back into the original array
    for (index = first; index <= last; ++index)
        theArray[index] = tempArray[index];
} // end merge



Mergesort
void mergesort(int theArray[], int first, int last)
{
    if (first < last)
    {
        int mid = (first + last)/2;   // index of midpoint
        mergesort(theArray, first, mid);
        mergesort(theArray, mid+1, last);

        // merge the two sorted halves
        merge(theArray, first, mid, last);
    }
} // end mergesort



Mergesort – Analysis of Merge

• A worst-case instance of the merge step in mergesort

• The complexity of the merge operation for merging two lists of size n/2 is O(n).



Mergesort - Analysis
• The complexity of mergesort can be defined using the following
recurrence equation:
T(n) = 2T(n/2) + n
Solving this equation:
T(1) = 1
T(n) = 2 T(n/2) + n
1st subst: = 2 ( 2 T(n/4) + n/2) + n
= 4 T(n/4) + 2n
2nd subst: = 4 (2 T(n/8) + n/4) + 2n
= 8 T(n/8) + 3n
In general:      T(n) = 2^k T(n/2^k) + kn
When k = log2n:  T(n) = 2^(log2n) T(1) + n log2n = n + n log2n
⇒ O(n log2n)



Mergesort – Analysis

• Mergesort is an extremely efficient algorithm with respect to time.
– Best, worst and average cases are all O(n * log2n).

• But mergesort requires an extra array whose size equals the size of the original array.

• If we use a linked list, we do not need the extra array.
– But we need space for the links,
– And it is difficult to divide the list into halves (O(n)).



Quicksort

• Like mergesort, quicksort is also based on the divide-and-conquer paradigm.
• But it uses this technique in a somewhat opposite manner: all the hard work is done before the recursive calls.
• It works as follows:
1. First, it partitions an array into two parts using a selected
pivot element,
2. Then, it sorts the parts independently,
3. Finally, it combines the sorted sub-sequences by a simple
concatenation.



Quicksort

The quick-sort algorithm consists of the following three steps:


1. Divide: Partition the list.
– To partition the list, we first choose some element from the
list for which we hope about half the elements will come
before and half after. Call this element the pivot.
– Then we partition the elements so that all those with values
less than the pivot come in one sub-list and all those with
greater values come in another.
2. Recursion: Recursively sort the sub-lists separately.
3. Conquer: Put the sorted sub-lists together.



Partition

• Partitioning places the pivot into its correct position within the array.

• Arranging the array elements around the pivot p generates two smaller sorting problems:
– Sort the left section of the array, and sort the right section of the array.
– When these two smaller sorting problems are solved recursively, the bigger sorting problem is solved.



Partition – Choosing the pivot
• First, we have to select a pivot element among the elements of the
given array, and we put this pivot into the first location of the array
before partitioning.

• Which array item should be selected as the pivot?
– Somehow we have to select a pivot, and we hope that it gives a good partitioning.
– If the items in the array are arranged randomly, we can choose the pivot randomly.
– We can choose the first or last element as the pivot (but it may not give a good partitioning).
– We can use other techniques to select the pivot, such as median-of-three.
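
One common way to implement the median-of-three technique named above is sketched below. This is our own illustration, not the lecture's code; the name choosePivot3 is hypothetical. It reorders the first, middle and last elements so that the middle position holds their median, and returns that index as the pivot index.

```cpp
#include <algorithm>  // std::swap

// Median-of-three pivot selection (a sketch; choosePivot3 is our own name).
// Reorders a[first], a[mid], a[last] so that a[mid] holds their median,
// then returns mid as the pivot index.
int choosePivot3(int a[], int first, int last)
{
    int mid = (first + last) / 2;
    if (a[mid]  < a[first]) std::swap(a[mid],  a[first]);
    if (a[last] < a[first]) std::swap(a[last], a[first]);
    if (a[last] < a[mid])   std::swap(a[last], a[mid]);
    return mid;  // now a[first] <= a[mid] <= a[last]
}
```

A median-of-three pivot makes the O(n²) worst case far less likely on already-sorted or reverse-sorted inputs than always picking the first element.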



Partitioning

The initial state of the array:

[Figure: initially the whole array apart from the pivot in theArray[first] is the unknown region]


Partitioning (cont.)

Invariant for the partition algorithm:

[Figure: the array divided into S1 (items < pivot), S2 (items >= pivot), and the unknown region]


Partitioning (cont.)

Moving theArray[firstUnknown] into S2 by incrementing firstUnknown.


Partitioning (cont.)
Moving theArray[firstUnknown] into S1 by swapping it with theArray[lastS1+1] and by incrementing both lastS1 and firstUnknown.


Partition Function
template <class DataType>
void partition(DataType theArray[], int first, int last, int &pivotIndex)
{
    // Partitions an array for quicksort.
    // Precondition: first <= last.
    // Postcondition: Partitions theArray[first..last] such that:
    //   S1 = theArray[first..pivotIndex-1]  <  pivot
    //        theArray[pivotIndex]          == pivot
    //   S2 = theArray[pivotIndex+1..last]  >= pivot
    // Calls: choosePivot and swap.

    // place pivot in theArray[first]
    int i = choosePivot(theArray, first, last);
    swap(theArray[i], theArray[first]);
    DataType pivot = theArray[first];   // copy pivot

    // initially, everything but the pivot is in the unknown region
    int lastS1 = first;                 // index of last item in S1
    int firstUnknown = first + 1;       // index of first item in unknown

    // move one item at a time until the unknown region is empty
    for ( ; firstUnknown <= last; ++firstUnknown)
    {
        if (theArray[firstUnknown] < pivot)
            { ++lastS1; swap(theArray[firstUnknown], theArray[lastS1]); }
    }

    // place pivot in its proper position and mark its location
    swap(theArray[first], theArray[lastS1]);
    pivotIndex = lastS1;
}

Quicksort Function
template <class DataType>
void quicksort(DataType theArray[], int first, int last)
{
    // Sorts the items in an array into ascending order.
    // Precondition: theArray[first..last] is an array.
    // Postcondition: theArray[first..last] is sorted.
    // Calls: partition.
    int pivotIndex;
    if (first < last)
    {
        // create the partition: S1, pivot, S2
        partition(theArray, first, last, pivotIndex);
        // sort regions S1 and S2
        quicksort(theArray, first, pivotIndex-1);
        quicksort(theArray, pivotIndex+1, last);
    }
}



Quicksort – Analysis
• Worst case: if we always select the smallest or largest element as the pivot, we cannot divide the array into similar-sized partitions. In that case, the complexity of quicksort is given by:
– T(n) = n + T(1) + T(n-1)
– This gives O(n²) complexity.
• Best case: partitions are of equal size:
– T(n) = n + 2T(n/2) (same as mergesort)
– This gives O(n log n) complexity.
• Average case: quicksort has been proven to have O(n log n) complexity.
• It also does not need an extra array, unlike mergesort.
• Therefore, it is the most popular sorting algorithm.



Radix Sort
• The radix sort algorithm is different from the other sorting algorithms we have discussed:
– It does not use key comparisons to sort an array.

• The radix sort:
– Treats each data item as a character string.
– First groups the data items according to their rightmost character, and puts these groups into order w.r.t. this rightmost character.
– Then combines these groups into one list.
– Repeats the grouping and combining operations for all other character positions, from the rightmost to the leftmost character position.
– At the end, the sort operation is complete.



Radix Sort – Example
mom, dad, god, fat, bad, cat, mad, pat, bar, him             original list
(dad,god,bad,mad) (mom,him) (bar) (fat,cat,pat)              group strings by rightmost letter
dad,god,bad,mad,mom,him,bar,fat,cat,pat                      combine groups
(dad,bad,mad,bar,fat,cat,pat) (him) (god,mom)                group strings by middle letter
dad,bad,mad,bar,fat,cat,pat,him,god,mom                      combine groups
(bad,bar) (cat) (dad) (fat) (god) (him) (mad,mom) (pat)      group strings by first letter
bad,bar,cat,dad,fat,god,him,mad,mom,pat                      combine groups (SORTED)



Radix Sort – Example

[Figure: radix sort example]


Radix Sort - Algorithm
radixSort(inout theArray: ItemArray, in n: integer, in d: integer)
// sort n d-digit integers in the array theArray
for (j = d down to 1)
{
    Initialize 10 groups to empty
    Initialize a counter for each group to 0
    for (i = 0 through n-1)
    {
        k = jth digit of theArray[i]
        Place theArray[i] at the end of group k
        Increase kth counter by 1
    }
    Replace the items in theArray with all the items in group 0,
    followed by all the items in group 1, and so on.
}
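
The pseudocode above can be turned into a compilable sketch for non-negative decimal integers (an assumption: the slide's ItemArray is taken here to hold d-digit non-negative integers). This is our illustration, not the lecture's code:

```cpp
#include <vector>
#include <cstddef>

// LSD radix sort sketch for non-negative integers with at most d decimal
// digits, following the pseudocode above: group by each digit from the
// rightmost to the leftmost, then combine the groups in order.
void radixSort(std::vector<int>& theArray, int d)
{
    int divisor = 1;  // selects the current digit (1, 10, 100, ...)
    for (int j = 0; j < d; ++j)
    {
        std::vector<std::vector<int>> groups(10);  // one group per digit 0..9
        for (int x : theArray)
            groups[(x / divisor) % 10].push_back(x);  // stable grouping
        std::size_t index = 0;
        for (const std::vector<int>& g : groups)   // combine groups 0..9
            for (int x : g)
                theArray[index++] = x;
        divisor *= 10;
    }
}
```

Because the grouping is stable, after the pass on the leftmost digit the whole array is sorted.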



Radix Sort - Analysis
• The radix sort algorithm requires 2*n*d moves to sort n strings of d characters each.
– So, radix sort is O(n).

• Although the radix sort is O(n), it is not appropriate as a general-purpose sorting algorithm:
– Its memory requirement is d * (original size of the data), because each group must be big enough to hold the entire original data collection.
– For example, we need 27 groups to sort strings of uppercase letters.
– The radix sort is more appropriate for a linked list than for an array (we do not need the huge extra memory in that case).



Comparison of Sorting Algorithms

Algorithm        Best case     Worst case    Average case
Selection sort   O(n²)         O(n²)         O(n²)
Insertion sort   O(n)          O(n²)         O(n²)
Bubble sort      O(n)          O(n²)         O(n²)
Mergesort        O(n log n)    O(n log n)    O(n log n)
Quicksort        O(n log n)    O(n²)         O(n log n)
Radix sort       O(n)          O(n)          O(n)


Searching
• One of the most common and time-consuming operations in computer science is searching.
• It is the process used to find the location of a target among a list of objects.

• There are two basic search algorithms for arrays:
– Sequential search
– Binary search



1. Sequential Search
• It is used when the list is not ordered.
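
A minimal sketch of sequential search (our own code, since the slide's figure is not reproduced here): scan the array from the front and stop at the first match.

```cpp
// Sequential (linear) search: returns the index of target in a[0..n-1],
// or -1 if target is not present. Works on unordered arrays.
int sequentialSearch(const int a[], int n, int target)
{
    for (int i = 0; i < n; i++)
        if (a[i] == target)
            return i;   // found after i+1 comparisons
    return -1;          // not found after n comparisons
}
```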



Sequential Search – Complexity

• Sequential search is also known as linear search.
– Its complexity is O(n).

• There are variations of this approach, for example:
– Sentinel search (reduces the loop condition to a single test)
– Probability search (the array is ordered with the most probable items at the beginning of the array and the least probable ones at the end)
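
The sentinel variation can be sketched as follows (our own illustration; it assumes the caller reserves one extra slot at index n for the sentinel):

```cpp
// Sentinel search sketch: writing the target into the spare slot a[n]
// guarantees the loop terminates, so each iteration needs only one
// comparison instead of two (bounds check + key check).
int sentinelSearch(int a[], int n, int target)
{
    a[n] = target;            // plant the sentinel
    int i = 0;
    while (a[i] != target)    // single test per iteration
        i++;
    return (i < n) ? i : -1;  // i == n means only the sentinel matched
}
```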



2. Binary Search
• The binary search locates an item by repeatedly
dividing the list in half.
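
The halving strategy can be sketched as an iterative function (our own code; it assumes the array is already sorted in ascending order):

```cpp
// Iterative binary search on a sorted array: returns the index of
// target in a[0..n-1], or -1 if it is absent.
int binarySearch(const int a[], int n, int target)
{
    int first = 0;
    int last = n - 1;
    while (first <= last)
    {
        int mid = first + (last - first) / 2;  // avoids overflow of first+last
        if (a[mid] == target)
            return mid;
        else if (a[mid] < target)
            first = mid + 1;   // discard the left half
        else
            last = mid - 1;    // discard the right half
    }
    return -1;  // not found
}
```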



Binary Search Complexity

• Each comparison halves the size of the search range, so for a sorted array of n elements at most about log2n + 1 comparisons are needed.
• Therefore, binary search is O(log2n).


Binary vs. Sequential Search

• For large n, binary search (O(log2n)) is dramatically faster than sequential search (O(n)), but it requires the list to be sorted.


Summary

• Sorting is a process that organizes a collection of data into either ascending or descending order.
• There are many different sorting algorithms with different performance characteristics.
• Mergesort and quicksort are important divide-and-conquer sorting algorithms.
• Searching is the process used to find the location of a target among a list of objects.
• Different searching strategies can be used to search ordered and unordered lists.

