CS3401 Unit 1 Notes
CS3401 – ALGORITHMS
COURSE OBJECTIVES
To understand and apply the algorithm analysis techniques on searching and
sorting algorithms
To critically analyze the efficiency of graph algorithms
To understand different algorithm design techniques
To solve programming problems using state space tree
To understand the concepts behind NP Completeness, Approximation algorithms
and randomized algorithms.
UNIT I INTRODUCTION
Algorithm analysis: Time and space complexity - Asymptotic Notations and its properties - Best case, worst case and average case analysis - Recurrence relation: substitution method - Lower bounds - Searching: linear search, binary search and interpolation search - Pattern search: the naïve string-matching algorithm - Rabin-Karp algorithm - Knuth-Morris-Pratt algorithm - Sorting: insertion sort - heap sort.
Divide and Conquer methodology: Finding maximum and minimum - Merge sort - Quick sort. Dynamic programming: Elements of dynamic programming - Matrix-chain multiplication - Multistage graph - Optimal Binary Search Trees. Greedy Technique: Elements of the greedy strategy - Activity-selection problem - Optimal merge pattern - Huffman Trees.
45 PERIODS
COURSE OUTCOMES: Upon completion of the course, the students will be able to
CO1: Analyze the efficiency of algorithms using various frameworks
CO2: Apply graph algorithms to solve problems and analyze their efficiency.
CO3: Make use of algorithm design techniques like divide and conquer, dynamic
programming and greedy techniques to solve problems
CO4: Use the state space tree method for solving problems.
CO5: Solve problems using approximation algorithms and randomized algorithms
TEXT BOOKS:
1. Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest and Clifford Stein, "Introduction to Algorithms", 3rd Edition, Prentice Hall of India, 2009.
2. Ellis Horowitz, Sartaj Sahni, Sanguthevar Rajasekaran, "Computer Algorithms/C++", 2nd Edition, Orient Blackswan, 2019.
REFERENCES:
1. Anany Levitin, "Introduction to the Design and Analysis of Algorithms", 3rd Edition, Pearson Education, 2012.
2. Alfred V. Aho, John E. Hopcroft and Jeffrey D. Ullman, "Data Structures and Algorithms", Reprint Edition, Pearson Education, 2006.
3. S. Sridhar, "Design and Analysis of Algorithms", Oxford University Press, 2014.
UNIT I INTRODUCTION
Part A
Q.no Questions CO PO
1. What is time complexity? C214.1 1,2
2. Define space complexity in an algorithm. C214.1 1,3
3. What is Big O notation? C214.1 1,3
4. Compare O(n) and O(n²) time complexity with examples. C214.1 1,3
5. What is the difference between worst-case, best-case, and average-case time complexity? C214.1 1,3
Part B
1. Explain different time complexities (O(1), O(log n), O(n), O(n log n), O(n²), O(2^n)) with real-world examples. C214.1 1,2,3,5,11
2. Discuss the trade-offs between time and space complexity with examples. C214.1 1,3,5
Part C
1. Explain time and space complexity with Big O notation. Provide examples for different complexities and analyze their efficiency. C214.1 1,2,3,4,5,11,12
int search(int arr[], int n, int k)
{
    for (int i = 0; i < n; i++)
    {
        if (arr[i] == k)
            return 1; // return 1, if you find "k"
    }
    return 0; // return 0, if you didn't find "k"
}
/*
 * [Explanation]
 * i = 0 --> will be executed once
 * i < n --> will be executed n+1 times
 * i++ --> will be executed n times
 * if(arr[i] == k) --> will be executed n times
 * return 1 --> will be executed once (if "k" is there in the array)
 * return 0 --> will be executed once (if "k" is not there in the array)
 */
Each statement in code takes constant time, let's say "C", where "C" is some constant. So, declaring an integer takes constant time, changing the value of an integer or any other variable takes constant time, and comparing two variables takes constant time. So, if a statement takes "C" amount of time and it is executed "N" times, then it will take C*N amount of time. Now, think of the following inputs to the above algorithm that we have just written:
NOTE: Here we assume that each statement is taking 1sec of time to execute.
• If the input array is [1, 2, 3, 4, 5] and you want to find if "1" is present in the array
or not, then the if-condition of the code will be executed 1 time and it will find that the
element 1 is there in the array. So, the if-condition will take 1 second here.
• If the input array is [1, 2, 3, 4, 5] and you want to find if "3" is present in the array
or not, then the if-condition of the code will be executed 3 times and it will find that the
element 3 is there in the array. So, the if-condition will take 3 seconds here.
• If the input array is [1, 2, 3, 4, 5] and you want to find if "6" is present in the array
or not, then the if-condition of the code will be executed 5 times and it will find that the
element 6 is not there in the array and the algorithm will return 0 in this case. So, the if-
condition will take 5 seconds here.
As we can see that for the same input array, we have different time for different values of
"k". So, this can be divided into three cases:
• Best case: This is the lower bound on the running time of an algorithm. We must know the case that causes the minimum number of operations to be executed. In the above example, our array was [1, 2, 3, 4, 5] and we are finding if "1" is present in the array or not. So here, after only one comparison, we will get that the element is present in the array. So, this is the best case of our algorithm.
• Average case: We calculate the running time for all possible inputs, sum all the
calculated values and divide the sum by the total number of inputs. We must know (or
predict) distribution of cases.
• Worst case: This is the upper bound on running time of an algorithm. We must
know the case that causes the maximum number of operations to be executed. In our
example, the worst case can be if the given array is [1, 2, 3, 4, 5] and we try to find if
element "6" is present in the array or not. Here, the if-condition of our loop will be
executed 5 times and then the algorithm will give "0" as output.
So, we learned about the best, average, and worst case of an algorithm. Now, let's get back to asymptotic notation: we use three asymptotic notations to represent the complexity of an algorithm, i.e. Θ notation (theta), Ω notation, and Big O notation.
NOTE: In the asymptotic analysis, we generally deal with large input size.
Θ Notation (theta)
The Θ notation is used to express a tight bound on an algorithm i.e. it defines an upper bound and a lower bound, and your algorithm's running time lies between these two levels. So, if a function is g(n), then the theta representation is shown as Θ(g(n)) and the relation is shown as:
Θ(g(n)) = { f(n): there exist positive constants c1, c2 and n0 such that 0 ≤ c1g(n) ≤ f(n) ≤
c2g(n) for all n ≥ n0 }
The above expression can be read as theta of g(n) is defined as set of all the functions
f(n) for which there exists some positive constants c1, c2, and n0 such that c1*g(n) is
less than or equal to f(n) and f(n) is less than or equal to c2*g(n) for all n that is greater
than or equal to n0.
For example:
if f(n) = 2n² + 3n + 1 and g(n) = n²
then for c1 = 2, c2 = 6, and n0 = 1, we can say that f(n) = Θ(n²)
Ω Notation
The Ω notation denotes the lower bound of an algorithm i.e. the time taken by the
algorithm can't be lower than this. In other words, this is the fastest time in which the
algorithm will return a result.
It is the time taken by the algorithm when provided with its best-case input. So, if a function is g(n), then the omega representation is shown as Ω(g(n)) and the relation is shown as:
Ω(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ cg(n) ≤ f(n) for all n
≥ n0 }
The above expression can be read as omega of g(n) is defined as set of all the functions
f(n) for which there exist some constants c and n0 such that c*g(n) is less than or equal
to f(n), for all n greater than or equal to n0.
if f(n) = 2n² + 3n + 1 and g(n) = n²
then for c = 2 and n0 = 1, we can say that f(n) = Ω(n²)
Big O Notation
The Big O notation defines the upper bound of any algorithm i.e. your algorithm can't take more time than this. In other words, we can say that the big O notation denotes the maximum time taken by an algorithm, or the worst-case time complexity of an algorithm. So, big O notation is the most used notation for the time complexity of an algorithm. So, if a function is g(n), then the big O representation of g(n) is shown as O(g(n)) and the relation is shown as:
O(g(n)) = { f(n): there exist positive constants c and n0 such that 0 ≤ f(n) ≤ cg(n) for all n
≥ n0 }
The above expression can be read as Big O of g(n) is defined as a set of functions f(n)
for which there exist some constants c and n0 such that f(n) is greater than or equal to 0
and f(n) is smaller than or equal to c*g(n) for all n greater than or equal to n0.
if f(n) = 2n² + 3n + 1 and g(n) = n²
then for c = 6 and n0 = 1, we can say that f(n) = O(n²)
So far, we have seen three notations for describing running time. Now, which algorithm would you prefer when several solve the same problem, say finding the sum of the first "n" numbers? We would prefer an O(1) solution (using the formula n(n+1)/2) over an O(n) loop, because the time taken by the algorithm is constant irrespective of the input size.
Part A
Q.no Questions CO PO
1. What is asymptotic notation? C214.1 1
2. Define Big O notation with an example. C214.1 1,2
3. What is the difference between O(n) and O(n²) complexities? C214.1 1,3
4. Explain the significance of Big Ω notation. C214.1 1,3
5. What do you mean by Best-case complexity? C214.1 1,3
Part A
Q.no Questions CO PO
1. What is a recurrence relation? C214.1 1,2,3
2. Define the Master Theorem with an example. C214.1 1,2
3. Write the recurrence relation for Merge Sort. C214.1 1,3
Part B
1. Solve the recurrence relation T(n) = 2T(n/2) + O(n) using the Master Theorem and discuss its significance. C214.1 1,2,3,5
Part C
1. Explain different methods for solving recurrence relations, including Substitution, Recursion Tree, and Master Theorem, with examples. C214.1 1,2,3,4,5,11,12
1. Linear Search
2. Binary Search
3. Interpolation search
Linear Search
Linear search, often known as sequential search, is the most basic search technique. In
this type of search, we go through the entire list and try to fetch a match for a single
element. If we find a match, then the address of the matching target element is returned.
On the other hand, if the element is not found, then it returns a NULL value. Following
is a step-by-step approach employed to perform Linear Search Algorithm.
Algorithm of the Linear Search Algorithm
Consider searching for the element 39 in the array [13, 9, 21, 15, 39].
Step 1: The search element 39 is compared with the first element of the array, which is 13. The match is not found, so we move on to the next element and compare again.
Step 2: Now, the search element 39 is compared with the second element of the array, 9. As they do not match, we move on to the next element.
Step 3: Now, the search element 39 is compared with the third element, which is 21. Again, the elements do not match, so we move on to the following element.
Step 4: Next, the search element 39 is compared with the fourth element, which is 15. As they do not match, we move on again.
Step 5: Next, the search element 39 is compared with the fifth element, 39. A perfect match is found, so we display the element found at location 4.
Binary Search
Binary search is a search technique that works efficiently on sorted lists. Hence, to search for an element in some list using the binary search technique, we must ensure that the list is sorted.
Binary search follows the divide and conquer approach in which the list is divided into
two halves, and the item is compared with the middle element of the list. If the match is
found then, the location of the middle element is returned. Otherwise, we search into
either of the halves depending upon the result produced through the match
NOTE: Binary search can be implemented on sorted array elements. If the list elements
are not arranged in a sorted manner, we have first to sort them.
Algorithm
Binary_Search(a, lower_bound, upper_bound, val) // 'a' is the given array, 'lower_bound' is the index of the first array element, 'upper_bound' is the index of the last array element, 'val' is the value to search
Step 1: set beg = lower_bound, end = upper_bound, pos = -1
Step 2: repeat steps 3 and 4 while beg <= end
Step 3: set mid = (beg + end)/2
Step 4: if a[mid] = val
           set pos = mid
           print pos
           go to step 6
        else if a[mid] > val
           set end = mid - 1
        else
           set beg = mid + 1
        [end of if]
[end of loop]
Step 5: if pos = -1
           print "value is not present in the array"
        [end of if]
Step 6: exit
Working example: for a sorted array of 9 elements (indices 0 to 8),
beg = 0
end = 8
mid = (0 + 8)/2 = 4. So, 4 is the mid of the array.
Now, the element to search is found, so the algorithm returns the index of the matched element.
Binary Search complexity
Now, let's see the time complexity of Binary search in the best case, average case, and
worst case. We will also see the space complexity of Binary search.
1. Time Complexity
Best Case Complexity - In Binary search, best case occurs when the element to search is
found in first comparison, i.e., when the first middle element itself is the element to be
searched. The best-case time complexity of Binary search is O(1).
Average Case Complexity - The average case time complexity of Binary search is
O(logn).
Worst Case Complexity - In Binary search, the worst case occurs, when we have to keep
reducing the search space till it has only one element. The worst-case time complexity of
Binary search is O(logn).
2. Space Complexity
The space complexity of binary search is O(1).
Interpolation Search
Interpolation search is an improved variant of binary search. This search algorithm
works on the probing position of the required value. For this algorithm to work properly,
the data collection should be in a sorted form and equally distributed.
Binary search has a huge advantage of time complexity over linear search. Linear search has worst-case complexity of Ο(n) whereas binary search has Ο(log n).
There are cases where the location of the target data may be known in advance. For example, consider searching for the telephone number of Morphius in a telephone directory. Here, linear search and even binary search will seem slow, as we can directly jump to the part of the directory where the names starting with 'M' are stored.
Position Probing in Interpolation Search
Interpolation search finds a particular item by computing the probe position. Initially, the
probe position is the position of the middle most item of the collection.
If a match occurs, then the index of the item is returned. To split the list into two parts,
we use the following method −
mid = Lo + ((Hi - Lo) / (A[Hi] - A[Lo])) * (X - A[Lo])
where −
A = list
Lo = lowest index of the list
Hi = highest index of the list
A[n] = value stored at index n in the list
X = the value being searched
If the middle item is smaller than the searched item, then the probe position is again calculated in the sub-array to the right of the middle item. Otherwise, the item is searched in the sub-array to the left of the middle item. This process continues on the sub-array until the size of the sub-array reduces to zero.
The runtime complexity of the interpolation search algorithm is Ο(log (log n)), as compared to Ο(log n) for binary search, in favorable situations.
Algorithm
As it is an improvement on the existing binary search algorithm, we mention the steps to search for the 'target' data value index, using position probing −
Step 1 − Start searching data from the middle of the list.
Step 2 − If it is a match, return the index of the item, and exit.
Step 3 − If it is not a match, probe the position.
Step 4 − Divide the list using the probing formula and find the new middle.
Step 5 − If data is greater than middle, search in the higher sub-list.
Step 6 − If data is smaller than middle, search in the lower sub-list.
Step 7 − Repeat until match.
   if A[Mid] = X
      EXIT: Success, Target found at Mid
   else
      if A[Mid] < X
         Set Lo to Mid + 1
      else if A[Mid] > X
         Set Hi to Mid - 1
      end if
   end if
End While
End Procedure
Implementation of interpolation in C
while (lo <= hi && data >= list[lo] && data <= list[hi]) {
   comparisons++;
   // probe the mid point
   mid = lo + (((double)(hi - lo) / (list[hi] - list[lo])) * (data - list[lo]));
   printf("mid = %d\n", mid);
   // data found
   if (list[mid] == data) {
      index = mid;
      break;
   } else {
      if (list[mid] < data) {
         // if data is larger, data is in upper half
         lo = mid + 1;
      } else {
         // if data is smaller, data is in lower half
         hi = mid - 1;
      }
   }
}
Time Complexity
• Best case − O(1)
The best case occurs when the target is found exactly at the first expected position computed using the formula. As we only perform one comparison, the time complexity is O(1).
• Worst case − O(n)
The worst case occurs when the given data set is exponentially distributed.
• Average case − O(log(log(n)))
If the data set is sorted and uniformly distributed, then it takes O(log(log(n))) time, as on average (log(log(n))) comparisons are made.
Space Complexity
O(1) as no extra space is required.
Part A
Q.no Questions CO PO
1. What is searching in data structures? C214.1 1
2. Define Linear Search. C214.1 1,2
3. Write the time complexity of Binary Search in the worst case. C214.1 1,3
4. Compare Linear Search and Binary Search. C214.1 1,2,3
5. What is the prerequisite for applying Binary Search? C214.1 1,3
6. What is the best-case time complexity of Linear Search? C214.1 1,2
7. Define Interpolation Search. C214.1 1,3
Algorithm
naive_algorithm(pattern, text)
Input − The text and the pattern
Output − locations, where the pattern is present in the text
Start
   pat_len := pattern size
   str_len := string size
   for i := 0 to (str_len - pat_len), do
      for j := 0 to pat_len - 1, do
         if text[i + j] ≠ pattern[j], then
            break
      if j == pat_len, then
         display the position i, as a match is found
End
Implementation in C
#include <stdio.h>
#include <string.h>

int main() {
   char txt[] = "tutorialsPointisthebestplatformforprogrammers";
   char pat[] = "a";
   int M = strlen(pat);
   int N = strlen(txt);
   for (int i = 0; i <= N - M; i++) {
      int j;
      for (j = 0; j < M; j++)
         if (txt[i + j] != pat[j])
            break;
      if (j == M)   // pattern matched at shift i
         printf("Pattern matches at index %d\n", i);
   }
   return 0;
}
Output
Pattern matches at index 6
Pattern matches at index 25
Pattern matches at index 39
Part A
Q.no Questions CO PO
1. What is the Rabin-Karp Algorithm? C214.1 1,2,3
2. Define the concept of hashing in Rabin-Karp C214.1 1,2
3. What is the worst-case time complexity of the Rabin-Karp Algorithm? C214.1 1,3
4. How does KMP improve over Naïve Pattern Searching? C214.1 1,2,3
Part B
1. Explain the Rabin-Karp Algorithm with an example. Write a C program for Rabin-Karp pattern matching. C214.1 1,2,3,5
2. Explain the Knuth-Morris-Pratt (KMP) Algorithm with an example. Write a C program for KMP pattern searching. C214.1 1,2,3,4,5
Part C
1. Compare Naïve Pattern Search, Rabin-Karp, and KMP Algorithms in terms of time complexity and practical applications. Implement all three in C and analyze their efficiency. C214.1 1,2,3,4,5,11,12
Insertion sort works similar to the sorting of playing cards in hands. It is assumed that
the first card is already sorted in the card game, and then we select an unsorted card. If
the selected unsorted card is greater than the first card, it will be placed at the right side;
otherwise, it will be placed at the left side. Similarly, all unsorted cards are taken and put
in their exact place.
The same approach is applied in insertion sort. The idea behind insertion sort is to take one element at a time and insert it into its correct position within the already-sorted part of the array. Although it is simple to use, it is not appropriate for large data sets, as the time complexity of insertion sort in the average case and worst case is O(n²), where n is the number of items. Insertion sort is less efficient than other sorting algorithms like heap sort, quick sort, merge sort, etc.
Algorithm
The simple steps of achieving the insertion sort are listed as follows -
Step 1 - If the element is the first element, assume that it is already sorted. Return 1.
Step 2 - Pick the next element, and store it separately in a key.
Step 3 - Now, compare the key with all elements in the sorted array.
Step 4 - If the element in the sorted array is smaller than the current element, then move to the next element. Else, shift greater elements in the array towards the right.
Step 5 - Insert the value.
Step 6 - Repeat until the array is sorted.
Working of Insertion sort Algorithm
Now, let's see the working of the insertion sort Algorithm.
To understand the working of the insertion sort algorithm, let's take an unsorted array, say [12, 31, 25, 8, 32]. It will be easier to understand insertion sort via an example.
Here, 31 is greater than 12. That means both elements are already in ascending order. So,
for now, 12 is stored in a sorted sub-array.
Here, 25 is smaller than 31. So, 31 is not at correct position. Now, swap 31 with 25.
Along with swapping, insertion sort will also check it with all elements in the sorted
array.
For now, the sorted array has only one element, i.e. 12. So, 25 is greater than 12. Hence,
the sorted array remains sorted after swapping.
Now, two elements in the sorted array are 12 and 25. Move forward to the next elements
that are 31 and 8.
Now, the sorted array has three items that are 8, 12 and 25. Move to the next items that
are 31 and 32.
Hence, they are already sorted. Now, the sorted array includes 8, 12, 25 and 31.
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array
is already sorted. The best-case time complexity of insertion sort is O(n).
o Average Case Complexity - It occurs when the array elements are in jumbled
order that is not properly ascending and not properly descending. The average case time
complexity of insertion sort is O(n2).
o Worst Case Complexity - It occurs when the array elements are required to be
sorted in reverse order. That means suppose you have to sort the array elements in
ascending order, but its elements are in descending order. The worst-case time
complexity of insertion sort is O(n2).
2. Space Complexity
The space complexity of insertion sort is O(1), because only one extra variable (the key being inserted) is required.
Insertion sort is a stable sorting algorithm.
Implementation of insertion sort
Program: Write a program to implement insertion sort in C language.
#include <stdio.h>

void insert(int a[], int n) /* function to sort an array with insertion sort */
{
   int i, j, temp;
   for (i = 1; i < n; i++) {
      temp = a[i];           /* pick the next element as the key */
      j = i - 1;
      while (j >= 0 && temp <= a[j]) {
         a[j + 1] = a[j];    /* shift greater elements towards the right */
         j--;
      }
      a[j + 1] = temp;       /* insert the key at its correct place */
   }
}
MaxHeapify(arr, i)
   L = left(i)
   R = right(i)
   if L <= heap_size and arr[L] > arr[i]
      largest = L
   else
      largest = i
   if R <= heap_size and arr[R] > arr[largest]
      largest = R
   if largest != i
      swap arr[i] with arr[largest]
      MaxHeapify(arr, largest)
End
Working of Heap sort Algorithm
In heap sort, basically, there are two phases involved in the sorting of elements. By using
the heap sort algorithm, they are as follows -
o The first step includes the creation of a heap by adjusting the elements of the
array.
o After the creation of the heap, remove the root element of the heap repeatedly by shifting it to the end of the array, and then restore the heap structure with the remaining elements.
First, we have to construct a heap from the given array and convert it into a max heap (the figures showing the array after each step are omitted here). The root element is then deleted repeatedly:
o Delete the root element (89) from the max heap: swap it with the last node (11), then heapify the remaining elements to restore the max heap.
o Delete the root element (81): swap it with the last node (54), then heapify again.
o Delete the root element (76): swap it with the last node (9), then heapify again.
o Delete the root element (54): swap it with the last node (14), then heapify again.
o Delete the root element (22): swap it with the last node (11), then heapify again.
o Delete the root element (14): swap it with the last node (9), then heapify again.
o Delete the root element (11): swap it with the last node (9).
Now, the heap has only one element left. After deleting it, the heap will be empty and the array elements are in sorted order.
Time complexity of Heap sort in the best case, average case, and worst case
1. Time Complexity
o Best Case Complexity - It occurs when there is no sorting required, i.e. the array
is already sorted. The best-case time complexity of heap sort is O(n logn).
o Average Case Complexity - It occurs when the array elements are in jumbled
order that is not properly ascending and not properly descending. The average case time
complexity of heap sort is O(n log n).
o Worst Case Complexity - It occurs when the array elements are required to be
sorted in reverse order. That means suppose you have to sort the array elements in
ascending order, but its elements are in descending order. The worst-case time
complexity of heap sort is O(n log n).
The time complexity of heap sort is O(n logn) in all three cases (best case, average case,
and worst case). The height of a complete binary tree having n elements is logn.
2. Space Complexity
The space complexity of heap sort is O(1), because heap sort sorts the array in place and uses only a constant number of extra variables.
Part A
Q.no Questions CO PO
1. What is Insertion Sort? C214.1 1
2. Define the best and worst-case time complexity of Insertion Sort. C214.1 1,2
3. What is Heap Sort? C214.1 1
4. How does Heap Sort work? C214.1 1,2,3
5. Compare Insertion Sort and Heap Sort in terms of efficiency. C214.1 1,3
6. What are the advantages of Heap Sort over Insertion Sort? C214.1 1,5
7. Explain why Insertion Sort is better for small datasets. C214.1 1,2