DSA SPL Notes
⚫ An Algorithm is a finite sequence of instructions, each of which has a clear meaning and can be
performed with a finite amount of effort in a finite length of time.
⚫ The time needed by an algorithm expressed as a function of the size of a problem is called the time
complexity of the algorithm. The time complexity of a program is the amount of computer time it
needs to run to completion.
⚫ The space complexity of a program is the amount of memory it needs to run to completion.
⚫ 1. Best Case : The minimum possible value of f(n) is called the best case.
⚫ 2. Average Case : The expected value of f(n).
⚫ 3. Worst Case : The maximum value of f(n) over all possible inputs.
⚫ A recurrence relation is an equation that defines a sequence based on a rule that gives the next
term as a function of the previous term(s). The simplest form of a recurrence relation is the case
where the next term depends only on the immediately previous term.
⚫ Asymptotic notations :
1) Θ Notation: The theta notation bounds a function from above and below, so it defines exact
asymptotic behavior.
A simple way to get Theta notation of an expression is to drop low order terms and ignore leading
constants
2) Big O Notation: The Big O notation defines an upper bound of an algorithm, it bounds a function
only from above. For example, consider the case of Insertion Sort. It takes linear time in best case and
quadratic time in worst case. We can safely say that the time complexity of Insertion sort is O(n^2). Note
that O(n^2) also covers linear time.
3) Ω Notation: Just as Big O notation provides an asymptotic upper bound on a function, Ω notation
provides an asymptotic lower bound.
Ω notation can be useful when we have a lower bound on the time complexity of an algorithm. Since the
best-case performance of an algorithm is generally not very informative, the Omega notation is the least
used of the three.
⚫ There are mainly three methods for solving recurrences.
1) Substitution Method
2) Recurrence Tree Method
3) Master Method:
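As an informal illustration of the Master Method (the formal statement has side conditions not repeated here), the Merge Sort recurrence falls in the balanced case:

```latex
% Master method: T(n) = a T(n/b) + f(n); compare f(n) against n^{\log_b a}.
% For Merge Sort, a = 2, b = 2, f(n) = \Theta(n), and n^{\log_2 2} = n, so:
T(n) = 2\,T(n/2) + \Theta(n) \;\Longrightarrow\; T(n) = \Theta(n \log n)
```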
⚫ Binary Search:
Search a sorted array by repeatedly dividing the search interval in half. Begin with an interval covering
the whole array. If the value of the search key is less than the item in the middle of the interval, narrow the
interval to the lower half. Otherwise narrow it to the upper half.
We basically ignore half of the elements just after one comparison.
1. Compare x with the middle element.
2. If x matches with middle element, we return the mid index.
3. Else If x is greater than the mid element, then x can only lie in right half subarray after the mid
element. So we recur for right half.
4. Else (x is smaller) recur for the left half.
---- Auxiliary Space: O(1) for the iterative implementation; the recursive implementation uses O(Logn)
recursion call stack space.
---- Time complexity: O(Logn)
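The four steps above can be sketched in Python as an iterative implementation (names are illustrative):

```python
def binary_search(arr, x):
    """Search sorted arr for x; return its index, or -1 if absent."""
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if arr[mid] == x:
            return mid          # x matches the middle element
        elif arr[mid] < x:
            lo = mid + 1        # x can only lie in the right half
        else:
            hi = mid - 1        # x can only lie in the left half
    return -1

print(binary_search([2, 3, 4, 10, 40], 10))  # -> 3
```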
⚫ Merge Sort
Merge Sort is a Divide and Conquer algorithm. It divides input array in two halves, calls itself for the two
halves and then merges the two sorted halves.
MergeSort(arr[], l, r)
If r > l
1. Find the middle point to divide the array into two halves:
middle m = (l+r)/2
2. Call mergeSort for first half:
Call mergeSort(arr, l, m)
3. Call mergeSort for second half:
Call mergeSort(arr, m+1, r)
4. Merge the two halves sorted in step 2 and 3:
Call merge(arr, l, m, r)
Time complexity of Merge Sort is O ( n log n).
Auxiliary Space: O(n)
Algorithmic Paradigm: Divide and Conquer
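The pseudocode above translates to Python roughly as follows (the merge() helper is one standard way to implement step 4):

```python
def merge_sort(arr, l, r):
    """Sort arr[l..r] in place, following the pseudocode above."""
    if r > l:
        m = (l + r) // 2           # 1. middle point
        merge_sort(arr, l, m)      # 2. sort first half
        merge_sort(arr, m + 1, r)  # 3. sort second half
        merge(arr, l, m, r)        # 4. merge the two sorted halves

def merge(arr, l, m, r):
    """Merge the sorted subarrays arr[l..m] and arr[m+1..r]."""
    left, right = arr[l:m + 1], arr[m + 1:r + 1]
    i = j = 0
    k = l
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            arr[k] = left[i]; i += 1
        else:
            arr[k] = right[j]; j += 1
        k += 1
    arr[k:r + 1] = left[i:] + right[j:]   # copy whichever side remains

a = [12, 11, 13, 5, 6, 7]
merge_sort(a, 0, len(a) - 1)
print(a)  # -> [5, 6, 7, 11, 12, 13]
```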
⚫ QUICK SORT
QuickSort is a Divide and Conquer algorithm. It picks an element as pivot and partitions the given array
around the picked pivot.
The key process in quickSort is partition(). Target of partitions is, given an array and an element x of
array as pivot, put x at its correct position in sorted array and put all smaller elements (smaller than x)
before x, and put all greater elements (greater than x) after x. All this should be done in linear time.
/* This function takes last element as pivot, places
the pivot element at its correct position in sorted
array, and places all smaller (smaller than pivot)
to left of pivot and all greater elements to right
of pivot */
Worst-case performance: O(n^2)
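The partition() described in the comment above can be sketched as the Lomuto scheme in Python (an illustrative implementation):

```python
def partition(arr, low, high):
    """Take the last element as pivot, place it at its correct sorted
    position, smaller elements to its left, greater to its right."""
    pivot = arr[high]
    i = low - 1                        # end of the "smaller than pivot" region
    for j in range(low, high):
        if arr[j] < pivot:
            i += 1
            arr[i], arr[j] = arr[j], arr[i]
    arr[i + 1], arr[high] = arr[high], arr[i + 1]
    return i + 1                       # final index of the pivot

def quick_sort(arr, low, high):
    if low < high:
        p = partition(arr, low, high)  # pivot is now in place
        quick_sort(arr, low, p - 1)
        quick_sort(arr, p + 1, high)

a = [10, 80, 30, 90, 40, 50, 70]
quick_sort(a, 0, len(a) - 1)
print(a)  # -> [10, 30, 40, 50, 70, 80, 90]
```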
⚫ Priority QUEUE :
Priority Queue is an extension of queue with following properties.
1. Every item has a priority associated with it.
2. An element with high priority is dequeued before an element with low priority.
3. If two elements have the same priority, they are served according to their order in the queue.
In a Binary Heap, getHighestPriority() can be implemented in O(1) time, insert() can be implemented in
O(Logn) time and deleteHighestPriority() can also be implemented in O(Logn) time.
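As a sketch, Python's heapq module gives these operations on a binary min-heap; since it is a min-heap, smaller numbers here stand for higher priority, which inverts the "high priority first" wording above:

```python
import heapq

pq = []
heapq.heappush(pq, (2, "low-priority task"))   # insert(): O(Logn)
heapq.heappush(pq, (1, "high-priority task"))
print(pq[0])              # getHighestPriority(): O(1) peek at the root
print(heapq.heappop(pq))  # deleteHighestPriority(): O(Logn)
```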
⚫ HEAP SORT :
Heap sort is a comparison based sorting technique based on Binary Heap data structure. It is similar to
selection sort where we first find the maximum element and place the maximum element at the end. We
repeat the same process for the remaining elements.
Heap Sort Algorithm for sorting in increasing order:
1. Build a max heap from the input data.
2. At this point, the largest item is stored at the root of the heap. Replace it with the last item of the heap
followed by reducing the size of heap by 1. Finally, heapify the root of the tree.
3. Repeat step 2 while size of heap is greater than 1.
Time complexity of heapify is O(Logn). Time complexity of createAndBuildHeap() is O(n) and overall
time complexity of Heap Sort is O(nLogn).
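The three steps above can be sketched in Python; heapify here is the standard sift-down, and the names are illustrative:

```python
def heapify(arr, n, i):
    """Sift arr[i] down so the subtree rooted at i is a max-heap."""
    largest = i
    l, r = 2 * i + 1, 2 * i + 2
    if l < n and arr[l] > arr[largest]:
        largest = l
    if r < n and arr[r] > arr[largest]:
        largest = r
    if largest != i:
        arr[i], arr[largest] = arr[largest], arr[i]
        heapify(arr, n, largest)

def heap_sort(arr):
    n = len(arr)
    for i in range(n // 2 - 1, -1, -1):  # 1. build a max heap: O(n)
        heapify(arr, n, i)
    for i in range(n - 1, 0, -1):
        arr[0], arr[i] = arr[i], arr[0]  # 2. move current max to the end
        heapify(arr, i, 0)               #    re-heapify the reduced heap

a = [12, 11, 13, 5, 6, 7]
heap_sort(a)
print(a)  # -> [5, 6, 7, 11, 12, 13]
```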
⚫ HEAP :
1. Max-Heap: In a Max-Heap the key present at the root node must be the greatest among the keys
present in all of its children. The same property must be recursively true for all sub-trees in that
Binary Tree.
2. Min-Heap: In a Min-Heap the key present at the root node must be the minimum among the keys
present in all of its children. The same property must be recursively true for all sub-trees in that
Binary Tree.
Process of Deletion:
Since deleting an element at an arbitrary position in the heap can be costly, we simply replace the
element to be deleted with the last element, then delete the last element of the Heap.
➢ Replace the root or element to be deleted by the last element.
➢ Delete the last element from the Heap.
➢ The last element is now placed at the position of the root node, so it may not satisfy the heap
property. Therefore, heapify the node now placed at the root.
Process of Insertion: Elements can be inserted to the heap following a similar approach as discussed
above for deletion. The idea is to:
• First increase the heap size by 1, so that it can store the new element.
• Insert the new element at the end of the Heap.
• This newly inserted element may distort the properties of Heap for its parents. So, in order to
keep the properties of Heap, heapify this newly inserted element following a bottom-up
approach.
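The insertion and deletion processes above can be sketched for a max-heap stored in a Python list (the helper names sift_up and delete_root are illustrative):

```python
def sift_up(heap, i):
    """Bottom-up heapify after insertion (max-heap)."""
    while i > 0 and heap[(i - 1) // 2] < heap[i]:
        heap[(i - 1) // 2], heap[i] = heap[i], heap[(i - 1) // 2]
        i = (i - 1) // 2

def insert(heap, key):
    heap.append(key)               # place the new element at the end
    sift_up(heap, len(heap) - 1)   # restore the heap property bottom-up

def delete_root(heap):
    """Replace the root with the last element, shrink, then sift down."""
    heap[0] = heap[-1]
    heap.pop()
    i, n = 0, len(heap)
    while True:
        largest = i
        for c in (2 * i + 1, 2 * i + 2):
            if c < n and heap[c] > heap[largest]:
                largest = c
        if largest == i:
            return
        heap[i], heap[largest] = heap[largest], heap[i]
        i = largest
```

Usage: after inserting 3, 10, 5 into an empty list, the root holds 10; delete_root leaves 5 at the root.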
⚫ Binary Search Tree is a node-based binary tree data structure which has the
following properties:
1. The left subtree of a node contains only nodes with keys lesser than the node’s key.
2. The right subtree of a node contains only nodes with keys greater than the node’s key.
3. The left and right subtree each must also be a binary search tree.
The worst case time complexity of search and insert operations is O(h) where h is height of Binary Search
Tree. In worst case, we may have to travel from root to the deepest leaf node. The height of a skewed tree
may become n and the time complexity of search and insert operation may become O(n).
Deletion : Node to be deleted has two children: Find inorder successor of the node. Copy contents of the
inorder successor to the node and delete the inorder successor. Note that inorder predecessor can also be
used.
The worst case time complexity of delete operation is O(h) where h is height of Binary Search Tree. In
worst case, we may have to travel from root to the deepest leaf node. The height of a skewed tree may
become n and the time complexity of delete operation may become O(n).
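The insert and delete operations described above can be sketched in Python; deletion of a two-child node copies the inorder successor as stated (class and function names are illustrative):

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = self.right = None

def insert(root, key):
    if root is None:
        return Node(key)
    if key < root.key:
        root.left = insert(root.left, key)
    elif key > root.key:
        root.right = insert(root.right, key)
    return root

def delete(root, key):
    if root is None:
        return None
    if key < root.key:
        root.left = delete(root.left, key)
    elif key > root.key:
        root.right = delete(root.right, key)
    else:
        if root.left is None:       # zero or one child: splice it out
            return root.right
        if root.right is None:
            return root.left
        succ = root.right           # inorder successor: leftmost node
        while succ.left:            # in the right subtree
            succ = succ.left
        root.key = succ.key         # copy its contents, then delete it
        root.right = delete(root.right, succ.key)
    return root

def inorder(root):
    return inorder(root.left) + [root.key] + inorder(root.right) if root else []

root = None
for k in [50, 30, 70, 20, 40, 60, 80]:
    root = insert(root, k)
root = delete(root, 50)
print(inorder(root))  # -> [20, 30, 40, 60, 70, 80]
```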
⚫ GREEDY ALGO :
Greedy is an algorithmic paradigm that builds up a solution piece by piece, always choosing the next piece
that offers the most obvious and immediate benefit. So the problems where choosing locally optimal also
leads to global solution are best fit for Greedy.
The greedy choice is to always pick the next activity whose finish time is least among the remaining
activities and the start time is more than or equal to the finish time of previously selected activity. We
can sort the activities according to their finishing time so that we always consider the next activity as
minimum finishing time activity.
1) Sort the activities according to their finishing time
2) Select the first activity from the sorted array and print it.
3) Do following for remaining activities in the sorted array.
…….a) If the start time of this activity is greater than or equal to the finish time of previously selected
activity then select this activity and print it.
Time Complexity : It takes O(n log n) time if the input activities are not already sorted. It takes O(n)
time when it is given that the input activities are always sorted.
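The three steps above can be sketched in Python; the (start, finish) tuple format for activities is an assumed input convention:

```python
def activity_selection(activities):
    """activities: list of (start, finish) pairs; return a maximal
    set of mutually compatible activities."""
    activities = sorted(activities, key=lambda a: a[1])  # 1. sort by finish time
    selected = [activities[0]]                           # 2. take the first
    for start, finish in activities[1:]:                 # 3. scan the rest
        if start >= selected[-1][1]:   # compatible with the last selected one
            selected.append((start, finish))
    return selected

print(activity_selection([(5, 9), (1, 2), (3, 4), (0, 6), (5, 7), (8, 9)]))
# -> [(1, 2), (3, 4), (5, 7), (8, 9)]
```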
Huffman Coding
Huffman Coding is a lossless compression technique. It assigns variable-length bit codes to different
characters. The Greedy Choice is to assign the shortest code to the most frequent character.
Prefix Codes, means the codes (bit sequences) are assigned in such a way that the code assigned to one
character is not the prefix of code assigned to any other character. This is how Huffman Coding makes
sure that there is no ambiguity when decoding the generated bitstream.
Steps to build Huffman Tree
Input is an array of unique characters along with their frequency of occurrences and output is Huffman
Tree.
1. Create a leaf node for each unique character and build a min heap of all leaf nodes (Min Heap is
used as a priority queue. The value of frequency field is used to compare two nodes in min heap.
Initially, the least frequent character is at root)
2. Extract two nodes with the minimum frequency from the min heap.
3. Create a new internal node with a frequency equal to the sum of the two nodes’ frequencies. Make the
first extracted node its left child and the other extracted node its right child. Add this node to the
min heap.
4. Repeat steps#2 and #3 until the heap contains only one node. The remaining node is the root node
and the tree is complete
Time complexity: O(nLogn) where n is the number of unique characters. If there are n nodes,
extractMin() is called 2*(n – 1) times. extractMin() takes O(Logn) time as it calls minHeapify().
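The tree-building steps above can be sketched with Python's heapq as the min-heap; the tuple-based node representation and the integer tiebreaker are implementation choices, not part of the algorithm:

```python
import heapq

def huffman_codes(freq):
    """freq: dict of character -> frequency; return dict of prefix codes."""
    # Heap entries are (frequency, tiebreaker, tree); leaves are single
    # characters, internal nodes are (left, right) pairs. The tiebreaker
    # keeps tuple comparison from ever reaching the tree field.
    heap = [(f, i, ch) for i, (ch, f) in enumerate(freq.items())]
    heapq.heapify(heap)                  # step 1: min heap of leaf nodes
    count = len(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # step 2: two minimum-frequency nodes
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, count, (left, right)))  # step 3
        count += 1                          # step 4: repeat until one node remains
    codes = {}
    def walk(node, code):                # left edge = "0", right edge = "1"
        if isinstance(node, str):
            codes[node] = code or "0"    # single-character edge case
        else:
            walk(node[0], code + "0")
            walk(node[1], code + "1")
    walk(heap[0][2], "")
    return codes

codes = huffman_codes({"a": 5, "b": 9, "c": 12, "d": 13, "e": 16, "f": 45})
```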
Prim’s algorithm
In Prim’s algorithm, we create an MST by picking edges one by one. We maintain two sets: a set of
the vertices already included in MST and the set of the vertices not yet included. The Greedy Choice is
to pick the smallest weight edge that connects the two sets
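This greedy choice can be sketched in Python with a min-heap of edges crossing the cut between the two sets; the adjacency-list input format is an assumption:

```python
import heapq

def prim_mst(graph, start=0):
    """graph: adjacency list {u: [(v, weight), ...]} of a connected,
    undirected graph; return the total weight of an MST."""
    in_mst = set()
    heap = [(0, start)]        # (weight of the connecting edge, vertex)
    total = 0
    while heap and len(in_mst) < len(graph):
        w, u = heapq.heappop(heap)   # smallest-weight edge crossing the cut
        if u in in_mst:
            continue                 # stale entry: u was already included
        in_mst.add(u)
        total += w
        for v, weight in graph[u]:
            if v not in in_mst:
                heapq.heappush(heap, (weight, v))
    return total

g = {0: [(1, 2), (3, 6)], 1: [(0, 2), (2, 3), (3, 8), (4, 5)],
     2: [(1, 3), (4, 7)], 3: [(0, 6), (1, 8)], 4: [(1, 5), (2, 7)]}
print(prim_mst(g))  # -> 16
```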