ADSA Lecture Notes
Regulation: R23
UNIT – I:
Introduction to Algorithm Analysis, Space and Time Complexity analysis, Asymptotic
Notations. AVL Trees – Creation, Insertion, Deletion operations and Applications. B-Trees –
Creation, Insertion, Deletion operations and Applications.
UNIT – II:
Heap Trees (Priority Queues) – Min and Max Heaps, Operations and Applications.
Graphs – Terminology, Representations, Basic Search and Traversals, Connected
Components and Biconnected Components, applications.
Divide and Conquer: The General Method, Quick Sort, Merge Sort, Strassen’s matrix
multiplication, Convex Hull.
UNIT – III:
Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem,
Minimum cost spanning trees, Single Source Shortest Paths.
Dynamic Programming: General Method, All pairs shortest paths, Single Source Shortest
Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search Trees, 0/1
Knapsack, String Editing, Travelling Salesperson problem.
UNIT – IV:
Backtracking: General Method, 8-Queens Problem, Sum of Subsets problem, Graph
Coloring, 0/1 Knapsack Problem.
Branch and Bound: The General Method, 0/1 Knapsack Problem, Travelling Salesperson
problem.
UNIT – V:
NP Hard and NP Complete Problems: Basic Concepts, Cook’s theorem.
NP Hard Graph Problems: Clique Decision Problem (CDP), Chromatic Number Decision
Problem (CNDP), Traveling Salesperson Decision Problem (TSP).
NP Hard Scheduling Problems: Scheduling Identical Processors, Job Shop Scheduling.
Text Books
Reference Books
1. Data Structures and Program Design in C, Robert Kruse, Pearson Education Asia.
2. An Introduction to Data Structures with Applications, Tremblay & Sorenson, McGraw-Hill.
3. The Art of Computer Programming, Vol. 1: Fundamental Algorithms, Donald E. Knuth, Addison-Wesley, 1997.
4. Data Structures Using C & C++, Langsam, Augenstein & Tanenbaum, Pearson, 1995.
5. Algorithms + Data Structures = Programs, N. Wirth, PHI.
6. Fundamentals of Data Structures in C++, Horowitz, Sahni & Mehta, Galgotia Pub.
7. Data Structures in Java, Thomas Standish, Pearson Education Asia.
Web Resources
1. https://www.tutorialspoint.com/advanced_data_structures/index.asp
2. http://peterindia.net/Algorithms.html
3. Abdul Bari, Introduction to Algorithms (youtube.com)
***
UNIT – I
ALGORITHM
An algorithm is a finite set of instructions that, if followed, accomplishes a particular task. In addition, every algorithm must satisfy the following criteria:
Input: Values submitted for the processing the instructions are known as input
values. Here, zero or more quantities are externally supplied.
Output: Values generated after processing the instructions are known as output
values. Here, at least one quantity must be produced.
Definiteness: Each instruction must be clear and unambiguous.
Example: Add 10 to X [ Valid Statement ]
Add 10 or 20 to X [ Invalid Statement ]
Finiteness: If we trace out the instructions, the algorithm must terminate after a
finite sequence of steps.
Effectiveness: Every instruction must be basic enough that it can, in principle, be carried out by a person using only pencil and paper.
Example: An algorithm to read two numbers and write their sum:
Step 1: START
Step 2: READ x, y
Step 3: sum ← x + y
Step 4: WRITE sum
Step 5: STOP
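A minimal C rendering of the same five steps (a sketch; the variable names are illustrative):

#include <stdio.h>

int main(void) {                 /* Step 1: START        */
    int x, y, sum;
    scanf("%d %d", &x, &y);      /* Step 2: READ x, y    */
    sum = x + y;                 /* Step 3: sum <- x + y */
    printf("%d\n", sum);         /* Step 4: WRITE sum    */
    return 0;                    /* Step 5: STOP         */
}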
The study of algorithms includes four important active areas:
a) How to devise algorithms: Algorithms are designed by using design strategies such as Divide-and-Conquer, the Greedy method, Dynamic programming, Branch and bound, etc.
b) How to validate algorithms: Validation checks that the algorithm computes the correct answer for all legal inputs, independently of any programming language.
c) How to analyze algorithms: Analysis (performance analysis) determines the computing time and storage an algorithm requires.
d) How to test a program: Testing of a program consists of two phases: debugging and
profiling.
Debugging is the process of executing a program on sample data sets to determine whether faulty results occur and, if so, to correct them.
Profiling/Performance measurement is the process of executing a correct program
on data sets and measuring the time and space it takes to compute the results.
The performance of an algorithm can be analyzed in two ways:
Apriori analysis
Posterior analysis
Apriori analysis is done on the problem before it is run on a machine, independent of any language or hardware. Posterior analysis is done after implementing the algorithm in a specific programming language and running it on a machine.
The time complexity of an algorithm is the amount of computer time it needs to run
for its completion. The space complexity of an algorithm is the amount of memory it needs
to run for its completion.
These complexities are calculated based on the size of the input. With this, analysis
can be divided into three cases as: Best case analysis, Worst case analysis and Average case
analysis.
Best case analysis: The problem statement takes the minimum number of computations for the given input parameters.
Worst case analysis: The problem statement takes the maximum number of computations for the given input parameters.
Average case analysis: The problem statement takes an average number of computations over all possible inputs of the given size.
SPACE COMPLEXITY
The process of estimating the amount of memory space to run for its completion is
known as space complexity.
Space complexity S(P) of any problem P is sum of fixed space requirements and
variable space requirements as:
1. Fixed space that is independent of the characteristics (Ex: number, size) of the input
and outputs. It includes the instruction space, space for simple variables and fixed-
size component variables, space for constants and so on.
2. Variable space that consists of the space needed by component variables whose size is
dependent on the particular problem instance being solved, the space needed by the
referenced variables and the recursion stack space.
When analyzing the space complexity of any problem statement, concentrate solely
on estimating the variable space requirements. First determine which instance characteristics
to use to measure the space requirements. Hence, the total space requirement S(P) of any
program can be represented as:
S(P) = c + SP(I)
Where,
c is a constant representing the fixed space requirements and I refers to the
instance characteristics.
Here, the instance characteristic is n. The variable terms are the array list and the simple variables n, i and s, so the space counts are n words for the list array and one word each for n, i and s.
Therefore Ssum(n) = n + 3.
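For reference, a minimal sketch (assumed, not reproduced from the notes) of the iterative summation function that this analysis describes:

/* Iterative sum: the variable space is the n words of list[]
   plus one word each for n, i and s, i.e., Ssum(n) = n + 3. */
float Sum(float list[], int n)
{
    float s = 0.0f;
    for (int i = 1; i <= n; i++)   /* 1-indexed, matching the notes */
        s += list[i];
    return s;
}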
Here, the instance characteristic is n. Each recursive call requires space for the formal parameter n, the return address, and the reference to the array K[ ], i.e., three words per call. Since the function is invoked n+1 times before the recursion bottoms out, the recursion stack needs three words for each of those calls.
Therefore SRSum(n) ≥ 3(n + 1).
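A matching sketch of the recursive version (assumed names):

/* Recursive sum: every activation needs roughly three words
   (the parameter n, the return address and the reference to K),
   and the recursion depth is n + 1, giving SRSum(n) >= 3(n + 1). */
float RSum(float K[], int n)
{
    if (n <= 0)
        return 0.0f;
    return RSum(K, n - 1) + K[n];
}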
TIME COMPLEXITY
The process of estimating the amount of computing time to run for its completion is
known as time complexity.
The time T(P) taken by a program P is the sum of its compile time and its run time.
Here,
Compile time is a fixed component and does not depend on the instance
characteristics. Hence, we concentrate on the run time, so that
T(P) ≥ TP(I)
Where, I refers to the instance characteristics.
In the step count method, we determine the number of steps that a program or a function needs to solve a particular instance by creating a global variable count with an initial value of 0. The count variable is incremented by the number of program steps required by each executable statement; the final value of count gives the step count.
For a function that performs a single addition and returns the result, the count variable is incremented twice: once for the addition operation and once for the return statement.
Therefore Tsum = 2.
For the iterative summation function, the count variable is incremented two times inside the loop and the loop is executed n times, giving 2n steps; outside the loop, count is incremented 3 more times, so the step count is 2n + 3. For the recursive version, each invocation other than the base case contributes two steps, giving the recurrence TRSum(n) = 2 + TRSum(n-1) with TRSum(0) = 2.
Therefore TRSum = 2n + 2.
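A sketch of the iterative version instrumented with the global count variable (the placement of the increments is one common convention, assumed here):

int count = 0;   /* global step counter */

float Sum(float list[], int n)
{
    float s = 0.0f;
    count++;                      /* for s := 0.0 */
    for (int i = 1; i <= n; i++) {
        count++;                  /* for each successful loop test */
        s += list[i];
        count++;                  /* for the assignment */
    }
    count++;                      /* for the final, failing loop test */
    count++;                      /* for the return */
    return s;
}

Running this adds 2n + 3 to count, in agreement with the step count above.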
In the table method:
First determine the step count of each statement, known as steps per execution (s/e).
Note the number of times that each statement is executed, known as its frequency.
The frequency of a non-executable statement is 0.
Multiplying s/e by the frequency gives the total steps for each statement.
Finally, adding these totals gives the step count of the entire function.
Complexities vary with the size of the input, so an exact representation of time and space complexity is usually not possible. Instead, they are expressed approximately using mathematical notations known as asymptotic notations.
ASYMPTOTIC NOTATIONS
Big Oh Notation (O): The function f(n) = O(g(n)) iff there exist two positive constants c and n0 such that
f(n) ≤ c * g(n) for all n, n ≥ n0.
If n values are plotted on the X-axis and f(n) values on the Y-axis, then beyond n0 the functional value f(n) always lies below the estimated value c*g(n). Thus the function g(n) acts as an upper bound for f(n); hence Big 'Oh' notation is treated as the "Upper Bound Function".
Example: Consider f(n) = 3n + 2. Since 3n + 2 ≤ 4n for all n ≥ 2, we can take
c = 4, g(n) = n and n0 = 2.
Hence, the function 3n + 2 = O(n), as there exist two positive constants 4 and 2 such
that 3n + 2 ≤ 4n for all n, n ≥ 2.
Example: The function n² + n + 3 = O(n²), as there exist two positive constants 2 and 3
such that n² + n + 3 ≤ 2n² for all n, n ≥ 3.
In these complexities,
O(1) means constant
O(log n) means logarithmic
O(n) means linear
O(n²) means quadratic
O(n³) means cubic
O(2ⁿ) means exponential.
O(1) < O(log n) < O(n) < O(n log n) < O(n²) < O(n³) < - - - - - - - < O(2ⁿ).
Big Omega Notation (Ω): The function f(n) = Ω(g(n)) iff there exist two positive constants c and n0 such that
f(n) ≥ c * g(n) for all n, n ≥ n0.
In the graph of n (X-axis) against f(n) (Y-axis), beyond n0 the functional value f(n) always lies above c*g(n). Thus g(n) acts as a lower bound for f(n); hence Big Omega notation is treated as the "Lower Bound Function".
Example: Consider f(n) = 3n + 2. Since 3n + 2 ≥ 3n for all n ≥ 1, we can take
c = 3, g(n) = n and n0 = 1.
Hence, the function 3n + 2 = Ω(n), as there exist two positive constants 3 and 1 such
that 3n + 2 ≥ 3n for all n, n ≥ 1.
Theta Notation (Ө): The function f(n) = Ө(g(n)) iff there exist three positive constants c1, c2 and n0 such
that c1 * g(n) ≤ f(n) ≤ c2 * g(n) for all n, n ≥ n0.
In the graph of n (X-axis) against f(n) (Y-axis), beyond n0 the functional value f(n) always lies between c1*g(n) and c2*g(n). Thus g(n) acts as both a lower and an upper bound for f(n).
Example: Consider f(n) = 3n + 2. Since 3n ≤ 3n + 2 ≤ 4n for all n ≥ 2, we can take
c1 = 3, c2 = 4, g(n) = n and n0 = 2.
Hence, the function 3n + 2 = Ө(n), as there exist three positive constants 3, 4 and 2
such that 3n ≤ 3n + 2 ≤ 4n for all n, n ≥ 2.
Little Oh Notation (o): The function f(n) = o(g(n)) iff for every positive constant c there exists an n0 such that
f(n) < c * g(n) for all n, n ≥ n0.
(OR)
The function f(n) = o(g(n)) iff lim (n → ∞) f(n) / g(n) = 0.
Little Omega Notation (ω): The function f(n) = ω(g(n)) iff for every positive constant c there exists an n0 such that
f(n) > c * g(n) for all n, n ≥ n0.
(OR)
The function f(n) = ω(g(n)) iff lim (n → ∞) g(n) / f(n) = 0.
***
Here, AVL tree and Red-Black trees are useful for internal memory applications
whereas B-tree is useful for external memory applications.
i) AVL Tree
AVL tree is a height balanced tree introduced in 1962 by Adelson-Velskii and Landis.
“An empty binary tree T is an AVL tree. If T is a non-empty binary tree with TL and TR
as its left and right subtrees, then T is an AVL tree iff TL and TR are AVL trees and
| h(TL) − h(TR) | ≤ 1, where h(TL) and h(TR) denote the heights of TL and TR.”
In an AVL tree, every node is associated with a value called its balance factor, defined for a node x as:
BF(x) = h(TL(x)) − h(TR(x)), the height of the left subtree of x minus the height of its right subtree.
From the definition of AVL tree, the allowable balance factors are 0, 1 and -1.
Example:

            20 (0)
           /      \
      15 (0)       40 (-1)
      /    \            \
  90 (0)  77 (0)       16 (0)

(The number in parentheses beside each node is its balance factor.)
Note: If the AVL tree satisfies the properties of binary search tree, then it is referred to as an
AVL search tree.
Insertion Operation:
Inserting an element into an AVL search tree follows the same procedure as insertion
into a binary search tree. However, the insertion may lead to a situation where the balance
factor of some node becomes other than -1, 0 or 1, and the tree becomes unbalanced.
If the insertion makes an AVL search tree unbalanced, the height of the subtree must
be adjusted by the operations called rotations.
For this, consider N, the newly inserted node, and A, the nearest ancestor of N whose
balance factor has become 2 or -2. Based on the position of N relative to A, the imbalance rotations are classified into four types: LL, RR, LR and RL.
Here, the transformations done to LL and RR imbalances are often called single
rotations, while those done for LR and RL imbalances are called double rotations.
LL Rotation: In LL rotation, every node moves one position from the current position in
clockwise direction.
        A                        B
       /                       /   \
      B         ===>          N     A
     /        LL Rotation
    N
RR Rotation: In RR rotation, every node moves one position from the current position in
anti clockwise direction.
    A                            B
     \                         /   \
      B         ===>          A     N
       \      RR Rotation
        N
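A C sketch of the two single rotations (the node structure and helper names are assumptions, not from the notes; the double rotations described next are compositions of these two):

struct Node {
    int key, height;
    struct Node *left, *right;
};

static int height(struct Node *t) { return t ? t->height : 0; }
static int max(int a, int b)      { return a > b ? a : b; }

/* LL imbalance: rotate the subtree rooted at A clockwise. */
struct Node *rotateRight(struct Node *A) {
    struct Node *B = A->left;
    A->left  = B->right;
    B->right = A;
    A->height = 1 + max(height(A->left), height(A->right));
    B->height = 1 + max(height(B->left), height(B->right));
    return B;            /* B becomes the new subtree root */
}

/* RR imbalance: rotate the subtree rooted at A anticlockwise. */
struct Node *rotateLeft(struct Node *A) {
    struct Node *B = A->right;
    A->right = B->left;
    B->left  = A;
    A->height = 1 + max(height(A->left), height(A->right));
    B->height = 1 + max(height(B->left), height(B->right));
    return B;
}

An LR imbalance is then repaired as rotateLeft on A->left followed by rotateRight on A, and an RL imbalance symmetrically.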
LR Rotation: The LR Rotation is a sequence of single left rotation followed by a single right
rotation. In LR Rotation, at first, every node moves one position to the left and one position
to right from the current position. i.e., LR Rotation = RR Rotation + LL Rotation.
     A                 A                   N
    /                 /                   / \
   B      ===>       N        ===>       B   A
    \    (RR on B)  /       (LL on A)
     N             B
RL Rotation: The RL Rotation is sequence of single right rotation followed by single left
rotation. In RL Rotation, at first every node moves one position to right and one position to
left from the current position. i.e., RL Rotation = LL Rotation + RR Rotation.
     A                 A                   N
      \                 \                 / \
       B     ===>        N     ===>      A   B
      /    (LL on B)      \  (RR on A)
     N                     B
Deletion Operation:
Deletion of an element from an AVL search tree follows the same procedure as the
deletion of binary search tree operations. Due to deletion of the node, some or all of the
nodes of balance factors on the path might be changed and tree becomes unbalanced. To
make it balanced format, it also requires rotations.
Unlike insertion, the deleted node itself is no longer available after the deletion; the
rotation is therefore chosen based on the balance factor of the sibling of the shortened
subtree. On this basis, the rotations are classified into six types: L0, L1, L-1 and R0, R1, R-1.
R0 Rotation: Assume a node is deleted from the right subtree of a specific position C. After
deletion operation, the sibling node B has a balance factor as 0, and then R 0 rotation is used to
rebalance the tree as:
         C                           B
        / \                        /   \
    B(0)   CR       ===>         BL     C
    /  \         R0 Rotation           / \
  BL    BR                           BR   CR

(CR denotes the right subtree of C, shortened by the deletion.)
R1 Rotation: Assume a node is deleted from the right subtree of a specific position C. After
deletion operation, the sibling node B has a balance factor as 1, and then R 1 rotation is used to
rebalance the tree as:
         C                           B
        / \                        /   \
    B(1)   CR       ===>         BL     C
    /  \         R1 Rotation           / \
  BL    BR                           BR   CR
R-1 Rotation: Assume a node is deleted from the right subtree of node C. After the
deletion, the sibling subtree of the shortened side is rooted at A with balance factor -1 (its
right child is B); then the double R-1 rotation is used to rebalance the tree as:

          C                             B
         / \                          /   \
     A(-1)  CR       ===>            A     C
     /  \         R-1 Rotation      / \   / \
   AL    B                        AL  BL BR  CR
        / \
      BL   BR
L0 Rotation: Assume a node is deleted from the left subtree of a specific position B. After
deletion operation, the sibling node C has a balance factor as 0, and then L0 rotation is used to
rebalance the tree as:
      B                             C
      / \                         /   \
    BL   C(0)       ===>         B     CR
        /  \      L0 Rotation   / \
      CL    CR                BL   CL

(BL denotes the left subtree of B, shortened by the deletion.)
L1 Rotation: Assume a node is deleted from the left subtree of a specific position B. After
deletion operation, the sibling node C has a balance factor as 1, and then L 1 rotation is used to
rebalance the tree as:
      B                               D
      / \                          /     \
    BL   C(1)        ===>         B       C
        /  \       L1 Rotation   / \     / \
       D    CR                 BL  DL   DR  CR
      / \
    DL   DR
L-1 Rotation: Assume a node is deleted from the left subtree of a specific position B. After
deletion operation, the sibling node C has a balance factor as -1, and then L-1 rotation is used
to rebalance the tree as:
      B                              C
      / \                          /   \
    BL   C(-1)       ===>         B     CR
        /  \       L-1 Rotation  / \
      CL    CR                 BL   CL
Example: (figure) AVL search tree operations illustrated on the keys 40, 75, 85 and 99.
ii) B-TREE
An m-way search tree T may be an empty tree. If T is non-empty, then it satisfies the
following properties:
1. Each node has at most m children and holds at most m - 1 keys, kept in ascending order.
2. The keys of the i-th subtree of a node are all less than the i-th key of that node, and the keys of the (i+1)-th subtree are all greater than the i-th key.
3. Each subtree is itself an m-way search tree.
Example: A 3-way search tree:

            [20 | 40]
           /    |    \
    [10 15] [25 30]  [45 50]
                |
              [28]

(The node [28] lies in the middle subtree of [25 30], between the keys 25 and 30.)
B-TREE Definition
A B-tree of order m is an m-way search tree that is either empty or satisfies the following properties:
The root node has a minimum of two children and a maximum of m children.
All internal nodes except the root have a minimum of ⌈m/2⌉ non-empty children and a maximum of m non-empty children.
All the external (leaf) nodes are at the same level.
Example: A B-tree of order 3:

         [43 | 75]
        /    |    \
   [6 24] [52 64] [87]
Note:
1. A B-tree of order 3 is also referred to as a 2-3 tree, since its internal nodes can have
only two or three children.
2. A B-tree of order 4 is also referred to as a 2-3-4 tree (or 2-4 tree), since its internal
nodes can have two, three or four children.
OPERATIONS ON B-TREE
Insertion Operation:
Inserting a new element into a B-tree of order m begins with a search for the element's
proper location in a node. When the search terminates at a particular node X, the
insertion falls into one of the following cases:
Case-1: If node X contains space for insertion, insert the element at its proper
position and adjust the child pointers accordingly.
            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 89 93 97]

Insertion (64):

            [40 | 82]
           /    |    \
  [11 25 38] [58 64 74] [86 89 93 97]
Case-2: If node X contains full of elements, then first insert the element into its list of
elements. Then split the node into two sub nodes at the median value. The elements that are
less than the median becomes the left node and that are greater than the median becomes the
right node. Then the median element is shifted up into the parent node of X. Sometimes the
process may propagate up to root level also.
            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 89 93 97]

Insertion (99): the node [86 89 93 97] is full, so 99 is first inserted into its list; the node is then split at the median 93, which moves up into the parent:

            [40 | 82 | 93]
           /    |    \     \
  [11 25 38] [58 74] [86 89] [97 99]
Deletion Operation:
Case-1: When the key exists in a leaf node and its deletion does not affect the B-tree
properties, simply delete the key from the node and adjust the child pointers.

            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 89 93 97]

Deletion (89):

            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 93 97]

Case-2: When the key exists in a non-leaf node, replace the key with the largest element
of its left sub-tree or the smallest element of its right sub-tree.

            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 89 93 97]

Deletion (40): 40 is replaced by 38, the largest element of its left sub-tree:

            [38 | 82]
           /    |    \
    [11 25] [58 74] [86 89 93 97]
Case-3: If deleting an element from a node leaves it with fewer than its minimum
number of elements, elements can be borrowed from one of its sibling nodes. If the
left sibling can spare an element, its largest element is shifted up into the parent node;
if the right sibling can spare an element, its smallest element is shifted up into the
parent node. From the parent node, the intervening element is shifted down to fill the
vacancy created by the deleted element.

            [40 | 82]
           /    |    \
  [11 25 38] [58 74] [86 89 93 97]

Deletion (58): the node [74] falls below its minimum, so the left sibling spares its largest element 38 into the parent, and the intervening element 40 comes down:

            [38 | 82]
           /    |    \
    [11 25] [40 74] [86 89 93 97]
Case-4: If deletion of an element is making the elements of the node to be less than its
minimum number and either of the sibling nodes have no chance of sparing an element, then
this node is merged with either of the sibling nodes including the intervening element from
the parent node.
            [38 | 82]
           /    |    \
    [11 25] [40 74] [86 89 93 97]

Deletion (25): neither sibling of [11] can spare an element, so [11] is merged with its sibling [40 74] together with the intervening element 38 from the parent:

                [82]
               /    \
  [11 38 40 74]  [86 89 93 97]
END
UNIT – II
Heap Trees (Priority Queues) – Min and Max Heaps, Operations and Applications.
Graphs – Terminology, Representations, Basic Search and Traversals, Connected
Components and Biconnected Components, Applications.
Divide and Conquer: The General Method, Quick Sort, Merge Sort, Strassen’s Matrix
Multiplication, Convex Hull.
***
HEAP TREE
Suppose H is a complete binary tree. Then H is termed a Heap Tree / Max Heap if it satisfies the following property:
For each node N in H, the value at N is greater than or equal to the value of each of
the children of N.
Example:
         95
        /    \
      84      48
     /  \
   76    23
In addition to this a Min Heap is possible, where the value at N is less than or equal to the
value of each of the children of N.
Example:
         25
        /    \
      45      78
     /  \
   83    71
Since a heap tree is a complete binary tree, it is best represented using a single-
dimensional array. In this case, there is no wastage of memory space between two non-null entries.
Example:

         95
        /    \
      84      48
     /  \
   76    23

Array representation:
Index:  1   2   3   4   5
Value: 95  84  48  76  23
Insertion into a heap tree: This operation inserts a new element into a heap tree. Let K
be an array that stores the n elements of a heap tree, and assume the element to be inserted
is given in the variable 'Key'. The insertion procedure works as follows:
First adjoin key value at the end of K so that still the tree is a complete binary tree, but
not necessarily a heap.
Then raise the key value to its appropriate position so that finally it is a heap tree.
The basic principle is: first add the new data element at the next free position of the
complete binary tree. Then compare it with the data in its parent node; if the new value is
greater than the parent's value, interchange the two values. This comparison continues
between every pair of nodes on the path from the newly inserted node towards the root,
until we reach a parent whose value is greater than its child's, or we reach the root node.
              140
            /     \
          85       45
         /  \     /  \
       75    25  35   15
      /  \
    55    65
Algorithm InHeap (Key): Let K is an array that stores a heap tree with ‘n’
elements. This procedure is used to store a new element Key into the heap tree.
Step 1: n : = n+1;
Step 2: K[n] : = Key;
Step 3: i : = n;
p : = i/2;
Step 4: Repeat WHILE p > 0 AND K[p] < K[i]
Temp : = K[i];
K[i] : = K[p];
K[p] : = Temp;
i : = p;
p : = p/2;
End Repeat
Step 5: Return
Algorithm Deletion ( ): This procedure removes the root element of the heap tree
and rearranges the remaining elements back into heap format.
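A minimal C sketch of the usual approach (replace the root by the last element, then sift it down), assuming a 1-indexed max-heap stored in K[1..n] as in InHeap:

int DelMax(int K[], int *n)
{
    int item = K[1];          /* the root holds the maximum      */
    K[1] = K[(*n)--];         /* move the last element to root   */
    int i = 1;
    while (2 * i <= *n) {     /* sift the root value down        */
        int j = 2 * i;        /* left child                      */
        if (j + 1 <= *n && K[j + 1] > K[j])
            j++;              /* pick the larger child           */
        if (K[i] >= K[j])
            break;            /* heap property restored          */
        int t = K[i]; K[i] = K[j]; K[j] = t;
        i = j;
    }
    return item;
}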
Two important applications of heap trees are: Heap sort and Priority queue
implementations.
Heap Sort: Heap sort is also known as Tree sort. The procedure of heap sort is as
follows:
Step 1: Build a heap tree with the given set of data elements.
Step 2: a) Delete the root node from the heap and place it in last location.
b) Rebuild the heap with the remaining elements.
Step 3: Continue Step-2 until the heap tree is empty.
Example: Sort the elements 33, 28, 65, 12 and 99 using heap sort.
Note: The worst case time complexity of heap sort is O(n logn) time.
GRAPHS
A graph G = (V, E) consists of a finite non-empty set of vertices V, also called nodes,
and a finite set of edges E, also called arcs.
Example: (figure) A graph with vertices {a, b, c, d} and edges e1, e2, e3, e4, e5.
GRAPH TERMINOLOGY
Digraph: A graph in which every edge is directed is called a digraph. A digraph is also
known as a directed graph.
Example: (figure) A directed graph with vertices {a, b, c, d} and directed edges e1, e2, e3, e4.
Mixed Graph: A graph in which some edges are directed and some edges are
undirected is called a mixed graph.
Example: (figure) A mixed graph on the vertices {a, b, c, d} in which some of the edges e1–e4 are directed and some are undirected.
Weighted Graph: A graph is termed as a weighted graph if all the edges in it are labeled
with some weight values.
Example: (figure) A weighted graph on the vertices {a, b, c, d} whose edges are labeled with the weights 5, 7, 3 and 9.
Self Loop: If there is an edge whose starting and ending vertices are same is called
as a loop or self loop.
Example: (figure) A graph on the vertices {a, b, c} with a self loop e1 at vertex c.
In-degree of a Vertex: The number of edges coming into the vertex Vi is called the in-
degree of vertex Vi.
Out-degree of a Vertex: The number of edges going out from a vertex Vi is called the
out-degree of vertex Vi.
Degree of a Vertex: Sum of out-degree and in-degree of a node V is called the total
degree of the node V and is denoted by degree (V).
Example: (figure) In a digraph with vertices {a, b, d}:
In-degree (a) = 1
Out-degree (a) = 2
Degree (a) = 3
Complete Graph: A graph G is said to be complete if every vertex of G is adjacent to every other vertex of G.
Example: (figure) A complete graph on the four vertices V1, V2, V3, V4.
Note: An n-vertex, undirected graph with exactly n(n-1)/2 edges is said to be a complete
graph.
Connected Graph: An undirected graph G is said to be connected if there is a path between every pair of vertices in G.
Example: (figure) A connected graph on the vertices V1, V2, V3, V4.
Acyclic Graph: If there is a path containing one or more edges which starts from a
vertex vi and terminates into the same vertex then the path is known as a cycle. If a graph
does not have any cycle then it is called as acyclic graph.
Example: (figure) An acyclic graph on the vertices V1, V2, V3, V4.
Sub Graph: A sub graph of G is a graph G' such that V(G') is a subset of V(G) and
E(G') is a subset of E(G).
Example: (figure) A graph on the vertices 0, 1, 2 and some of its sub graphs.
GRAPH REPRESENTATIONS
A graph can be represented in many ways. Some of the important representations are:
Set representation, Adjacency matrix representation, Adjacency list representations
etc.,
Set representation:
One of the straightforward methods of representing any graph is set representation.
In this method two sets are maintained: V, the set of vertices, and E, the set of edges,
which is a subset of V x V. In case of a weighted graph, E is a set of weighted triples, a
subset of W x V x V.
Example: (figure) For the graph shown:
V(G) = { V1, V2, V3, V4, V5, V6, V7 }
E(G) = { (V1,V2), (V1,V3), (V2,V4), (V2,V5), (V3,V4), (V3,V6), (V4,V7), (V5,V7), (V6,V7) }
Example: (figure) For the weighted digraph shown:
V(G) = { A, B, C, D }
E(G) = { (3,A,C), (5,B,A), (1,B,C), (7,B,D), (2,C,A), (4,C,D), (6,D,B), (8,D,C) }
Linked representation:
Node structure of a non-weighted graph: each header node holds [ Node Information | pointer to Adjacency List ].
In linked representation, the number of adjacency lists depends on the number of vertices in the
graph.
Example: For the seven-vertex graph of the previous example, the adjacency lists are:
V1 → V2 → V3
V2 → V1 → V4 → V5
V3 → V1 → V4 → V6
V4 → V2 → V3 → V7
V5 → V2 → V7
V6 → V3 → V7
V7 → V4 → V5 → V6
Example: For the weighted digraph of the previous example, each list node also carries the edge weight:
A → (3, C)
B → (1, C) → (7, D) → (5, A)
C → (2, A) → (4, D)
D → (6, B) → (8, C)
Sequential representation:
Sequential (matrix) representation is the most useful way of representing graphs.
For this, different matrices are used, such as the adjacency matrix, incidence matrix,
circuit matrix, cut set matrix, path matrix, etc.
Example: (figure) For an undirected graph on V1, V2, V3, V4:

          V1 V2 V3 V4
     V1 [  0  1  1  0 ]
A =  V2 [  1  0  1  0 ]
     V3 [  1  1  0  1 ]
     V4 [  0  0  1  0 ]

Example: (figure) For a directed graph on V1, V2, V3, V4:

          V1 V2 V3 V4
     V1 [  0  1  1  1 ]
A =  V2 [  0  0  0  1 ]
     V3 [  0  1  0  1 ]
     V4 [  0  0  0  0 ]
Traversing a graph means visiting all the vertices in the graph exactly once. Graph
traversal techniques are:
a) Breadth First Search (BFS) Traversal
b) Depth First Search (DFS) Traversal
a) Breadth First Search (BFS): The traversal starts from a vertex u, which is marked as
visited. Then all vertices Vi adjacent to u are visited. Next, the unvisited vertices Wij
adjacent to each Vi are visited, and so on. This process continues till all the vertices of the
graph are visited.
BFS uses a queue data structure to keep track of the order of the nodes whose
adjacent vertices are yet to be visited.
Algorithm BFS (u): This procedure visits all the vertices of the graph starting from the
vertex u.
Step 1: Initialize a queue as Q
Visited(u) : = 1;
Enqueue(Q, u)
Step 2: Repeat WHILE NOT EMPTY(Q)
Dequeue(Q,u);
WRITE (u);
For all vertices V adjacent to u
IF Visited (V) = 0 THEN
Enqueue(Q,V);
Visited (V) : = 1;
ENDIF
EndFor
EndRepeat
Step 3: RETURN
Example: (figure) BFS traversal illustrated on a sample graph with vertices 1–10.
Analysis: Since each vertex is placed on the queue exactly once, the while loop is iterated
at most n times. For the adjacency list representation, the loop has a total cost of O(e). For
the adjacency matrix representation, the while loop takes O(n) time for each vertex visited,
so the total time is O(n²).
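A compact C sketch of BFS under an assumed adjacency-matrix representation (G[][], visited[] and n are illustrative globals, not from the notes):

#include <stdio.h>

#define MAXV 100

int G[MAXV][MAXV];      /* adjacency matrix   */
int visited[MAXV];      /* 0 = not yet seen   */
int n;                  /* number of vertices */

void BFS(int u)
{
    int queue[MAXV], front = 0, rear = 0;
    visited[u] = 1;
    queue[rear++] = u;                  /* Enqueue(Q, u)       */
    while (front < rear) {              /* while NOT EMPTY(Q)  */
        u = queue[front++];             /* Dequeue(Q, u)       */
        printf("%d ", u);               /* WRITE(u)            */
        for (int v = 0; v < n; v++)     /* all v adjacent to u */
            if (G[u][v] && !visited[v]) {
                queue[rear++] = v;
                visited[v] = 1;
            }
    }
}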
b) Depth First Search (DFS): The traversal starts from a vertex u, which is marked as
visited. The vertices Vi adjacent to u are collected and the first unvisited adjacent vertex Vj
is visited. The vertices adjacent to Vj, say Wk, are collected in turn and the first unvisited
one is visited, and so on. The traversal progresses until no more visits are possible.
DFS uses a stack data structure (here, the recursion stack) to keep track of the
nodes whose adjacent vertices are yet to be explored.
Algorithm DFS (u): This procedure visits all the vertices of the graph starting from the
vertex u.
Step 1: Visited(u) : = 1;
WRITE (u);
Step 2: For each vertex V adjacent to u
IF Visited (V) = 0 THEN
Call DFS(V);
ENDIF
EndFor
Step 3: RETURN
Example: (figure) DFS traversal illustrated on a sample graph.
Analysis: For the adjacency list representation, examining all the adjacency lists takes
O(e) time in total. For the adjacency matrix representation, determining all vertices
adjacent to a vertex requires O(n) time, so the total time is O(n²).
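The corresponding C sketch of recursive DFS, reusing the same assumed globals as the BFS sketch:

void DFS(int u)
{
    visited[u] = 1;
    printf("%d ", u);                  /* visit u                   */
    for (int v = 0; v < n; v++)        /* first unvisited neighbour */
        if (G[u][v] && !visited[v])
            DFS(v);                    /* recursion stack acts as
                                          the explicit stack        */
}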
CONNECTED COMPONENTS
A connected component of an undirected graph G is a maximal connected subgraph
of G.
Example: (figure) An undirected graph on the vertices 1–4 and a maximal connected subgraph of it.
If the graph is connected undirected graph, then we can visit all the vertices of the
graph by using either breadth first search or depth first search. The subgraph which has been
obtained after traversing the graph using either BFS or DFS represents the connected
component of the graph.
Example: (figure) A graph on the vertices 0–7 consisting of two connected components.
The following C function generates the connected components by calling dfs( ) once for each unvisited vertex:

void components(int n)
{
    /* Visited[ ] is a global array; dfs(i) marks every vertex
       reachable from vertex i as visited. */
    int i;
    for (i = 0; i < n; i++)
        Visited[i] = 0;
    for (i = 0; i < n; i++)
    {
        if (Visited[i] == 0)    /* vertex i starts a new component */
            dfs(i);
    }
}
Analysis:
1. If the graph G is represented by its adjacency lists, then the total time needed to
generate all the connected components is O(n+e) time.
2. If the graph G is represented by its adjacency matrix, then the total time needed to
generate all the connected components is O(n2) time.
BICONNECTED COMPONENTS
A vertex v of a connected graph G is an articulation point if the deletion of v, together
with all edges incident on v, leaves behind a graph that is not connected. A connected graph
is biconnected if it contains no articulation points. A biconnected component of G is a
maximal biconnected subgraph of G.
Example: (figure) A connected graph on the vertices 1–5 containing an articulation point.
Example: (figure) A graph on the vertices 0–4 and its biconnected components.
APPLICATION OF GRAPHS
Graphs are used to represent networks, road maps, social networks (such as Facebook),
etc. In addition to this, graphs are used in many other application areas.
DIVIDE AND CONQUER
In the divide-and-conquer strategy, a given problem P of size n is split into k smaller
instances; each sub problem is solved, and the sub solutions are combined to obtain the
solution of P. The control abstraction is:
Algorithm DAndC ( P )
{
    if Small ( P ) then
        return S ( P ) ;
    else
    {
        Divide P into smaller instances P1 , P2 , - - - , Pk ;
        Apply DAndC to each of these sub problems ;
        return Combine ( DAndC(P1) , DAndC(P2) , - - - , DAndC(Pk) ) ;
    }
}
Here, Small(P) determines whether the input size is small enough to be answered directly,
S(P) computes the answer for such small inputs, and Combine( ) merges the sub solutions.
If the size of P is n and the sizes of the k sub problems are n1, n2, - - - , nk respectively,
then the computing time of DAndC is described by the recurrence relation
    T(n) = g(n)                                       , n small
    T(n) = T(n1) + T(n2) + - - - + T(nk) + f(n)       , otherwise
Where,
T(n) is the time for DAndC on any input of size n and g(n) is the time to compute the
answer directly for small inputs. The function f(n) is the time for dividing P and
combining the solutions to the sub problems.
The complexity of many divide and conquer algorithms is given by a recurrence of the form:
    T(n) = T(1)               , n = 1
    T(n) = a T(n/b) + f(n)    , n > 1
Here, a and b are known constants, T(1) is known, and n is a power of b.
Example: Solve the recurrence T(n) = 2 T(n/2) + n with T(1) = 2:
T(n) = 2 T(n/2) + n
= 2 [ 2 T(n/4) + n/2 ] + n
= 4 T(n/4) + n + n
= 4 T(n/4) + 2n
= 4 [ 2 T(n/8) + n/4 ] + 2n
= 8 T(n/8) + 3n
-
-
-
= 2i T(n/2i) + in for any log 2 n ≥ i ≥ 1.
Let 2^i = n, so that i = log2 n.
From this,
T(n) = n T(1) + n log2 n = 2n + n log2 n, i.e., T(n) = O(n log2 n).
Applications
Important applications of divide and conquer include Quick sort, Merge sort, Strassen's matrix multiplication and the Convex hull problem.
QUICK SORT
Let K be an array of 'n' elements indexed 1 to n. Sorting refers to rearranging the
elements of K in ascending order such that K[1] ≤ K[2] ≤ . . . ≤ K[n]. For this, the quick
sort procedure works as follows: choose the first element as the pivot; scan from the left
for an element greater than the pivot and from the right for an element not greater than
the pivot; exchange this pair of elements; when the two scans cross, exchange the pivot
with the element at the crossing position.
The above process refers to one pass. At the end of the pass, the pivot element is
positioned at its sorted position. At this stage, the elements before the pivot element are less
than or equal to pivot element and after the pivot element are greater than or equal to the
pivot element.
Now, the same procedure is repeated on the elements before the pivot element as well
as on the elements after the pivot element.
When all passes are completed, then list of array elements are available in sorted
order.
ALGORITHM
Algorithm QSort ( K, P, Q ): Let K be an array of 'n' elements indexed 1 to n. At the
initial call, P refers to the first index 1 and Q refers to the last index n.
This procedure splits the array into two sub arrays at every pass.
{
if ( P < Q ) then
{
j := Partition(K, P, Q);
QSort( K, P, j-1);
QSort( K, j+1, Q);
}
}
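QSort relies on a Partition function that is not reproduced in the notes; a common formulation (a sketch, taking the first element as the pivot) is:

int Partition(int K[], int P, int Q)
{
    int pivot = K[P];
    int i = P, j = Q + 1;
    do {
        do { i++; } while (i <= Q && K[i] < pivot);  /* scan right */
        do { j--; } while (K[j] > pivot);            /* scan left  */
        if (i < j) {                 /* exchange the out-of-place pair */
            int t = K[i]; K[i] = K[j]; K[j] = t;
        }
    } while (i < j);
    K[P] = K[j];          /* place the pivot at its sorted position */
    K[j] = pivot;
    return j;             /* j is the split point used by QSort     */
}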
1. The worst case time complexity of quick sort is O(n²). It occurs when the list of
elements is already in sorted order, so that each pass splits off only one element. Let f(n)
be the number of comparisons for n elements. Then
f(0) = 0
f(1) = 0
f(n) = f(n-1) + n
f(n-1) = f(n-1-1) + n – 1 = f(n-2) + n – 1
f(n-2) = f(n-2-1) + n – 2 = f(n-3) + n – 2
.
.
.
f(1) = f(0) + 1
f(0) = 0
Therefore,
f(n) = n + (n-1) + (n-2) + - - - - - - - - + 1 + 0
= (n(n+1)) / 2
= (n2+n)/2
= (n2 / 2) + (n / 2)
= O(n2)
Hence,
Worst case time complexity of Quick sort is O(n2) time.
2. The average case time complexity of quick sort is O(n log n), which is less
compared to the worst case time complexity.
In this case, the pivot element is equally likely to be placed at any position j, splitting the
array into sub arrays of sizes j-1 and n-j. From this, the recurrence relation is
    T(n) = (n + 1) + (1/n) Σ (1 ≤ j ≤ n) [ T(j-1) + T(n-j) ]
whose solution is O(n log n).
MERGE SORT
First divide the array elements into two sub arrays based on
    Mid = (Low + High) / 2
Where,
Low is the first index of the array and High is the last index of the array.
Once, the sub arrays are formed, each set is individually sorted and the resulting sub
sequences are merged to produce a single sorted sequence of data elements.
Divide-and-Conquer strategy is applicable as splitting the array into sub arrays; and
combining operation is merging the sub arrays into a single sorted array.
Merging is the process of combining two sorted lists into a single sorted list. While
performing merging operation, the two sub lists must be in sorted order.
ALGORITHM
Algorithm MSort ( Low, High ):
// Let K be a global array of 'n' elements indexed 1 to n.
// Low refers to the first index 1 and High refers to the last index n at the initial call.
{
if ( Low < High ) then
{
Mid : = (Low+High) / 2;
MSort(Low,Mid);
MSort(Mid+1,High);
Merge(Low,Mid,High);
}
}
Algorithm Merge ( Low, Mid, High ):
// This procedure merges the two sorted sub arrays into a single sorted array, using an auxiliary array S.
{
h : = Low;
i : = Low;
j : = Mid+1;
while (( h ≤ Mid) AND (j ≤ High)) do
{
if ( K[h] ≤ K[j] ) then
{
S[i] : = K[h];
h : = h + 1;
}
else
{
S[i] : = K[j];
j : = j + 1;
}
i : = i + 1;
}
if ( h > Mid ) then
{
for p : = j to High do
{
S[i] : = K[p];
i : = i + 1;
}
}
else
{
for p : = h to Mid do
{
S[i] : = K[p];
i : = i + 1;
}
}
for p : = Low to High do
{
K[p] : = S[p];
}
}
Merge sort (viewed bottom-up) consists of several passes over the input. The first pass
merges segments of size 1, the second pass merges segments of size 2, and so on. The
computing time for merge sort is described by the recurrence relation:
T(n) = 2 T(n/2) + c n
= 2 [ 2 T(n/4) + (c n) / 2 ] + c n
= 4 T(n/4) + 2 c n
= 4 [ 2 T(n/8) + (c n) / 4 ] + 2 c n
= 8 T(n/8) + 3 c n
-
-
-
= 2i T(n/2i) + i c n for any log 2 n ≥ i ≥ 1.
Let 2i = n
log2 2i = log2 n
i = log2 n
From this,
T(n) = O ( n log 2 n )
Therefore,
The worst case and average case time complexity of Merge sort is O ( n log 2 n )
time.
Note:
The main disadvantage of merge sort is its storage representation. In merge sort
technique, merge process required an auxiliary (temporary) array which has same size as the
original array. Hence, it requires more space compared to other sorting techniques.
STRASSEN'S MATRIX MULTIPLICATION
The divide and conquer strategy suggests another way to compute the product of two
n × n matrices. Here, assume n is a power of 2. The two matrices A and B are each
partitioned into four sub matrices of dimensions n/2 × n/2, and the product AB can then be
computed by the formula:

    [ A11  A12 ]   [ B11  B12 ]     [ C11  C12 ]
    [ A21  A22 ]   [ B21  B22 ]  =  [ C21  C22 ]
Here,
C11 = A11 B11 + A12 B21
C12 = A11 B12 + A12 B22
C21 = A21 B11 + A22 B21
C22 = A21 B12 + A22 B22
This gives the recurrence
    T(n) = b                   , n ≤ 2
    T(n) = 8 T(n/2) + c n²     , n > 2
whose solution is T(n) = O(n³).
Volker Strassen discovered a way to compute the Cij's using only 7 multiplications
and 18 additions or subtractions. His method first computes the seven n/2 × n/2
matrices P, Q, R, S, T, U and V as:
    P = (A11 + A22) (B11 + B22)
    Q = (A21 + A22) B11
    R = A11 (B12 - B22)
    S = A22 (B21 - B11)
    T = (A11 + A12) B22
    U = (A21 - A11) (B11 + B12)
    V = (A12 - A22) (B21 + B22)
C11 = P+S–T+V
C12 = R+T
C21 = Q+S
C22 = P+R–Q+U
    T(n) = b                   , n ≤ 2
    T(n) = 7 T(n/2) + a n²     , n > 2
whose solution is T(n) = O(n^(log2 7)) ≈ O(n^2.81).
Example: Calculate the product of the given two matrices using Strassen's matrix
multiplication, where

    A = [ 1  2  4  1 ]        B = [ 1  2  3  4 ]
        [ 2  3  2  4 ]            [ 4  3  2  1 ]
        [ 1  5  1  2 ]            [ 1  3  1  2 ]
        [ 3  1  4  2 ]            [ 4  1  2  3 ]
CONVEX HULL
The convex hull of a set S of points in the plane is defined to be the smallest convex
polygon containing all points of S.
Note: A polygon is defined to be convex if, for any two points p1 and p2 inside the polygon,
the line segment from p1 to p2 is fully contained in the polygon.
To obtain convex hull for the given set of points apply divide-and-conquer strategy using
Quick hull algorithm as:
The Quick Hull algorithm is a Divide and Conquer algorithm similar to Quick Sort. Let
a[0…n-1] be the input array of points. Following are the steps for finding the convex hull
of these points.
1. Find the point with minimum x-coordinate as min_x and similarly the point with
maximum x-coordinate as max_x.
2. Make a line joining these two points, say L. This line will divide the whole set into two
parts. Take both the parts one by one and proceed further.
3. For a part, find the point P with maximum distance from the line L. P forms a triangle
with the points min_x, max_x. It is clear that the points residing inside this triangle can
never be the part of convex hull.
4. The above step divides the problem into two sub-problems, solved recursively: the
line joining P and min_x, and the line joining P and max_x, each taken together with
the points lying outside the triangle on its side. Repeat step 3 for each new line until
no point is left outside it, and add the end points of each such line to the convex
hull.
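The geometric core of these steps is a signed-area (cross product) test; a small C sketch (the names are assumed, not from the notes):

#include <stdio.h>

typedef struct { double x, y; } Point;

/* Positive if p lies to the left of the directed line p1 -> p2,
   negative if to the right; the magnitude is proportional to the
   distance of p from the line. */
double side(Point p1, Point p2, Point p)
{
    return (p2.x - p1.x) * (p.y - p1.y)
         - (p2.y - p1.y) * (p.x - p1.x);
}

/* Index of the point farthest from the line on its positive side,
   or -1 if that side is empty (the recursion then stops). */
int farthest(Point pts[], int n, Point p1, Point p2)
{
    int best = -1;
    double bestd = 0.0;
    for (int i = 0; i < n; i++) {
        double d = side(p1, p2, pts[i]);
        if (d > bestd) { bestd = d; best = i; }
    }
    return best;
}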
The average case time complexity of Quick hull algorithm is O(n log n) time.
END
UNIT – III
Greedy Method: General Method, Job Sequencing with deadlines, Knapsack Problem,
Minimum cost spanning trees, Single Source Shortest Paths.
Dynamic Programming: General Method, All pairs shortest paths, Single Source Shortest
Paths – General Weights (Bellman Ford Algorithm), Optimal Binary Search Trees, 0/1
Knapsack, String Editing, Travelling Salesperson problem.
***
GREEDY METHOD
Greedy method is a straightforward design technique applied to problems whose
solution is a subset of the inputs that satisfies some constraints.
Any subset that satisfies the constraints of the problem statement is called a “feasible
solution”.
A feasible solution that either maximizes or minimizes the given objective function is
called an “optimal solution”.
To obtain an optimal solution, the greedy method suggests devising an algorithm that
works in stages, considering one input at a time. At each stage, a decision is made regarding
whether a particular input is in an optimal solution.
If the inclusion of next input into the partially constructed optimal solution will result
in an infeasible solution, then this input is not added to the partial solution. Otherwise, it is
added.
This procedure builds up the solution as a subset of the inputs; hence, this version of
the greedy method is called the "subset paradigm".
Algorithm Greedy ( a , n )
{
// a[1 : n] contains n inputs
solution : = Φ;
for i : = 1 to n do
{
x : = Select ( a );
if Feasible ( solution , x ) then
solution : = Union ( solution , x );
}
return solution;
}
Here,
The function Select( ) selects an input from a[ ] and is assigned to x.
Feasible( ) is a Boolean valued function that determines whether x can be included
into the solution vector.
Union( ) function combines x with the solution and updates the objective function.
APPLICATIONS
Knapsack problem
Job sequencing with dead lines
Minimum cost spanning trees
Single source shortest path etc,
KNAPSACK PROBLEM
Consider n objects and a knapsack (or bag). Every object i has a weight wi and a
profit pi, and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed
into the knapsack, then a profit of pi xi is earned. The objective is to obtain a filling of the
knapsack that maximizes the total profit earned. Formally:
maximize ∑ pi xi → 1
1≤i≤n
subject to ∑ wi xi ≤ m → 2
1≤i≤n
and 0 ≤ xi ≤ 1 , 1≤i≤n → 3
Algorithm GreedyKnapsack ( m , n )
{
// p[1 : n] and w[1 : n] contain the profits and weights respectively of n objects
such that p[i] / w[i] ≥ p[i+1] / w[i+1] ≥ - - - - - - - -.
for i : = 1 to n do
x[i] : = 0.0;
U : = m;
for i : = 1 to n do
{
if ( w[i] > U ) then
break;
else
{
x[i] : = 1.0;
U : = U – w[i];
}
}
if ( i ≤ n ) then
x[i] : = U / w[i];
}
Here,
If sum of all weights is ≤ m, then xi = 1 ; 1 ≤ i ≤ n is an optimal solution.
If p1 / w1 ≥ p2 / w2 ≥ - - - - ≥ pn / wn , then GreedyKnapsack generates an optimal
solution to the given instance of the Knapsack problem
Time complexity of Knapsack problem is O(n) time.
JOB SEQUENCING WITH DEADLINES
We are given a set of n jobs. Associated with job i is an integer deadline di ≥ 0 and a
profit pi > 0; the profit pi is earned iff the job is completed by its deadline, and each job
takes one unit of time on a single machine.
A feasible solution for this problem is a subset J of the jobs such that each job in the
subset can be completed by its deadline. The value of a feasible solution J is the sum of the
profits of the jobs in J, i.e., Σ (i є J) pi.
An optimal solution is a feasible solution with maximum profit value.
Example: Obtain optimal solution of Job sequencing with deadlines for the instance n=5,
( p1, p2, p3, p4, p5 ) = (20, 15, 10, 5, 1) and ( d1, d2, d3, d4, d5 ) = (2, 2, 1, 3, 3).
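Working this instance through the greedy rule (consider the jobs in non-increasing order of profit and place each into the latest free time slot on or before its deadline): job 1 (d1 = 2) goes into slot 2; job 2 (d2 = 2) goes into slot 1; job 3 (d3 = 1) finds no free slot; job 4 (d4 = 3) goes into slot 3; job 5 finds no free slot. Hence J = {1, 2, 4} with profit 20 + 15 + 5 = 40, which is the optimal value.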
Algorithm GreedyJob ( d, J, n )
{
J : = { 1 };
for i : = 2 to n do
{
if ( all jobs in J U { i } can be completed by their deadlines ) then
J : = J U { i };
}
}
MINIMUM COST SPANNING TREES
A spanning tree of a connected graph G is a sub graph of G that is a tree and contains
all the vertices of G. If the graph has 'n' vertices, a spanning tree contains exactly 'n-1'
edges connecting the 'n' vertices.
Consider a weighted graph in which every edge is labeled with a cost / weight value.
Cost of the spanning tree is obtained by adding the cost of the selected edges.
Most important methods used to find the minimum cost spanning trees are:
Prim’s Algorithm
Kruskal’s Algorithm
PRIM’s ALGORITHM
In this method, minimum cost spanning tree builds the tree edge by edge. The next
edge is included based on some optimization criterion. The simplest way is to choose an
edge that results in a minimum increase in the sum of the cost of the edges so far included.
Procedure:
Algorithm Prim( )
{
T : = Φ ;
TV : = { 1 } ; // start the tree from an arbitrary vertex, say vertex 1
while ( T contains less than n-1 edges ) do
{
Let (u, v) be a least cost edge such that u є TV and v ∉ TV;
if ( there is no edges ) then
break;
else
{
Add v to TV;
Add (u, v) to T;
}
}
}
KRUSKAL’s ALGORITHM
Kruskal’s algorithm builds a minimum-cost spanning tree T by adding the edges one
at a time. It selects the edges in non-decreasing order of their costs. An edge is added to T if
it does not create a cycle.
Procedure:
Algorithm Kruskal( )
{
T:=Φ;
while (( T has less than n-1 edges) and (E ≠ Φ )) do
{
Choose an edge (u,v) from E of Lowest cost;
Delete (u, v) from E;
if (u, v) does not create a cycle in T then
add (u,v) to T;
else
discard (u, v);
}
}
Note: The main difference between Prim's and Kruskal's algorithms is:
Prim's algorithm maintains a single tree at every stage, whereas in Kruskal's algorithm
the selected edges may form a forest until the final stage.
Finally, both methods produce a spanning tree of minimum cost.
SINGLE SOURCE SHORTEST PATHS
Consider a weighted directed graph G = (V, E) and a source vertex v0. The problem
is to determine the shortest paths from v0 to all the remaining vertices of G.
The greedy method generates shortest paths from vertex v0 to the remaining
vertices in non-decreasing order of path length.
First, a shortest path to the nearest vertex is generated. Then, a shortest path to the
second nearest vertex is generated and so on.
Example: (figure) Identify the shortest paths of the given graph, taking vertex '5' as the source.
Algorithm ShortestPaths ( v, Cost, Dist, n ) // Dijkstra's algorithm; v is the source vertex
{
for i : = 1 to n do
{
S[i] : = False;
Dist[i] : = Cost[v, i];
}
S[v] : = True;
Dist[v] : = 0.0;
for num : = 1 to n-1 do
{
Choose u from among those vertices not in S such that Dist[u] is minimum;
S[u] : = True;
for ( each w adjacent to u with S[w] = False ) do
{
if ( Dist[w] > Dist[u] + Cost[u, w] ) then
Dist[w] : = Dist[u] + Cost[u, w];
}
}
}
***
DYNAMIC PROGRAMMING
Dynamic programming is an algorithm design method that can be used when the
solution to a problem statement can be viewed as the result of a sequence of decisions.
To obtain an optimal solution, first generate all possible decision sequences and then
apply the principle of optimality.
"The principle of optimality states that an optimal sequence of decisions has the
property that whatever the initial state and decisions are, the remaining decisions must
constitute an optimal decision sequence with respect to the state resulting from the first
decision."
Note: The essential difference between the Greedy method and the Dynamic programming
is, in the greedy method only one decision sequence is ever generated. In dynamic
programming, many decision sequences may be generated.
Applications: Some of the problem statements that can be solved using dynamic
programming are:
All pairs shortest paths
Single-source shortest paths
Optimal binary search trees
0/1 Knapsack problem etc,
ALL PAIRS SHORTEST PATHS
Let G = (V, E) be a directed graph with n vertices. Let cost be a cost adjacency
matrix such that cost<i, i> = 0, 1 ≤ i ≤ n; cost<i, j> is the length of edge <i, j> if <i, j>
є E(G), and cost<i, j> = α if <i, j> ∉ E(G).
All-pairs shortest path problem is to determine of matrix A such that A(i, j) is the
length of a shortest path from i to j. The matrix A can be obtained by solving n-single source
shortest path problems.
Algorithm AllPaths ( cost, A, n )
{
for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = cost[i, j];
for k : = 1 to n do
for i : = 1 to n do
for j : = 1 to n do
A[i, j] : = min [ A[i, j] , A[i, k] + A[k, j] ] ;
}
Analysis:
Here, the first nested loop takes O(n2) time and second nested loop takes O(n3) time.
Hence, overall time complexity is O(n3) time.
SINGLE SOURCE SHORTEST PATHS – GENERAL WEIGHTS (BELLMAN-FORD)
Let dist^l[u] be the length of a shortest path from the source vertex v to vertex u that
contains at most l edges. Then dist^1[u] = cost[v, u], 1 ≤ u ≤ n.
The main objective is to compute dist^(n-1)[u] for all u. This can be done using the
dynamic programming methodology of the Bellman and Ford algorithm as follows:
If the shortest path from v to u with at most k, k > 1, edges has no more than
k-1 edges, then
    dist^k[u] = dist^(k-1)[u].
If it has exactly k edges, then it is made up of a shortest path from v to some vertex j
followed by the edge <j, u>, so
    dist^k[u] = min over j { dist^(k-1)[j] + cost[j, u] }.
Combining the two cases,
    dist^k[u] = min { dist^(k-1)[u] , min over j { dist^(k-1)[j] + cost[j, u] } }.
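A C sketch of the resulting algorithm (using in-place relaxation, a standard simplification of the dist^k table; the names are assumed):

#include <limits.h>

#define MAXV 20
#define INF  (INT_MAX / 2)   /* plays the role of "alpha" */

/* cost[][] is the assumed cost-adjacency matrix (INF if no edge),
   v the source vertex, dist[] the shortest-path lengths on return. */
void BellmanFord(int cost[MAXV][MAXV], int n, int v, int dist[MAXV])
{
    for (int i = 0; i < n; i++)
        dist[i] = cost[v][i];           /* dist^1[] */
    dist[v] = 0;
    for (int k = 2; k <= n - 1; k++)    /* build dist^k from dist^(k-1) */
        for (int u = 0; u < n; u++)
            for (int j = 0; j < n; j++)
                if (cost[j][u] < INF && dist[j] + cost[j][u] < dist[u])
                    dist[u] = dist[j] + cost[j][u];
}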
OPTIMAL BINARY SEARCH TREES
Example: Assume the set of identifiers is { for, do, while, int, if }. Two of the possible
binary search trees are:

         for                          for
        /    \                       /    \
      do      while                do      int
             /                            /    \
           int                          if      while
           /
         if
The first tree takes 1, 2, 2, 3, 4 comparisons to find the identifiers. Thus, the average
number of comparisons is (1 + 2 + 2 + 3 + 4) / 5 = 12/5.
The second tree takes 1, 2, 2, 3, 3 comparisons to find the identifiers. Thus, the
average number of comparisons is (1 + 2 + 2 + 3 + 3) / 5 = 11/5.
The number of comparisons at different levels to search a particular node is
considered as a cost.
For a given set of identifiers, design a binary search tree with minimum cost is known
as “Optimal Binary Search Tree”.
Let p(i) be the probability with which we search for the identifier ai, and q(i) the
probability of an unsuccessful search terminating at the external node Ei. Then
    Σ (1 ≤ i ≤ n) p(i) + Σ (0 ≤ i ≤ n) q(i) = 1.
To obtain cost function of binary search trees, it is useful to add an external node at
every empty subtree indicated with square nodes.
(figure) The binary search tree on { do, for, if, int, while } redrawn with a square external node added at every empty subtree.
If a binary search tree represents ‘n’ identifiers, then there will be exactly n internal
nodes and n+1 external nodes. Every internal node represents a point where a successful
search may terminate. Every external node represents a point where an unsuccessful search
may terminate.
Now, if successful search terminates at level l, then expected cost contribution for the
internal node is p(i) * level(ai). If unsuccessful search terminates at a specified level then
expected cost contribution is q(i) * ( level (E i ) – 1).
Finally, the cost of the binary search tree is
    cost = Σ (1 ≤ i ≤ n) p(i) * level(ai) + Σ (0 ≤ i ≤ n) q(i) * ( level(Ei) - 1 ).
The aim of the problem statement is to construct the binary search tree for which this
cost is minimum.
If the tree built over (ai+1, ..., aj) has root ak, with left subtree l over (ai+1, ..., ak-1) and
right subtree r over (ak+1, ..., aj), then the minimum cost c(i, j) satisfies
    c(i, j) = min (i < k ≤ j) { c(i, k-1) + c(k, j) } + w(i, j),
where r(i, j) records the value of k that achieves the minimum.
Initially, c(i, i) = 0 ;
    r(i, i) = 0 ;
    w(i, i) = q(i) , 0 ≤ i ≤ n and
    w(i, j) = p(j) + q(j) + w(i, j - 1).
Algorithm OBST ( p, q, n )
{
for i : = 0 to n-1 do
{
w[i, i] : = q[i];
c[i, i] : = 0.0;
r[i, i] : = 0;
w[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
c[i, i+1] : = q[i] + q[i+1] + p[i+1] ;
r[i, i+1] : = i+1;
}
w[n, n] : = q[n];
c[n, n] : = 0.0;
r[n, n] : = 0;
for m : = 2 to n do
{
for i : = 0 to n - m do
{
j : = i + m;
w[i, j] : = w[i, j-1] + p[ j ] + q[ j ];
k : = Find(c, r, i, j);
c[i, j] = w[i, j] + c[i, k – 1] + c[k, j];
r[i, j] : = k;
}
}
write ( c[0, n], w[0, n], r[0, n] );
}
0 / 1 KNAPSACK PROBLEM
Consider n objects and a knapsack (or bag). Every object i has a weight wi and a
profit pi, and the knapsack has a capacity m. If object i is placed into the knapsack, a profit
pi is earned (xi = 1); otherwise xi = 0. The objective is to fill the knapsack so that the total
profit earned is maximized.
Formally, the problem statement can be stated as:
maximize ∑ pi xi → 1
1≤i≤n
subject to ∑ wi xi ≤ m → 2
1≤i≤n
and xi = 0 or 1 , 1≤i≤n → 3
Let us assume that the decisions on the xi are made in the order xn, xn-1, - - - , x1. For
the decision on xn there are two possible states: either xn = 0 (object n is left out and the
capacity remains m) or xn = 1 (object n is included and the capacity becomes m - wn).
Now, the remaining decisions xn-1, - - - , x1 must be optimal with respect to the
problem state resulting from the decision on xn.
To solve this problem using dynamic programming, follow the procedure as:
Step 1: Start with S^0 = { (0, 0) }, where each pair has the form (P, W) = (profit, weight).
Step 2: Compute S1^(i-1) = { (P + pi, W + wi) : (P, W) є S^(i-1) }, i = 1, 2, - - - , i.e., add the
next object to every pair of S^(i-1). Now, S^i is obtained by merging S^(i-1) and
S1^(i-1) and applying the purging rule.
Purging Rule: If Si contains (Pj , Wj) and (Pk , Wk) with the condition that Wj > Wk
and Pj ≤ Pk , then the pair (Pj , Wj) can be discarded.
Step 3: Select the pair (P1, W1) of largest profit with W1 ≤ m from the last set S^n. If
(P1, W1) є S^(n-1), then xn = 0; otherwise xn = 1 and the pair to trace becomes
(P1 - pn, W1 - wn). Repeat Step 3 on S^(n-1), S^(n-2), ... until all decisions are
completed.
Example: Obtain 0/1 Knapsack solution for the instance n=3, m=6, (p 1 , p2 , p3) = (1, 2, 5)
and (w1 , w2 , w3) = (2, 3, 4).
Algorithm DKP ( p, w, n, m)
{
S0 : = { (0, 0) };
for i : = 1 to n – 1 do
{
S1i-1 = { (P, W) / (P – pi , W - wi ) є Si-1 and W ≤ m } ;
Si : = MergePurge( Si-1 , S1i-1 );
}
(PX, WX) : = last pair in S^(n-1);
(PY, WY) : = (P1 + pn, W1 + wn), where (P1, W1) is the last pair in S^(n-1) such that W1 + wn ≤ m;
// if PX > PY then xn := 0, else xn := 1; the remaining xi are found by tracing back
// through the sets in the same way.
}
The time complexity of 0/1 Knapsack using dynamic programming is O(2ⁿ) in the worst case, since the set S^i may double in size with each object.
STRING EDITING
Consider two strings X = x1, x2, - - -, xn and Y = y1 , y2 , - - - - ym. Where xi , 1 ≤ i ≤ n
and yj, 1 ≤ j ≤ m, are members of a finite set of symbols known as the alphabet. We want to
transform X into Y using a sequence of edit operations on X. The permissible edit operations
are insert, delete, and change, each operation is associated with a cost. The cost of sequence
of operations is the sum of the costs of individual operations in the sequence.
For this, define cost(i, j) to be the minimum cost of any edit sequence transforming
x1 ... xi into y1 ... yj. Then cost(i, j) is given by the recurrence:
    cost(0, 0) = 0
    cost(i, 0) = cost(i-1, 0) + D(xi) ,  i > 0
    cost(0, j) = cost(0, j-1) + I(yj) ,  j > 0
    cost(i, j) = cost1(i, j)          ,  i > 0, j > 0
Where,
    cost1(i, j) = min { cost(i-1, j) + D(xi) , cost(i, j-1) + I(yj) , cost(i-1, j-1) + c(xi, yj) }
Here D(xi), I(yj) and c(xi, yj) are the costs of deleting xi, inserting yj and changing xi into yj
respectively (c(xi, yj) = 0 when xi = yj).
Compute cost(i, j) for all possible values of i and j; there are (n+1)(m+1) such values.
They can be computed in the form of a table M in which row i, column j holds cost(i, j);
cost(n, m) is the answer.
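A C sketch of filling the table M row by row (unit insert/delete costs and a change cost of 2 are assumed here for concreteness):

#include <string.h>

#define MAXL 100

int editCost(const char *X, const char *Y)
{
    int n = strlen(X), m = strlen(Y);
    int M[MAXL][MAXL];
    M[0][0] = 0;
    for (int i = 1; i <= n; i++) M[i][0] = M[i-1][0] + 1;   /* deletes */
    for (int j = 1; j <= m; j++) M[0][j] = M[0][j-1] + 1;   /* inserts */
    for (int i = 1; i <= n; i++)
        for (int j = 1; j <= m; j++) {
            int del = M[i-1][j] + 1;
            int ins = M[i][j-1] + 1;
            int chg = M[i-1][j-1] + (X[i-1] == Y[j-1] ? 0 : 2);
            M[i][j] = del < ins ? del : ins;    /* take the minimum */
            if (chg < M[i][j]) M[i][j] = chg;
        }
    return M[n][m];
}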
TRAVELLING SALESPERSON PROBLEM
Let G = (V, E) be a directed graph with edge costs cij. A tour of the graph is a directed
cycle that visits every vertex exactly once, and the cost of the tour is the sum of the costs
of its edges. The travelling salesperson problem is to find such a tour of minimum cost.
Step 1: Let the tour start and end at vertex 1, and let g(i, S) be the length of a shortest
path starting at vertex i, going through all the vertices in S, and terminating at
vertex 1. Then g(1, V - {1}) is the length of an optimal tour, and
    g(1, V - {1}) = min (2 ≤ k ≤ n) { c1k + g(k, V - {1, k}) }.
Step 2: More generally,
    g(i, S) = min (j є S) { cij + g(j, S - {j}) }.
Initially, g(i, Φ) = ci1 , 1 ≤ i ≤ n; the recurrence is solved for |S| = 1, 2, - - - up to n - 1.
END
UNIT – IV
BACKTRACKING
Backtracking is one of the general techniques used to obtain solutions of a problem
statement that must satisfy some constraints. For this, the possible solution tuples are
enumerated systematically and those that produce the required (optimum) result are selected.
Explicit constraints are rules that restrict the values each xi can take to a given set Si.
Example: Si = { 0 , 1 }
=> xi = 0 or 1.
Implicit constraints are rules that determine which of the tuples in the solution
space satisfy the criterion function.
Basic Terminology:
Problem state: each node in the state space tree.
Solution states: those problem states s for which the path from the root to s defines a tuple in the solution space.
Answer states: those solution states for which the path from the root defines a tuple satisfying the implicit constraints.
Live node: a generated node not all of whose children have been generated.
E-node: the live node whose children are currently being generated.
Dead node: a node that is not to be expanded further.
The backtracking strategy determines the problem solution by systematically searching the
solution space using a tree structure.
Algorithm BackTracking ( n )
{
k : = 1;
while ( k ≠ 0 ) do
{
if ( there remains an untried x[k] є T ( x[1], x[2], - - - , x[k-1] ) and
Bk ( x[1], - - - , x[k] ) is True ) then
{
if ( x[1], - - - , x[k] ) is a path to answer node then
write x[1 : k];
k : = k + 1;
}
else
k : = k – 1;
}
}
APPLICATIONS
8-Queens problem
Sum of subsets problem
Graph coloring
0/1 Knapsack problem etc,
N-QUEENS PROBLEM
4 – QUEENS PROBLEM
Explicit constraints: Si = { 1, 2, 3, 4 }
Xi should select from Si
Implicit constraints: No two queens can be placed in the same row,
column or diagonal.
Suppose two queens are placed at positions ( i, j ) and ( k, l ). Then they are on the same
diagonal only if
    i - j = k - l    or    i + j = k + l
i.e.,
    j - l = i - k    or    j - l = k - i.
Both conditions combine into the single test | j - l | = | i - k |.
8 – QUEENS PROBLEM
Explicit constraints: Si = { 1, 2, 3, 4, 5, 6, 7, 8 }
Xi should select from Si
Implicit constraints: No two queens can be placed in the same row,
column or diagonal.
N – QUEENS PROBLEM
Algorithm NQueens ( k, n )
{
for i : = 1 to n do
{
if Place ( k, i ) then
{
x[ k ] : = i;
if ( k = n ) then
write x [ 1 : n ];
else
NQueens( k+1 , n );
}
}
}
Algorithm Place ( k , i )
{
// Returns True if a queen can be placed in kth row and ith column.
for j : = 1 to k – 1 do
{
if ( ( x[ j ] = i ) or (Abs ( x[ j ] – i ) = Abs ( j – k ) ) ) then
return False;
}
return True;
}
Analysis:
Place ( k, i ) returns the Boolean value True if a queen can be placed in the kth row and
ith column; it takes O(k - 1) time.
Trying all n columns of a row therefore costs O(n²) time.
SUM OF SUBSETS
Consider ‘n’ distinct positive numbers given by the set w = ( w1 , w2 , - - - , wn ). The
aim of the problem is to find all combinations of these numbers whose sums are m.
In the solution space tree, at level i the left subtree defines all subsets containing wi
(xi = 1) and the right subtree defines all subsets not containing wi (xi = 0).
Example: Solve sum of subsets problem when n = 4, w = (7, 11, 13, 24) and m = 31.
Algorithm SumOfSub ( s, k, r )
{
x [ k ] : = 1;
if ( s + w[k] = m ) then
write ( x [ 1 : k ] );
else if ( s + w [ k ] + w [ k + 1 ] ≤ m ) then
SumOfSub ( s + w[k], k+1, r-w[k] );
if ( ( s + r – w[k] ≥ m ) and ( s + w[k+1] ≤ m ) ) then
{
x [ k ] : = 0;
SumOfSub ( s , k+1, r-w[k] );
}
}
GRAPH COLORING
Let G be a graph and m be a given positive integer. Graph coloring is a problem of
coloring each vertex in such a way that no two adjacent nodes have the same color and only
m colors are used.
In general, the problem is treated as the m-colorability decision problem; the smallest
integer m for which the graph can be colored is called the chromatic number of the graph.
Example: (figure) A graph on the vertices A–F and a valid 3-coloring of it:
A = 1, B = 2, C = 3, D = 1, E = 2, F = 3.
Suppose we represent a graph by its adjacency matrix G[1:n , 1:n] where G[i, j] = 1 if
(i, j) is an edge of G; otherwise, G[i, j] = 0;
The colors are represented by the integers 1, 2, - - -- , m and the solutions are given
by the n-tuple ( x1 , x2 - - - xn ) where xi is the color of node i.
Algorithm mColoring ( k )
{
// k is the index of the next vertex to color.
repeat
{
NextValue ( k ); // Assign to x[k] a legal color
if ( x[ k ] = 0 ) then
return;
if ( k = n ) then
write ( x [ 1 : n ] ) ;
else
mColoring ( k+1 ) ;
}
until(false);
}
Algorithm NextValue ( k )
{
repeat
{
x [ k ] : = ( x [k] + 1 ) mod (m + 1);
if ( x[ k ] = 0 ) then
return;
for j : =1 to n do
{
if ( ( G[k, j] ≠ 0 ) and ( x[k] = x[j] )) then
break;
}
if ( j = n + 1 ) then
return;
}
until (false);
}
Analysis:
Computing time for NextValue to determine the legal color is O(mn) time.
NextValue function is invoked in m-coloring function by n times.
Hence, time complexity of Graph coloring problem is O(n mn) time.
0/1 KNAPSACK PROBLEM
Consider n objects and a knapsack (or bag). Every object i has a weight wi and a
profit pi, and the knapsack has a capacity m. If object i is placed into the knapsack, a profit
pi is earned (xi = 1); otherwise xi = 0. The objective is to fill the knapsack so that the total
profit earned is maximized.
Formally, the problem statement can be stated as:
maximize ∑ pi xi
1≤i≤n
subject to ∑ wi xi ≤ m
1≤i≤n
and xi = 0 or 1 , 1≤i≤n
Example: Obtain 0/1 Knapsack solution for the instance n=3, m=6, (p 1 , p2 , p3) = (5, 3, 6)
and (w1 , w2 , w3) = (3, 2, 4).
Algorithm Bound ( cp, cw, k )
{
// cp and cw are the current profit and weight totals after decisions on
// objects 1, ..., k; the remaining capacity is filled greedily, allowing
// a fraction of the object that does not fit, to give an upper bound.
b : = cp;
c : = cw;
for i : = k + 1 to n do
{
c : = c + w[ i ];
if ( c < m ) then
b : = b + p[ i ];
else
return b + ( 1 – (c – m )/w[ i ] ) * p[ i ];
}
return b;
}
BRANCH AND BOUND
Branch and bound is a state space search method in which all the children of the
E-node are generated before any other live node can become the E-node. The children are
explored by using either BFS or D-Search.
In branch and bound terminology, a BFS state space search is called FIFO (First
In First Out) search, as the list of live nodes is a first-in-first-out list (queue).
A D-Search state space search is called LIFO (Last In First Out) search, as the
list of live nodes is a last-in-first-out list (stack).
As in backtracking, bounding functions are used to kill live nodes that cannot lead
to an answer state.
With FIFO and LIFO branch and bound, the selection of the next E-node is rather
rigid and can be time consuming. To improve performance, a third method ranks the live
nodes by a cost function and always selects the least cost live node next; this is called
least cost (LC) search.
If x is an answer node, then c(x) is the cost of reaching x from the root of the state
space tree.
If x is not an answer node, then c(x) = α.
Algorithm LCSearch ( t )
{
E : = t;
repeat
{
for each child x of E do
{
if x is an answer node then
write the path from x to t and return;
Add ( x ); // add x to the list of live nodes
x → parent : = E;
}
if there are no more live nodes then
{
write ' No Answer Node ';
return;
}
E : = Least ( ); // select the least cost live node
}
until ( False );
}
BOUNDING
Bounding functions are used to avoid generation of subtrees that do not contain the
answer node. In bounding function, lower and upper bound values are generated at each
node.
Assume each node x has a cost c(x) associated with it and a minimum cost answer
node is to be found.
A cost function Ĉ( . ) such that Ĉ(x) ≤ c(x) is used to provide a lower bound on the
cost of any solution obtainable from node x.
If upper is an upper bound on the cost of a minimum-cost solution, then all live nodes
x with Ĉ(x) > upper can be killed.
Initially, upper is set to α (infinity). Each time a new answer node is found, the value of
upper is updated.
Note:
1) If the list of live nodes is implemented as a Queue with Least( ) and Add( ) functions,
then LC search is treated as “FIFO Branch and Bound”.
2) If the list of live nodes is implemented as a Stack with Least( ) and Add( ) functions,
then LC search is treated as “LIFO Branch and Bound”.
Consider there are n objects and a Knapsack or bag. Every object i has a weight w i
and the knapsack has a capacity m. If the object i is placed into the knapsack, then a profit of
pixi is earned where xi = 0 or 1. The objective is to fill the knapsack that maximizes the total
profit earned.
In branch and bound strategy, the problem statement can be changed into
minimization problem as:
minimize - ∑ pi xi
1≤i≤n
subject to ∑ wi xi ≤ m
1≤i≤n
and xi = 0 or 1 , 1≤i≤n
Here u(x) is an upper-bound cost function, and Ĉ(x) additionally allows a fraction of the
next remaining object:
    Ĉ(x) = u(x) - ( (m - current total weight) / actual weight of the next remaining object ) * actual profit of that object
Example: Obtain LCBB solution for the given knapsack instance n = 4, m = 15,
(p1 , p2, p3 , p4) = (10, 10, 12, 18) and ( w1 , w2, w3 , w4) = (2, 4, 6, 9).
TRAVELING SALESPERSON
Let G = (V, E) be a directed graph defining an instance of the traveling salesperson
problem. Let cij be the cost of edge <i, j> є E, with cij = α if <i, j> ∉ E. Assume every tour
starts and ends at vertex 1.
To use LCBB to search the traveling salesperson state space tree, define a cost
function c( . ) and other two functions Ĉ(.) and u( . ) such that Ĉ(r) ≤ c(r) ≤ u(r) for all
nodes r. The cost c( . ) is such that the solution node with least c( . ) corresponds to the
shortest tour in G.
To obtain the traveling salesperson solution, apply the branch and bound strategy as:
Step 1: Reduce the cost matrix: a row (or column) is said to be reduced iff it contains at
least one zero and all remaining entries are non-negative. For this, choose the
minimum entry in row i (column j) and subtract it from all entries of row i (column j).
Step 2: The total amount subtracted from the columns and rows is a lower bound on the
length of a minimum-cost tour and can be used as Ĉ value for the root of the state
space tree.
Step 3: Obtain a reduced cost matrix for every node in the traveling salesperson state
space tree.
Let A be the reduced cost matrix for node R, and let S be a child of R such that the tree
edge (R, S) corresponds to including edge <i, j> in the tour. If S is not a leaf node, then the
reduced cost matrix for S may be obtained as follows: (1) change all entries in row i and
column j of A to α; (2) set A(j, 1) to α; (3) reduce all rows and columns of the resulting
matrix.
Example: Consider the cost matrix:

    [  α  20  30  10  11 ]
    [ 15   α  16   4   2 ]
    [  3   5   α   2   4 ]
    [ 19   6  18   α   3 ]
    [ 16   4   7  16   α ]
END
UNIT – V
A set of decision problems that can be solved in deterministic polynomial time are
called P-Class problems.
Example: Searching : O(log n)
Sorting techniques : O(n²), O(n log n)
Matrix multiplication : O(n³) etc.
A set of decision problems that can be solved by nondeterministic algorithms in
polynomial time are called NP-Class problems. The hard problems related to this class are
classified into two types: NP-Hard and NP-Complete problems.
NP-Hard Problem
A problem L is NP-Hard if every problem in NP can be reduced to L in polynomial
time; an NP-Hard problem need not itself belong to NP.
NP-Complete Problem
A problem L is NP-Complete if
It belongs to NP and NP-Hard
It has the property that it can be solved in polynomial time iff all other NP-Complete
problems can also be solved in polynomial time.
All NP-Complete problems are NP-Hard, but some NP-Hard problems are not NP-
Complete.
Properties:
P and NP: P is the set of all decision problems solvable by deterministic algorithms in
polynomial time. NP is the set of all decision problems solvable by nondeterministic
algorithms in polynomial time.
Cook's Theorem
Cook's theorem states that satisfiability is in P iff P = NP. Stephen Cook showed in
1971 that the Boolean satisfiability problem is NP-Complete.
For example, given a Boolean expression over the variables x1, x2, x3, a nondeterministic
algorithm can guess an assignment such as x1 = 1, x2 = 0, x3 = 0, evaluate the expression
for that assignment, and accept if the entire expression is true; hence satisfiability is in NP.
Consequently, if the Boolean satisfiability problem could be solved in deterministic
polynomial time, then all the problems in NP could be solved in polynomial time.
CLIQUE DECISION PROBLEM (CDP)
A clique of a graph G is a complete subgraph of G; the clique decision problem asks
whether G contains a clique of size at least k.
The Clique Decision Problem belongs to NP – if a problem belongs to the NP class, then
it should have polynomial-time verifiability; that is, given a certificate, we should be able to
verify in polynomial time whether it is a solution to the problem.
Proof:
1. Certificate – Let the certificate be a set S consisting of nodes in the clique and S is a
subgraph of G.
2. Verification – We have to check whether there exists a clique of size k in the graph.
Verifying that the number of nodes in S equals k takes O(1) time. Verifying that each
vertex has degree k - 1 within S takes O(k²) time, since in a complete graph each
vertex is connected to every other vertex through an edge, so the total number of
edges in a complete graph is C(k, 2) = k (k - 1) / 2. Therefore, checking whether the graph formed
by the k nodes in S is complete takes O(k²) = O(n²) time (since k ≤ n, where n
is the number of vertices in G).
Therefore, the Clique Decision Problem has polynomial time verifiability and hence
belongs to the NP Class.
Conclusion
The Clique Decision Problem is NP and NP-Hard. Therefore, the Clique decision problem
is NP-Complete.
TRAVELING SALESPERSON DECISION PROBLEM (TSP)
The traveling salesman problem consists of a salesman and a set of cities. The
salesman has to visit each one of the cities starting from a certain one and returning to the
same city. The challenge of the problem is that the traveling salesman wants to minimize the
total length of the trip.
THE END