AOA NOTES UNIT -2
A. GREEDY METHOD
1. FRACTIONAL KNAPSACK
2. JOB SEQUENCING
3. OPTIMAL MERGE PATTERNS
4. MINIMUM SPANNING TREE – PRIM'S & KRUSKAL'S
1. FRACTIONAL KNAPSACK
Fractional knapsack problem is solved using greedy method in the following steps-
Step-01:
For each item, compute its value / weight ratio.
Step-02:
Arrange all the items in decreasing order of their value / weight ratio.
Step-03:
Start putting the items into the knapsack beginning from the item with the highest ratio.
Put as many items as you can into the knapsack.
Time Complexity-
The main time-taking step is the sorting of all items in decreasing order of their value / weight ratio.
If the items are already arranged in the required order, then the selection loop takes O(n) time.
The average time complexity of Quick Sort is O(nlogn).
Therefore, the total time taken including the sort is O(nlogn).
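The greedy procedure above can be sketched in Python (a minimal sketch; the function and variable names are our own), run here on the 5-item, 60 kg instance solved below:

```python
def fractional_knapsack(items, capacity):
    """Greedy fractional knapsack.
    items: list of (value, weight) pairs; capacity: maximum total weight.
    Returns the maximum achievable value, taking a fraction if needed."""
    # Step-02: arrange items in decreasing order of value / weight ratio
    items = sorted(items, key=lambda vw: vw[0] / vw[1], reverse=True)
    total = 0.0
    # Step-03: put items into the knapsack, starting from the highest ratio
    for value, weight in items:
        if weight <= capacity:              # the whole item fits
            total += value
            capacity -= weight
        else:                               # take only the fraction that fits
            total += value * capacity / weight
            break
    return total

# The 5-item, 60 kg instance from the problem below
items = [(30, 5), (40, 10), (45, 15), (77, 22), (90, 25)]
print(fractional_knapsack(items, 60))  # 230.0
```

The sort dominates the running time, matching the O(nlogn) bound stated above.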
Problem-
A thief enters a house to rob it. He can carry a maximum weight of 60 kg in his bag. There are 5
items in the house with the following weights and values. Which items should the thief take if he can take even
a fraction of any item with him?

Item    Weight (kg)    Value
1       5              30
2       10             40
3       15             45
4       22             77
5       25             90
Solution-
Step-01:
Compute the value / weight ratio for each item-

Item    Weight (kg)    Value    Ratio (Value / Weight)
1       5              30       6
2       10             40       4
3       15             45       3
4       22             77       3.5
5       25             90       3.6
Step-02:
Sort all the items in decreasing order of their value / weight ratio-
I1 I2 I5 I4 I3
(6) (4) (3.6) (3.5) (3)
Step-03:
Start filling the knapsack by putting the items into it one by one.
Knapsack Weight Left    Items in Knapsack    Cost
60                      Ø                    0
55                      I1                   30
45                      I1 , I2              70
20                      I1 , I2 , I5         160
Now,
Knapsack weight left to be filled is 20 kg, but item-4 has a weight of 22 kg.
Since in the fractional knapsack problem even a fraction of any item can be taken, the knapsack will contain the following items-
< I1 , I2 , I5 , (20/22) I4 >
Total cost = 160 + (20/22) × 77 = 160 + 70 = 230 units
Important Note-
Had the problem been a 0/1 knapsack problem, knapsack would contain the following items-
< I1 , I2 , I5 >
The knapsack’s total cost would be 160 units.
2. JOB SEQUENCING WITH DEADLINES
The sequencing of jobs on a single processor with deadline constraints is called Job Sequencing with
Deadlines.
Here-
You are given a set of jobs.
Each job has a defined deadline and some profit associated with it.
The profit of a job is given only when that job is completed within its deadline.
Only one processor is available for processing all the jobs.
The processor takes one unit of time to complete each job.
Approach to Solution-
A feasible solution would be a subset of jobs where each job of the subset gets completed within its
deadline.
Value of the feasible solution would be the sum of profit of all the jobs contained in the subset.
An optimal solution of the problem would be a feasible solution which gives the maximum profit.
Step-01:
Sort all the given jobs in decreasing order of their profit.
Step-02:
Check the value of the maximum deadline and draw a Gantt chart with that many unit-time cells.
Step-03:
Pick the jobs one by one and place each on the Gantt chart as far as possible from 0, so that it finishes before its deadline.
Problem-
Jobs J1 J2 J3 J4 J5 J6
Deadlines 5 3 3 2 4 2
Profits 200 180 190 300 120 100
Solution-
Step-01:
Jobs J4 J1 J3 J2 J5 J6
Deadlines 2 5 3 3 4 2
Profits 300 200 190 180 120 100
Step-02:
We take each job one by one in the order they appear in Step-01.
We place the job on the Gantt chart as far as possible from 0, i.e., in the last empty cell before its deadline.
Step-03:
We take job J4.
Since its deadline is 2, we place it in the first empty cell before deadline 2, i.e., cell 2.
Step-04:
We take job J1.
Since its deadline is 5, we place it in the first empty cell before deadline 5, i.e., cell 5.
Step-05:
We take job J3.
Since its deadline is 3, we place it in the first empty cell before deadline 3, i.e., cell 3.
Step-06:
We take job J2. Its deadline is 3, but cells 2 and 3 are already filled, so we place it in cell 1.
Similarly, job J5 (deadline 4) is placed in cell 4, and job J6 (deadline 2) is rejected because cells 1 and 2 are already occupied.
Now,
Part-01:
The optimal schedule is-
J2 , J4 , J3 , J5 , J1
This is the required order in which the jobs must be completed in order to obtain the maximum profit.
Part-02:
All the jobs are not completed in the optimal schedule: job J6 could not be scheduled before its deadline.
Part-03:
Maximum earned profit
= Profit of J2 + Profit of J4 + Profit of J3 + Profit of J5 + Profit of J1
= 180 + 300 + 190 + 120 + 200
= 990 units
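The greedy scheduling procedure above can be sketched in Python (a minimal sketch, names our own), run on the six jobs of this problem:

```python
def job_sequencing(jobs):
    """jobs: list of (name, deadline, profit) tuples.
    Greedily schedules each job as late as possible before its deadline.
    Returns (schedule, total_profit)."""
    # Step-01: sort jobs in decreasing order of profit
    jobs = sorted(jobs, key=lambda j: j[2], reverse=True)
    # Step-02: one slot per unit of time, slots[1..max_deadline]
    max_deadline = max(d for _, d, _ in jobs)
    slots = [None] * (max_deadline + 1)
    profit = 0
    # Step-03: place each job in the latest free slot <= its deadline
    for name, deadline, p in jobs:
        for t in range(deadline, 0, -1):
            if slots[t] is None:
                slots[t] = name
                profit += p
                break                      # job placed; otherwise rejected
    schedule = [s for s in slots[1:] if s is not None]
    return schedule, profit

jobs = [("J1", 5, 200), ("J2", 3, 180), ("J3", 3, 190),
        ("J4", 2, 300), ("J5", 4, 120), ("J6", 2, 100)]
print(job_sequencing(jobs))  # (['J2', 'J4', 'J3', 'J5', 'J1'], 990)
```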
3. OPTIMAL MERGE PATTERN
An optimal merge pattern is a pattern that relates to the merging of two or more
sorted files into a single sorted file. This type of merging can be done by the two-way
merging method.
If we have two sorted files containing n and m records respectively then they could
be merged together, to obtain one sorted file in time O (n+m).
There are many ways in which a pairwise merge can be done to get a single sorted
file. Different pairings require different amounts of computing time. The goal
is to pairwise merge the n sorted files so that the number of comparisons
is as small as possible.
Example- Consider six sorted files with record counts 2, 3, 5, 7, 9 and 13.
Pick the two smallest numbers, merge them, and repeat until we are left with only one
number.
Step 1: Merge 2 and 3 (cost = 5)
Step 2: Merge 5 and 5 (cost = 10)
Step 3: Merge 7 and 9 (cost = 16)
Step 4: Merge 10 and 13 (cost = 23)
Step 5: Merge 16 and 23 (cost = 39)
So, the total merging cost = 5 + 10 + 16 + 23 + 39 = 93
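The repeated pick-two-smallest step above is naturally implemented with a min heap; a minimal Python sketch (names our own), run on the six file sizes from the example:

```python
import heapq

def optimal_merge_cost(sizes):
    """Total cost of optimally two-way merging sorted files of the given sizes.
    Repeatedly merges the two smallest files (greedy, via a min heap).
    Note: the input list is heapified in place."""
    heapq.heapify(sizes)
    total = 0
    while len(sizes) > 1:
        a = heapq.heappop(sizes)      # smallest remaining file
        b = heapq.heappop(sizes)      # second smallest
        total += a + b                # cost of this merge
        heapq.heappush(sizes, a + b)  # the merged file goes back in
    return total

print(optimal_merge_cost([2, 3, 5, 7, 9, 13]))  # 93
```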
4. PRIM'S ALGORITHM
It is used for finding the Minimum Spanning Tree (MST) of a given graph.
To apply Prim’s algorithm, the given graph must be weighted, connected and undirected.
Step-01:
Randomly choose any vertex as the starting vertex.
The vertex connected to the edge having the least weight is usually selected.
Step-02:
Find all the edges that connect the tree to new vertices.
Find the least weight edge among those edges and include it in the
existing tree.
If including that edge creates a cycle, then reject that edge and look for the
next least weight edge.
Step-03:
Keep repeating step-02 until all the vertices are included and Minimum
Spanning Tree (MST) is obtained.
Time Complexity-
If an adjacency list is used to represent the graph, then using breadth first
search, all the vertices can be traversed in O(V + E) time.
We traverse all the vertices of graph using breadth first search and use a min
heap for storing the vertices not yet included in the MST.
To get the minimum weight edge, we use min heap as a priority queue.
Min heap operations like extracting minimum element and decreasing key
value takes O(logV) time.
Time complexity of Prim's Algorithm
= O(E + V) x O(logV)
= O((E + V)logV)
= O(ElogV)    (since E ≥ V - 1 in a connected graph)
This time complexity can be improved and reduced to O(E + VlogV) using
Fibonacci heap.
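The min-heap version of the algorithm can be sketched in Python as follows. This is a minimal sketch with names of our own choosing, and since the graph of the problem below is given only as a figure that is not reproduced here, the example graph is an assumed one:

```python
import heapq

def prim_mst(graph, start):
    """graph: {u: [(weight, v), ...]} undirected adjacency list.
    Grows the MST from `start`, always taking the least-weight edge that
    reaches a new vertex, using a min heap as the priority queue.
    Returns (mst_edges, total_weight)."""
    visited = {start}
    heap = [(w, start, v) for w, v in graph[start]]
    heapq.heapify(heap)
    mst, total = [], 0
    while heap and len(visited) < len(graph):
        w, u, v = heapq.heappop(heap)   # least-weight edge leaving the tree
        if v in visited:                # would create a cycle: reject it
            continue
        visited.add(v)
        mst.append((u, v, w))
        total += w
        for w2, x in graph[v]:          # new candidate edges from v
            if x not in visited:
                heapq.heappush(heap, (w2, v, x))
    return mst, total

# A small assumed example graph (not the figure's graph)
graph = {
    "a": [(1, "b"), (4, "c")],
    "b": [(1, "a"), (2, "c"), (5, "d")],
    "c": [(4, "a"), (2, "b"), (3, "d")],
    "d": [(5, "b"), (3, "c")],
}
print(prim_mst(graph, "a"))  # ([('a', 'b', 1), ('b', 'c', 2), ('c', 'd', 3)], 6)
```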
Problem-01:
Construct the minimum spanning tree (MST) for the given graph using Prim’s
Algorithm-
Solution-
The above discussed steps are followed to find the minimum cost spanning
tree using Prim’s Algorithm-
Step-01 to Step-06: (The stepwise construction figures are not reproduced here. At each step, the least-weight edge reaching a new vertex is added to the growing tree.)
Since all the vertices have been included in the MST, so we stop.
5. KRUSKAL'S ALGORITHM
Step-01:
Sort all the edges from low weight to high weight.
Step-02:
Take the edge with the lowest weight and use it to connect the vertices of
graph.
If adding an edge creates a cycle, then reject that edge and go for the next
least weight edge.
Step-03:
Keep adding edges until all the vertices are connected and a Minimum
Spanning Tree (MST) is obtained.
Time Complexity-
Worst case time complexity of Kruskal's Algorithm
= O(ElogV) or O(ElogE)
Analysis-
The edges are maintained as a min heap, so the next edge can be obtained in O(logE) time if the graph has E edges.
Special Case-
If the edges are already sorted, then there is no need to construct a min heap.
Problem-01:
Construct the minimum spanning tree (MST) for the given graph using Kruskal’s Algorithm-
Solution-
Connect these vertices using edges with minimum weights such that no cycle
gets formed.
Since all the vertices have been connected / included in the MST, we stop.
Weight of the MST
= Sum of all edge weights
= 10 + 25 + 22 + 12 + 16 + 14
= 99 units
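Kruskal's procedure can be sketched in Python with a union-find (disjoint set) structure for cycle detection. This is a minimal sketch with our own names; the figure's graph is not reproduced here, so the edge list below is an assumed graph chosen so that its MST weight matches the 99 units computed above:

```python
def kruskal_mst(n, edges):
    """n: number of vertices (0..n-1); edges: list of (weight, u, v).
    Sorts the edges by weight and adds each edge unless it creates a cycle,
    detected with a union-find (disjoint set) structure.
    Returns (mst_edges, total_weight)."""
    parent = list(range(n))

    def find(x):                     # set representative, with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    mst, total = [], 0
    for w, u, v in sorted(edges):    # edges in increasing order of weight
        ru, rv = find(u), find(v)
        if ru == rv:                 # same component: edge would form a cycle
            continue
        parent[ru] = rv              # union the two components
        mst.append((u, v, w))
        total += w
    return mst, total

# Assumed 7-vertex example; its MST weight is 99, matching the total above
edges = [(10, 0, 1), (25, 1, 2), (22, 2, 3), (12, 3, 4),
         (16, 4, 5), (14, 5, 6), (28, 0, 6)]
print(kruskal_mst(7, edges)[1])  # 99
```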
Prim's Algorithm vs Kruskal's Algorithm-
The tree that Prim's Algorithm grows always remains connected, whereas the forest that Kruskal's Algorithm grows usually remains disconnected.
Prim's Algorithm grows a solution from a random vertex by adding the next cheapest vertex to the existing tree. Kruskal's Algorithm grows a solution from the cheapest edge by adding the next cheapest edge to the existing tree / forest.
Prim's Algorithm is faster for dense graphs. Kruskal's Algorithm is faster for sparse graphs.
B. DYNAMIC PROGRAMMING
1. 0/1 KNAPSACK
2. MATRIX CHAIN MULTIPLICATION
3. LONGEST COMMON SUBSEQUENCE
Dynamic Programming is an algorithmic paradigm that solves a given complex problem by breaking
it into subproblems and stores the results of subproblems to avoid computing the same results again.
Dynamic programming is used when the subproblems are not independent. Dynamic Programming is a
bottom-up approach: we solve all possible small problems and then combine them to obtain solutions for
bigger problems.
Dynamic Programming is often used in optimization problems (A problem with many possible
solutions for which we want to find an optimal solution)
Dynamic Programming works when a problem has the following two main properties.
Overlapping Subproblems
Optimal Substructure
Overlapping Subproblems: When a recursive algorithm would visit the same subproblems repeatedly,
then a problem has overlapping subproblems.
Like Divide and Conquer, Dynamic Programming combines solutions to sub-problems. Dynamic Programming
is mainly used when solutions of same subproblems are needed again and again. In dynamic programming,
computed solutions to subproblems are stored in a table so that these don't have to be recomputed. So
Dynamic Programming is not useful when there are no common (overlapping) subproblems because there is
no point in storing the solutions if they are not needed again.
Optimal Substructure: A given problem has the Optimal Substructure Property if an optimal solution of the given
problem can be obtained by using optimal solutions of its subproblems.
To use dynamic programming, the problem must observe the principle of optimality: whatever the initial
state is, the remaining decisions must be optimal with regard to the state resulting from the first decision.
0/1 knapsack problem is solved using dynamic programming in the following steps-
Step-01:
Draw a table say ‘T’ with (n+1) number of rows and (w+1) number of columns.
Fill all the boxes of 0th row and 0th column with zeroes as shown-
Step-02:
Start filling the table row wise top to bottom from left to right.
Here, T(i , j) = maximum value of the selected items if we can take items 1 to i and have weight restrictions of j.
Step-03:
To identify the items that must be put into the knapsack to obtain the maximum profit, trace the table backwards starting from the last entry: if T(i , j) = T(i-1 , j), item i is not included; otherwise item i is included, and tracing continues from T(i-1 , j - (weight)i).
Time Complexity-
Each entry of the table requires constant time θ(1) for its computation.
It takes θ(nw) time to fill (n+1)(w+1) table entries.
It takes θ(n) time for tracing the solution since tracing process traces the n rows.
Thus, overall θ(nw) time is taken to solve 0/1 knapsack problem using dynamic programming.
Problem-
For the given set of items and knapsack capacity = 5 kg, find the optimal solution for the 0/1 knapsack problem
making use of dynamic programming approach.
n=4
w = 5 kg
Solution-
Given-
Knapsack capacity (w) = 5 kg
Number of items (n) = 4
Step-01:
Draw a table say ‘T’ with (n+1) = 4 + 1 = 5 number of rows and (w+1) = 5 + 1 = 6 number of columns.
Fill all the boxes of 0th row and 0th column with 0.
Step-02:
Start filling the table row wise top to bottom from left to right using the formula-
T(i , j) = T(i-1 , j) , if (weight)i > j
T(i , j) = max { T(i-1 , j) , (value)i + T(i-1 , j - (weight)i) } , if (weight)i ≤ j
Finding T(1,1)-
We have,
i=1
j=1
(value)i = (value)1 = 3
(weight)i = (weight)1 = 2
T(1,1) = 0
Finding T(1,2)-
We have,
i=1
j=2
(value)i = (value)1 = 3
(weight)i = (weight)1 = 2
T(1,2) = 3
Finding T(1,3)-
We have,
i=1
j=3
(value)i = (value)1 = 3
(weight)i = (weight)1 = 2
T(1,3) = 3
Finding T(1,4)-
We have,
i=1
j=4
(value)i = (value)1 = 3
(weight)i = (weight)1 = 2
T(1,4) = 3
Finding T(1,5)-
We have,
i=1
j=5
(value)i = (value)1 = 3
(weight)i = (weight)1 = 2
T(1,5) = 3
Finding T(2,1)-
We have,
i=2
j=1
(value)i = (value)2 = 4
(weight)i = (weight)2 = 3
T(2,1) = 0
Finding T(2,2)-
We have,
i=2
j=2
(value)i = (value)2 = 4
(weight)i = (weight)2 = 3
T(2,2) = 3
Finding T(2,3)-
We have,
i=2
j=3
(value)i = (value)2 = 4
(weight)i = (weight)2 = 3
T(2,3) = 4
Finding T(2,4)-
We have,
i=2
j=4
(value)i = (value)2 = 4
(weight)i = (weight)2 = 3
T(2,4) = 4
Finding T(2,5)-
We have,
i=2
j=5
(value)i = (value)2 = 4
(weight)i = (weight)2 = 3
T(2,5) = 7
After all the entries are computed and filled in the table, we get the following table-
The last entry represents the maximum possible value that can be put into the knapsack.
So, maximum possible value that can be put into the knapsack = 7.
Following the tracing procedure described above, the items that must be put into the knapsack to obtain the maximum profit are item-1 and item-2 (total weight 2 + 3 = 5 kg, total value 3 + 4 = 7).
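The table-filling procedure can be sketched in Python. Only items 1 and 2 (weights 2 and 3 kg, values 3 and 4) appear in the computed entries above; the weights and values of items 3 and 4 used below are assumptions (weights 4 and 5 kg, values 5 and 6), chosen so that the maximum remains 7:

```python
def knapsack_01(values, weights, capacity):
    """Bottom-up 0/1 knapsack. T[i][j] = best value using items 1..i with
    weight limit j. Runs in Theta(n * w) time, as analyzed above."""
    n = len(values)
    # (n+1) x (capacity+1) table; the 0th row and 0th column stay 0
    T = [[0] * (capacity + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, capacity + 1):
            T[i][j] = T[i - 1][j]                  # skip item i
            if weights[i - 1] <= j:                # or take item i, if it fits
                T[i][j] = max(T[i][j],
                              values[i - 1] + T[i - 1][j - weights[i - 1]])
    return T[n][capacity]

# Items 1 and 2 are from the worked example; items 3 and 4 are assumed
values  = [3, 4, 5, 6]
weights = [2, 3, 4, 5]
print(knapsack_01(values, weights, 5))  # 7
```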
2. MATRIX CHAIN MULTIPLICATION
Here, 'chain' means that the number of columns of one matrix is equal to the number of rows of the next matrix
[always].
If A is a p × q matrix and B is a q × r matrix, then the product C = AB has p . r entries and each entry takes O(q) time to
compute. Thus the total time to multiply the two matrices is dominated by the number of scalar
multiplications, which is p . q . r. We shall express the cost of multiplying two matrices in terms of the number
of scalar multiplications.
Matrix multiplication is an associative operation, but not a commutative operation. By this, we mean that we
have to keep the matrices in the given order for multiplication, but we are free to parenthesize the
multiplication depending upon our need.
To illustrate the different costs incurred by different parenthesizations of a matrix product, consider the
problem of a chain <A1, A2, A3> of three matrices. Suppose that the dimensions of the matrices are 10 × 100, 100 ×
5, and 5 × 50, respectively. The possible orders of multiplication are:
((A1 A2) A3): (10 × 100 × 5) + (10 × 5 × 50) = 5000 + 2500 = 7500 scalar multiplications.
(A1 (A2 A3)): (100 × 5 × 50) + (10 × 100 × 50) = 25000 + 50000 = 75000 scalar multiplications.
Thus, computing the product according to the first parenthesization is 10 times
faster.
The matrix-chain multiplication problem can be stated as follows: Given a sequence of n matrices
A1, A2, ..., An and their dimensions p0, p1, p2, ..., pn, where matrix Ai has dimension
pi-1 × pi for i = 1, 2, ..., n, determine the order of multiplication that minimizes the number of scalar multiplications.
Note that in the matrix-chain multiplication problem, we are not actually multiplying matrices. Our goal is
only to determine an order for multiplying matrices that has the lowest cost.
To help us keep track of solutions to subproblems, we will use a table, and build the table in a
bottom-up manner. For 1 ≤ i ≤ j ≤ n, let m[i, j] be the minimum number of scalar multiplications needed to
compute the product Ai..j. The optimum cost can be described by the following recursive formulation-
m[i, j] = 0 , if i = j
m[i, j] = min over i ≤ k < j of { m[i, k] + m[k+1, j] + pi-1 pk pj } , if i < j
To keep track of optimal subsolutions, we store the value of k in a table s[i, j]. Recall, k is the
place at which we split the product Ai..j to get an optimal parenthesization.
Example problem:
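A bottom-up implementation of the recurrence, run on the three-matrix chain <A1, A2, A3> discussed above (a minimal sketch; names our own):

```python
def matrix_chain_order(p):
    """p: dimensions list, where matrix Ai is p[i-1] x p[i].
    m[i][j] = minimum scalar multiplications to compute Ai..Aj;
    s[i][j] = split index k giving an optimal parenthesization."""
    n = len(p) - 1
    m = [[0] * (n + 1) for _ in range(n + 1)]
    s = [[0] * (n + 1) for _ in range(n + 1)]
    for length in range(2, n + 1):              # chain length
        for i in range(1, n - length + 2):
            j = i + length - 1
            m[i][j] = float("inf")
            for k in range(i, j):               # try each split point
                cost = m[i][k] + m[k + 1][j] + p[i - 1] * p[k] * p[j]
                if cost < m[i][j]:
                    m[i][j], s[i][j] = cost, k
    return m, s

# The chain from the text: A1 is 10x100, A2 is 100x5, A3 is 5x50
m, s = matrix_chain_order([10, 100, 5, 50])
print(m[1][3], s[1][3])  # 7500 2  (split after A2, i.e. ((A1 A2) A3))
```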
3. LONGEST COMMON SUBSEQUENCE
Given two strings, S1 and S2, the task is to find the length of the Longest
Common Subsequence, i.e. longest subsequence present in both of the
strings.
Output: 2
Create a 2D array dp[][] with rows and columns equal to the length of
each input string plus 1 [the number of rows indicates the indices
of S1 and the columns indicate the indices of S2].
Initialize the first row and column of the dp array to 0.
Iterate through the rows of the dp array, starting from 1 (say using
iterator i).
For each i, iterate all the columns from j = 1 to n:
If S1[i-1] is equal to S2[j-1], set the current element of
the dp array to the value of the element to (dp[i-1][j-1]
+ 1).
Else, set the current element of the dp array to the
maximum value of dp[i-1][j] and dp[i][j-1].
After the nested loops, the last element of the dp array will contain the
length of the LCS.
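The table-filling steps above can be sketched as follows. Since the input strings for the "Output: 2" example are not shown, the sample strings used here are assumed:

```python
def lcs_length(S1, S2):
    """Length of the Longest Common Subsequence of S1 and S2,
    via the bottom-up dp table described above."""
    m, n = len(S1), len(S2)
    # (m+1) x (n+1) table; the first row and column stay 0
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if S1[i - 1] == S2[j - 1]:            # characters match
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:                                  # drop one character
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[m][n]

# Assumed sample strings (the original example input was not shown)
print(lcs_length("ABCD", "AEBZ"))  # 2  (the LCS is "AB")
```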