Subject Name: Analysis and Design of Algorithm
Subject Code: CS-402
Semester: 4th
Unit-2 Notes
Study of Greedy strategy, examples of greedy method like optimal merge patterns, Huffman
coding, minimum spanning trees, knapsack problem, job sequencing with deadlines, single
source shortest path algorithm
Greedy Technique
Greedy is the most straightforward design technique. Many problems have n inputs and
require us to obtain a subset that satisfies some constraints. Any subset that satisfies these
constraints is called a feasible solution. We need to find a feasible solution that either maximizes
or minimizes a given objective function; a feasible solution that does this is called an optimal
solution.
The greedy method builds up a solution progressively, one element at a time, by choosing the
best available element at each stage. At each stage a decision is made as to whether a particular
input belongs in the optimal solution, considering the inputs in an order determined by some
selection procedure. If including the next input in the partially constructed solution would make
it infeasible, that input is discarded. The selection procedure itself is based on some
optimization measure. Several optimization measures may be plausible for a given problem, but most of
them will result in algorithms that generate sub-optimal solutions. This version of the
greedy technique is called the subset paradigm. Problems such as the knapsack problem, job sequencing with
deadlines and minimum-cost spanning trees are based on the subset paradigm. A control-abstraction sketch is given below.
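This subset paradigm can be expressed as a general control abstraction. Below is a minimal C sketch (not from the original notes); the callback names select_best, feasible and include are illustrative placeholders for the problem-specific pieces.

#include <stddef.h>
#include <stdbool.h>

/* Subset-paradigm greedy control abstraction: repeatedly pick the best
   remaining input, keep it only if the partial solution stays feasible. */
typedef struct {
    int  (*select_best)(void *inputs, int n, bool *used); /* greedy choice */
    bool (*feasible)(void *solution, void *inputs, int pick);
    void (*include)(void *solution, void *inputs, int pick);
} greedy_ops;

void greedy(void *inputs, int n, void *solution, const greedy_ops *ops)
{
    bool used[n];                         /* inputs already considered */
    for (int i = 0; i < n; i++) used[i] = false;
    for (int i = 0; i < n; i++) {
        int pick = ops->select_best(inputs, n, used);
        used[pick] = true;
        if (ops->feasible(solution, inputs, pick))  /* keep it only if */
            ops->include(solution, inputs, pick);   /* still feasible  */
    }
}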
Optimal Merge Patterns
Given n sorted files, the optimal merge pattern problem asks for the cheapest way to pairwise merge them into a single sorted file; merging two files of lengths p and q costs p + q record moves. The greedy rule is to repeatedly merge the two smallest files in the list L of files. Each node of the resulting binary merge tree can be represented as:
struct treenode
{
    struct treenode *lchild;   // left subtree (a file or an earlier merge)
    struct treenode *rchild;   // right subtree
    int weight;                // length of the (merged) file at this node
};
Analysis:
Let Least return (and remove) the tree of smallest weight in the list L, and let Insert add a new tree to L. The loop runs n - 1 times, so

T = O((n - 1) * max(O(Least), O(Insert)))

Case 1: L is not sorted.
O(Least) = O(n), O(Insert) = O(1), so T = O(n²).
Case 2: L is sorted.
Case 2.1: L is kept as a sorted list.
O(Least) = O(1), O(Insert) = O(n), so T = O(n²).
Case 2.2: L is represented as a min-heap, where the value in the root is <= the values of its children.
O(Least) = O(log n), O(Insert) = O(log n), so T = O(n log n).
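For example (a standard worked instance), take five files with lengths {20, 30, 10, 5, 30}. The greedy method repeatedly merges the two smallest files:

5 + 10 = 15      (L becomes {15, 20, 30, 30})
15 + 20 = 35     (L becomes {30, 30, 35})
30 + 30 = 60     (L becomes {35, 60})
35 + 60 = 95     (L becomes {95})

Total cost = 15 + 35 + 60 + 95 = 205 record moves, which is optimal.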
Huffman Codes
Huffman coding is a lossless data-compression algorithm. The idea is to assign variable-length codes to the input
characters, where the lengths of the assigned codes are based on the frequencies of the corresponding characters: the most
frequent character gets the smallest code and the least frequent character gets the largest code.
The variable-length codes assigned to the input characters are Prefix Codes, meaning the codes (bit sequences) are
assigned in such a way that the code assigned to one character is never a prefix of the code assigned to any other
character. This is how Huffman coding makes sure that there is no ambiguity when decoding the generated bit
stream.
Let us understand prefix codes with a counter-example. Let there be four characters a, b, c and d, and let their
corresponding variable-length codes be 00, 01, 0 and 1. This coding leads to ambiguity because the code assigned
to c is a prefix of the codes assigned to a and b. If the compressed bit stream is 0001, the de-compressed output
may be "cccd", "ccb", "acd" or "ab".
Steps to build Huffman code
Input is an array of unique characters along with their frequencies of occurrence, and output is the Huffman tree.
1. Create a leaf node for each unique character and build a min-heap of all leaf nodes. (The min-heap is used as a
priority queue; the frequency field is used to compare two nodes. Initially, the least
frequent character is at the root.)
2. Extract the two nodes with the minimum frequency from the min-heap.
3. Create a new internal node with frequency equal to the sum of the two nodes' frequencies. Make the first
extracted node its left child and the other extracted node its right child. Add this node to the min-heap.
4. Repeat steps 2 and 3 until the heap contains only one node. The remaining node is the root node and the
tree is complete.
Example:

Letter     A   B   C   D   E   F
Frequency  10  20  30  40  50  60

Repeatedly merging the two smallest frequencies gives the tree below (internal nodes are labelled X, Y, Z, W, V with their combined frequencies; left edges are labelled 0, right edges 1):

                    V:210
                 0/      \1
             Z:90          W:120
            0/   \1       0/    \1
         D:40   E:50   Y:60    F:60
                      0/   \1
                    X:30   C:30
                   0/   \1
                 A:10   B:20

Reading the edge labels from the root gives the codes: D = 00, E = 01, A = 1000, B = 1001, C = 101, F = 11.
Algorithm:
Huffman(A)
{
    n = |A|;
    Q = A;                          // build a min-heap (priority queue) of the n nodes
    for i = 1 to n-1
    {
        z = new node;
        left[z]  = Extract-Min(Q);  // the two least-frequent subtrees
        right[z] = Extract-Min(Q);
        f[z] = f[left[z]] + f[right[z]];
        Insert(Q, z);
    }
    return Extract-Min(Q);          // root of the Huffman tree
}
Analysis of algorithm
Each priority-queue operation (on a heap) costs O(log n). Each iteration reduces the number of subtrees by one, and initially there are n subtrees, so there are n - 1 iterations.
Total: O(n log n) time.
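To make the procedure concrete, here is a self-contained C sketch of the same construction. For brevity it finds the two cheapest subtrees by linear scan, so it runs in O(n²) rather than the O(n log n) of the heap-based version analysed above; the names node, huffman and print_codes are illustrative.

#include <stdio.h>
#include <stdlib.h>

struct node {
    int freq;
    char ch;                      /* valid only for leaves */
    struct node *left, *right;
};

static struct node *new_node(int freq, char ch, struct node *l, struct node *r)
{
    struct node *z = malloc(sizeof *z);
    z->freq = freq; z->ch = ch; z->left = l; z->right = r;
    return z;
}

/* Index of the minimum-frequency tree in forest[0..n-1], skipping `skip`. */
static int min_index(struct node **forest, int n, int skip)
{
    int best = -1;
    for (int i = 0; i < n; i++)
        if (i != skip && (best == -1 || forest[i]->freq < forest[best]->freq))
            best = i;
    return best;
}

static struct node *huffman(struct node **forest, int n)
{
    while (n > 1) {
        int a = min_index(forest, n, -1);   /* smallest */
        int b = min_index(forest, n, a);    /* second smallest */
        struct node *z = new_node(forest[a]->freq + forest[b]->freq,
                                  0, forest[a], forest[b]);
        forest[a] = z;                      /* replace a with the new node */
        forest[b] = forest[n - 1];          /* remove b */
        n--;
    }
    return forest[0];
}

/* Print the code of every leaf by walking the tree (0 = left, 1 = right). */
static void print_codes(struct node *t, char *buf, int depth)
{
    if (!t->left && !t->right) {
        buf[depth] = '\0';
        printf("%c: %s\n", t->ch, buf);
        return;
    }
    buf[depth] = '0'; print_codes(t->left, buf, depth + 1);
    buf[depth] = '1'; print_codes(t->right, buf, depth + 1);
}

int main(void)
{
    /* the example above: A..F with frequencies 10..60 */
    char ch[] = { 'A', 'B', 'C', 'D', 'E', 'F' };
    int f[]   = { 10, 20, 30, 40, 50, 60 };
    struct node *forest[6];
    char buf[64];
    for (int i = 0; i < 6; i++)
        forest[i] = new_node(f[i], ch[i], NULL, NULL);
    print_codes(huffman(forest, 6), buf, 0);
    return 0;
}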
Kruskal's Algorithm
This is a greedy algorithm: at each step it makes the locally optimal choice (picking the usable edge of least
weight when building an MST).
Kruskal's algorithm works as follows: take a graph with n vertices and keep adding the shortest (least-cost)
edge, while avoiding the creation of cycles, until (n - 1) edges have been added. Sometimes two or more edges
have the same cost; the order in which such edges are chosen does not matter. Different
MSTs may result, but they will all have the same total cost, which will always be the minimum cost.
Algorithm Kruskal (E, cost, n, t)
// E: set of edges; cost[u, v]: cost of edge (u, v); n: number of vertices;
// t: the MST edges, returned in t[1..n-1, 1..2].
{
    Construct a min-heap of the edges using their costs;
    for i := 1 to n do parent[i] := -1;   // each vertex starts in its own set
    i := 0; mincost := 0.0;
    while ((i < n - 1) and (heap not empty)) do
    {
        Delete a minimum-cost edge (u, v) from the heap and re-heapify;
        j := Find (u); k := Find (v);
        if (j ≠ k) then    // u and v lie in different components: no cycle
        {
            i := i + 1;
            t [i, 1] := u; t [i, 2] := v;
            mincost := mincost + cost [u, v];
            Union (j, k);
        }
    }
    if (i ≠ n-1) then write ("no spanning tree");
    else return mincost;
}
Running time:
• The number of Find operations is at most 2e and the number of Unions at most n - 1. Including the initialization
time for the trees, this part of the algorithm has a complexity only slightly more than O(n + e).
• We can add at most n - 1 edges to the tree T, so the total time for operations on T is O(n).
Summing up the various components of the computing time, we get O(n + e log e) as the asymptotic complexity, dominated by the heap operations on the edges.
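As an illustration, a minimal C sketch of Kruskal's algorithm follows. It sorts the edges with qsort (also O(e log e)) instead of using a heap, and uses a simple, un-optimized union-find; all names (edge, find_root, kruskal) are illustrative.

#include <stdio.h>
#include <stdlib.h>

struct edge { int u, v, cost; };

static int parent[100];                  /* union-find forest, vertices 1..n */

static int find_root(int x)
{
    while (parent[x] != x) x = parent[x];
    return x;
}

static int cmp_edge(const void *a, const void *b)
{
    return ((const struct edge *)a)->cost - ((const struct edge *)b)->cost;
}

/* Returns the MST cost, or -1 if the graph is disconnected. */
static int kruskal(struct edge *e, int num_edges, int n)
{
    int mincost = 0, taken = 0;
    for (int i = 1; i <= n; i++) parent[i] = i;
    qsort(e, num_edges, sizeof e[0], cmp_edge);   /* cheapest edges first */
    for (int i = 0; i < num_edges && taken < n - 1; i++) {
        int j = find_root(e[i].u), k = find_root(e[i].v);
        if (j != k) {                    /* no cycle: accept the edge */
            parent[j] = k;               /* Union */
            mincost += e[i].cost;
            taken++;
        }
    }
    return (taken == n - 1) ? mincost : -1;
}

int main(void)
{
    /* small example graph on 4 vertices */
    struct edge e[] = { {1,2,10}, {2,3,15}, {1,3,5}, {3,4,20}, {2,4,2} };
    printf("MST cost: %d\n", kruskal(e, 5, 4));   /* 2 + 5 + 10 = 17 */
    return 0;
}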
Prim's Algorithm
Prim's algorithm is another greedy algorithm for finding a minimum-cost spanning tree. It grows a single tree from an arbitrary start vertex: at each step it adds the cheapest edge that connects a vertex already in the tree to a vertex outside it, until all n vertices are included.
Running time:
We perform the same set of operations on dist as in Dijkstra's algorithm (initialize the structure, decrease a value m times, select the minimum n - 1 times). Therefore we get O(n²) time when we implement dist with an
array, and O(n + |E| log n) when we implement it with a heap.
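A minimal C sketch of the O(n²), adjacency-matrix version of Prim's algorithm (the names and the INF sentinel are illustrative):

#include <stdio.h>

#define N 5                /* number of vertices (example size) */
#define INF 1000000        /* stands in for "no edge" */

static int prim(int cost[N][N])
{
    int dist[N], in_tree[N], mincost = 0;
    for (int i = 0; i < N; i++) { dist[i] = cost[0][i]; in_tree[i] = 0; }
    in_tree[0] = 1;                       /* start from vertex 0 */
    for (int k = 1; k < N; k++) {
        int u = -1;
        for (int i = 0; i < N; i++)       /* cheapest vertex outside tree */
            if (!in_tree[i] && (u == -1 || dist[i] < dist[u])) u = i;
        in_tree[u] = 1;
        mincost += dist[u];               /* edge that brought u in */
        for (int w = 0; w < N; w++)       /* update edges leaving the tree */
            if (!in_tree[w] && cost[u][w] < dist[w]) dist[w] = cost[u][w];
    }
    return mincost;
}

int main(void)
{
    int cost[N][N] = {
        {   0,  2, INF,  6, INF },
        {   2,  0,  3,   8,  5  },
        { INF,  3,  0, INF,  7  },
        {   6,  8, INF,  0,  9  },
        { INF,  5,  7,   9,  0  },
    };
    printf("MST cost: %d\n", prim(cost));  /* 2 + 3 + 5 + 6 = 16 */
    return 0;
}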
Comparison of Kruskal's and Prim's algorithms:
1. In Kruskal's algorithm, an edge (u, v) of minimum weight is selected from the whole graph to find the MCST. In Prim's algorithm, a minimum-weight edge to a vertex (say, v) adjacent to the tree built so far is selected.
2. In Kruskal's algorithm it is not necessary to choose vertices adjacent to already selected vertices (in any successive step). In Prim's algorithm it is necessary to select a vertex adjacent to the already selected vertices (in any successive step).
3. At intermediate steps of Kruskal's algorithm there may be more than one connected component. At intermediate steps of Prim's algorithm there is only one connected component.
4. Time complexity: Kruskal O(|E| log |V|); Prim O(V²) (with an adjacency matrix).
KNAPSACK PROBLEM
Let us apply the greedy method to solve the knapsack problem. We are given n objects and a knapsack. Object i has a weight wi and the knapsack has a capacity m. If a fraction xi, 0 ≤ xi ≤ 1, of object i is placed into
the knapsack, then a profit of pi·xi is earned. The objective is to fill the knapsack so as to maximize the total profit
earned.
Since the knapsack capacity is m, we require the total weight of all chosen objects to be at most m. Formally: maximize Σ pi·xi subject to Σ wi·xi ≤ m and 0 ≤ xi ≤ 1 for 1 ≤ i ≤ n.
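For example (a standard worked instance), let n = 3, m = 20, (p1, p2, p3) = (25, 24, 15) and (w1, w2, w3) = (18, 15, 10). The profit/weight ratios are 25/18 ≈ 1.39, 24/15 = 1.6 and 15/10 = 1.5, so the greedy order is object 2, then 3, then 1. Take all of object 2 (x2 = 1, weight 15); the remaining capacity 5 allows x3 = 5/10 = 0.5, giving total profit 24 + 7.5 = 31.5, which is optimal for the fractional problem.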
Algorithm
If the objects have already been sorted into non-increasing order of p[i] / w[i], then the algorithm given below
obtains the solution corresponding to this strategy.
Greedy Fractional-Knapsack (P[1..n], W[1..n], X[1..n], M)
/* P[1..n] and W[1..n] contain the profit and weight of the n objects, ordered
such that P[i]/W[i] >= P[i+1]/W[i+1];
X[1..n] is the solution set and M is the capacity of the knapsack */
{
1:  For i ← 1 to n do
2:      X[i] ← 0
3:  profit ← 0      // total profit of items filled in the knapsack
4:  weight ← 0      // total weight of items packed in the knapsack
5:  i ← 1
6:  While (weight < M) do      // M is the knapsack capacity
    {
7:      if (weight + W[i] ≤ M) then
8:          X[i] ← 1
9:          weight ← weight + W[i]
10:     else
11:         X[i] ← (M - weight) / W[i]
12:         weight ← M
13:     profit ← profit + P[i] * X[i]
14:     i ← i + 1
    }
    return X
}
Running time:
The objects are to be sorted into non-increasing order of the pi / wi ratio, which takes O(n log n) time. If we disregard the time to initially sort the objects, the greedy loop itself examines each object at most once and requires only O(n) time, so the total running time is O(n log n).
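A minimal C sketch of this greedy, assuming the fractional formulation above (struct and function names are illustrative):

#include <stdio.h>
#include <stdlib.h>

struct item { double p, w; double x; };   /* profit, weight, chosen fraction */

static int by_ratio_desc(const void *a, const void *b)
{
    double ra = ((const struct item *)a)->p / ((const struct item *)a)->w;
    double rb = ((const struct item *)b)->p / ((const struct item *)b)->w;
    return (ra < rb) - (ra > rb);          /* descending p/w ratio */
}

static double fractional_knapsack(struct item *it, int n, double M)
{
    double weight = 0.0, profit = 0.0;
    qsort(it, n, sizeof it[0], by_ratio_desc);   /* O(n log n) sort */
    for (int i = 0; i < n && weight < M; i++) {  /* O(n) greedy pass */
        if (weight + it[i].w <= M)
            it[i].x = 1.0;                       /* item fits entirely */
        else
            it[i].x = (M - weight) / it[i].w;    /* take a fraction */
        weight += it[i].x * it[i].w;
        profit += it[i].x * it[i].p;
    }
    return profit;
}

int main(void)
{
    /* the worked instance above: m = 20 */
    struct item it[] = { {25, 18, 0}, {24, 15, 0}, {15, 10, 0} };
    printf("max profit: %.1f\n", fractional_knapsack(it, 3, 20.0)); /* 31.5 */
    return 0;
}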
The analogous analysis applies to job sequencing with deadlines, where the greedy method sorts the jobs by decreasing profit and then schedules each job as late as possible before its deadline. The initial sorting can be done in time O(n log n),
and the first loop takes time O(n). Each body of the second loop (finding a free slot for a job) can be implemented in time O(n), so
that loop takes time O(n²) in total, and the whole algorithm runs in time O(n²). Using a more sophisticated data
structure one can reduce this running time to O(n log n), but in any case it is a polynomial-time algorithm.
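Assuming the standard formulation of job sequencing with deadlines (unit-time jobs; a job's profit is earned only if it finishes by its deadline), a minimal C sketch of the quadratic greedy is:

#include <stdio.h>
#include <stdlib.h>

struct job { int profit, deadline; };

static int by_profit_desc(const void *a, const void *b)
{
    return ((const struct job *)b)->profit - ((const struct job *)a)->profit;
}

/* Consider jobs in decreasing profit order and place each in the latest
   free unit-time slot before its deadline. The slot search makes this
   O(n^2); a union-find over slots reduces it to O(n log n). */
static int job_sequencing(struct job *jobs, int n)
{
    int slot[64] = {0};                 /* slot[t] = 1 if time t is taken;
                                           assumes deadlines < 64 */
    int total = 0;
    qsort(jobs, n, sizeof jobs[0], by_profit_desc);
    for (int i = 0; i < n; i++) {
        for (int t = jobs[i].deadline; t >= 1; t--) {
            if (!slot[t]) {             /* latest free slot before deadline */
                slot[t] = 1;
                total += jobs[i].profit;
                break;
            }
        }
    }
    return total;
}

int main(void)
{
    /* example: (profit, deadline) pairs */
    struct job jobs[] = { {100, 2}, {19, 1}, {27, 2}, {25, 1}, {15, 3} };
    printf("max profit: %d\n", job_sequencing(jobs, 5)); /* 100+27+15 = 142 */
    return 0;
}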
Single Source Shortest Paths (Dijkstra's Algorithm)
Dijkstra's algorithm greedily builds the set S of vertices whose shortest distance from the source v is already known, repeatedly picking the closest vertex not yet in S and updating the distances along its outgoing edges:
Algorithm ShortestPaths (v, cost, dist, n)
// dist[j], 1 <= j <= n, is set to the length of the shortest path from vertex v to vertex j.
{
    for i := 1 to n do
    {
        S[i] := false;              // Initialize S.
        dist[i] := cost[v, i];
    }
    S[v] := true; dist[v] := 0.0;   // Put v in S.
    for num := 2 to n - 1 do        // Determine n - 1 paths from v.
    {
        Choose u from among those vertices not in S such that dist[u] is minimum;
        S[u] := true;               // Put u in S.
        for (each w adjacent to u with S[w] = false) do
            if (dist[w] > dist[u] + cost[u, w]) then    // Update distances.
                dist[w] := dist[u] + cost[u, w];
    }
}
Running time:
With dist kept in a heap: initialization of the structure takes O(n); each decrease of a dist value takes O(log n); each select-minimum takes O(log n). With at most m edge updates and n - 1 selections this gives O(n + m log n) total. With a simple array it is O(n²).
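A minimal C sketch of the O(n²) array implementation described by the pseudocode above (the names and the INF sentinel are illustrative):

#include <stdio.h>

#define N 5
#define INF 1000000   /* stands in for "no edge" */

static void dijkstra(int cost[N][N], int v, int dist[N])
{
    int in_s[N] = {0};
    for (int i = 0; i < N; i++) dist[i] = cost[v][i];
    in_s[v] = 1; dist[v] = 0;             /* put v in S */
    for (int num = 1; num < N; num++) {
        int u = -1;
        for (int i = 0; i < N; i++)       /* closest vertex not yet in S */
            if (!in_s[i] && (u == -1 || dist[i] < dist[u])) u = i;
        in_s[u] = 1;                      /* put u in S */
        for (int w = 0; w < N; w++)       /* update distances through u */
            if (!in_s[w] && cost[u][w] < INF && dist[u] + cost[u][w] < dist[w])
                dist[w] = dist[u] + cost[u][w];
    }
}

int main(void)
{
    int cost[N][N] = {
        {   0,  4, INF, INF,  8  },
        {   4,  0,  3, INF, INF },
        { INF,  3,  0,  2,   6  },
        { INF, INF, 2,  0,   1  },
        {   8, INF, 6,  1,   0  },
    };
    int dist[N];
    dijkstra(cost, 0, dist);
    for (int j = 0; j < N; j++)
        printf("dist[0 -> %d] = %d\n", j, dist[j]);
    return 0;
}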