
ANALYSIS & DESIGN OF ALGORITHMS (BCS401)

Module 4 Syllabus
DYNAMIC PROGRAMMING: Three basic examples, The Knapsack Problem and Memory Functions, Warshall's and Floyd's Algorithms.
THE GREEDY METHOD: Prim's Algorithm, Kruskal's Algorithm, Dijkstra's Algorithm, Huffman Trees and Codes.

LECTURE 25:

DYNAMIC PROGRAMMING:

Dynamic programming is a technique for solving problems with overlapping subproblems. Typically, these subproblems arise from a recurrence relating a given problem's solution to solutions of its smaller subproblems. Rather than solving overlapping subproblems again and again, dynamic programming suggests solving each of the smaller subproblems only once and recording the results in a table from which a solution to the original problem can then be obtained.
This technique can be illustrated by revisiting the Fibonacci numbers. The Fibonacci numbers are
the elements of the sequence
0, 1, 1, 2, 3, 5, 8, 13, 21, 34, . . . ,
which can be defined by the simple recurrence
F(n) = F(n − 1) + F(n − 2) for n > 1, with initial conditions F(0) = 0 and F(1) = 1.
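As a small illustration (a Python sketch, not part of the original notes), filling a table left to right computes each Fibonacci number exactly once:

def fib(n):
    # F[0..n] records every subproblem's result exactly once.
    if n < 2:
        return n
    F = [0] * (n + 1)
    F[1] = 1
    for i in range(2, n + 1):
        F[i] = F[i - 1] + F[i - 2]   # reuse the recorded results
    return F[n]

print(fib(9))                        # 34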

Since a majority of dynamic programming applications deal with optimization problems, we also need to mention a general principle that underlies such applications. Richard Bellman called it the principle of optimality: an optimal solution to any instance of an optimization problem is composed of optimal solutions to its subinstances.

4.1. THREE BASIC EXAMPLES:


4.1.1. EXAMPLE 1: Coin-row problem
Problem Statement: There is a row of n coins whose values are some positive integers c1, c2, . . . , cn, not necessarily distinct. The goal is to pick up the maximum amount of money subject to the constraint that no two coins adjacent in the initial row can be picked up.

Let F(n) be the maximum amount that can be picked up from the row of n coins. To derive a recurrence for F(n), we partition all the allowed coin selections into two groups: those that include the last coin and those without it. The largest amount we can get from the first group is equal to cn + F(n − 2): the value of the nth coin plus the maximum amount we can pick up from the first n − 2 coins. The maximum amount we can get from the second group is equal to F(n − 1) by the definition of F(n).

Thus, we have the following recurrence subject to the obvious initial conditions:
F(n) = max{cn + F(n − 2), F(n − 1)} for n > 1,
F(0) = 0, F(1) = c1.

Using the CoinRow algorithm to find F(n), the largest amount of money that can be picked up, as well as the coins composing an optimal set, clearly takes Θ(n) time and Θ(n) space.
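The CoinRow pseudocode itself is not reproduced in these notes; the following Python sketch (the function name and list representation are assumptions) implements the recurrence above:

def coin_row(coins):
    # coins: the list of values c1, ..., cn.
    n = len(coins)
    if n == 0:
        return 0
    F = [0] * (n + 1)
    F[1] = coins[0]                            # F(0) = 0, F(1) = c1
    for i in range(2, n + 1):
        F[i] = max(coins[i - 1] + F[i - 2], F[i - 1])
    return F[n]

print(coin_row([5, 1, 2, 10, 6, 2]))           # 17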
Problem 1: The application of the algorithm to the coin row of denominations 5, 1, 2, 10, 6, 2 fills the table as follows:

index   0   1   2   3   4   5   6
c           5   1   2  10   6   2
F       0   5   5   7  15  15  17

The maximum amount that can be picked up is F(6) = 17, obtained by picking coins c1 = 5, c4 = 10, and c6 = 2.

4.1.2. EXAMPLE 2: Change-making problem
Consider the general instance of the following well-known problem: give change for amount n using the minimum number of coins of denominations d1 < d2 < . . . < dm. For the coin denominations used in the United States, as for those used in most if not all other countries, a simple greedy method gives an optimal answer. Here, we consider a dynamic programming algorithm for the general case, assuming availability of unlimited quantities of coins for each of the m denominations
d1 < d2 < . . . < dm, where d1 = 1.
Let F(n) be the minimum number of coins whose values add up to n; it is convenient to define F(0) = 0. The amount n can only be obtained by adding one coin of denomination dj to the amount n − dj for j = 1, 2, . . . , m such that n ≥ dj. Therefore, we can consider all such denominations and select the one minimizing F(n − dj) + 1. Since 1 is a constant, we can, of course, find the smallest F(n − dj) first and then add 1 to it. Hence, we have the following recurrence for F(n):

F(n) = min{F(n − dj) : j such that n ≥ dj} + 1 for n > 0, F(0) = 0.
We can compute F(n) by filling a one-row table left to right, in a manner similar to the way it was done above for the coin-row problem, but computing a table entry here requires finding the minimum of up to m numbers.
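A Python sketch of this computation (the float('inf') sentinel is a choice of this sketch; d1 = 1 guarantees a solution always exists):

def change_making(n, denominations):
    INF = float("inf")
    F = [0] + [INF] * n                   # F(0) = 0
    for amount in range(1, n + 1):
        # add one coin of each usable denomination and keep the minimum
        F[amount] = min(F[amount - d] for d in denominations if d <= amount) + 1
    return F[n]

print(change_making(6, [1, 3, 4]))        # 2 (two 3's)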

The application of the algorithm to amount n = 6 and denominations 1, 3, 4 fills the table as follows:

n      0   1   2   3   4   5   6
F(n)   0   1   2   1   1   2   2

The answer it yields is two coins. The time and space efficiencies of the algorithm are obviously O(nm) and Θ(n), respectively.

For the instance considered (for n = 6), the minimum was produced by d2 = 3. The second minimum
(for n = 6 − 3) was also produced for a coin of that denomination. Thus, the minimum-coin set for n =
6 is two 3’s.

4.1.3. EXAMPLE 3: Coin-collecting problem
Problem Statement: Several coins are placed in cells of an n × m board, no more than one coin per cell. A robot, located in the upper left cell of the board, needs to collect as many of the coins as possible and bring them to the bottom right cell. On each step, the robot can move either one cell to the right or one cell down from its current location. When the robot visits a cell with a coin, it always picks up that coin. Design an algorithm to find the maximum number of coins the robot can collect and a path it needs to follow to do this.

Solution:
 Let F(i, j) be the largest number of coins the robot can collect and bring to the cell (i, j) in the ith row and jth column of the board. It can reach this cell either from the adjacent cell (i − 1, j) above it or from the adjacent cell (i, j − 1) to the left of it.
 The largest numbers of coins that can be brought to these cells are F(i − 1, j) and F(i, j − 1), respectively. Of course, there are no adjacent cells above the first row or to the left of the first column. For such cells, we assume that the corresponding values are 0.
 Hence, the largest number of coins the robot can bring to cell (i, j) is the maximum of the two numbers F(i − 1, j) and F(i, j − 1), plus the one possible coin at cell (i, j) itself.
Recurrence:

F(i, j) = max{F(i − 1, j), F(i, j − 1)} + cij for 1 ≤ i ≤ n, 1 ≤ j ≤ m,
F(0, j) = 0 for 1 ≤ j ≤ m and F(i, 0) = 0 for 1 ≤ i ≤ n,

where cij = 1 if there is a coin in cell (i, j) and cij = 0 otherwise.
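A Python sketch of the table-filling computation implied by this recurrence (representing the board as a 0/1 matrix is an assumption of this sketch):

def robot_coin_collection(C):
    # C: n x m matrix with C[i][j] = 1 if cell (i+1, j+1) holds a coin.
    n, m = len(C), len(C[0])
    F = [[0] * (m + 1) for _ in range(n + 1)]    # row 0 / column 0 supply the
    for i in range(1, n + 1):                    # zero "neighbours"
        for j in range(1, m + 1):
            F[i][j] = max(F[i - 1][j], F[i][j - 1]) + C[i - 1][j - 1]
    return F[n][m]                               # coins brought to cell (n, m)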


Fig 4.1.3: (a) Coins to collect. (b) Dynamic programming algorithm results. (c) Two paths to collect 5 coins, the maximum number of coins possible.

REVIEW QUESTION ON 4.1


1. What does dynamic programming have in common with divide-and-conquer?
2. What is a principal difference between dynamic programming and divide-and-conquer?
3. What is the time efficiency of solving the coin-row problem by straightforward application of its recurrence?
4. Apply the dynamic programming algorithm to find all the solutions to the change-making problem for the denominations 1, 3, 5 and the amount n = 9.

LECTURE 26:

4.2: The Knapsack Problem and Memory Functions

4.2.1. Knapsack problem: Given n items of known weights w1, . . . , wn and values v1, . . . , vn and a knapsack of capacity W, find the most valuable subset of the items that fits into the knapsack.

To design a dynamic programming algorithm, we need to derive a recurrence relation that expresses a solution to an instance of the knapsack problem in terms of solutions to its smaller subinstances. Let us consider an instance defined by the first i items, 1 ≤ i ≤ n, with weights w1, . . . , wi, values v1, . . . , vi, and knapsack capacity j, 1 ≤ j ≤ W. Let F(i, j) be the value of an optimal solution to this instance, i.e., the value of the most valuable subset of the first i items that fit into the knapsack of capacity j. We can divide all the subsets of the first i items that fit the knapsack of capacity j into two categories: those that do not include the ith item and those that do.
Note the following:
 Among the subsets that do not include the ith item, the value of an optimal subset is, by
definition, F(i − 1, j).
 Among the subsets that do include the ith item (hence, j – wi ≥ 0), an optimal subset is made
up of this item and an optimal subset of the first i − 1 items that fits into the knapsack of
capacity j − wi . The value of such an optimal subset is vi + F(i − 1, j − wi).

Thus, the value of an optimal solution among all feasible subsets of the first i items is the maximum of these two values:

F(i, j) = max{F(i − 1, j), vi + F(i − 1, j − wi)} if j − wi ≥ 0,
F(i, j) = F(i − 1, j) if j − wi < 0.

It is convenient to define the initial conditions as follows: F(0, j) = 0 for j ≥ 0 and F(i, 0) = 0 for i ≥ 0.
Our goal is to find F(n, W), the maximal value of a subset of the n given items that fit into the
knapsack of capacity W, and an optimal subset itself.


EXAMPLE: The backtracing below corresponds to the following instance (reconstructed here; it is the textbook's standard example) with knapsack capacity W = 5:

item   weight   value
 1        2      $12
 2        1      $10
 3        3      $20
 4        2      $15

Thus, the maximal value is F(4, 5) = $37. We can find the composition of an optimal subset by backtracing the computations of this entry in the table. Since F(4, 5) > F(3, 5), item 4 has to be included in an optimal solution along with an optimal subset for filling the 5 − 2 = 3 remaining units of the knapsack capacity. The value of the latter is F(3, 3). Since F(3, 3) = F(2, 3), item 3 need not be in an optimal subset. Since F(2, 3) > F(1, 3), item 2 is a part of an optimal selection, which leaves element F(1, 3 − 1) to specify its remaining composition. Similarly, since F(1, 2) > F(0, 2), item 1 is the final part of the optimal solution
{item 1, item 2, item 4}.

The time efficiency and space efficiency of this algorithm are both in Θ(nW).
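A bottom-up Python sketch of this algorithm (the call below repeats the instance reconstructed above):

def knapsack(weights, values, W):
    # F[i][j]: value of the best subset of the first i items with capacity j.
    n = len(weights)
    F = [[0] * (W + 1) for _ in range(n + 1)]    # row 0 and column 0 are 0
    for i in range(1, n + 1):
        for j in range(1, W + 1):
            F[i][j] = F[i - 1][j]                # case 1: skip item i
            if j >= weights[i - 1]:              # case 2: take item i
                F[i][j] = max(F[i][j],
                              values[i - 1] + F[i - 1][j - weights[i - 1]])
    return F[n][W]

print(knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # 37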

4.2.2. Memory Functions


Dynamic programming deals with problems whose solutions satisfy a recurrence relation with overlapping subproblems. The direct top-down approach to finding a solution to such a recurrence leads to an algorithm that solves common subproblems more than once and hence is very inefficient. The classic dynamic programming approach, on the other hand, works bottom up: it fills a table with solutions to all smaller subproblems, but each of them is solved only once. An unsatisfying aspect of this approach is that solutions to some of these smaller subproblems are often not necessary for getting a solution to the problem given. Since this drawback is not present in the top-down approach, it is natural to try to combine the strengths of the top-down and bottom-up approaches. The goal is to get a method that solves only subproblems that are necessary and does so only once. Such a method exists; it is based on using memory functions. This method solves a given problem in the top-down manner but, in addition, maintains a table of the kind that would have been used by a bottom-up dynamic programming algorithm.
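A Python sketch of the memory-function version of the knapsack algorithm (the nested helper and the −1 sentinel marking entries not yet computed are choices of this sketch):

def mf_knapsack(weights, values, W):
    n = len(weights)
    V = [[-1] * (W + 1) for _ in range(n + 1)]   # -1 = not yet computed
    for j in range(W + 1):
        V[0][j] = 0                              # initial conditions:
    for i in range(n + 1):
        V[i][0] = 0                              # row 0 and column 0 are 0

    def mf(i, j):
        if V[i][j] < 0:                          # compute only if still unknown
            if j < weights[i - 1]:
                value = mf(i - 1, j)
            else:
                value = max(mf(i - 1, j),
                            values[i - 1] + mf(i - 1, j - weights[i - 1]))
            V[i][j] = value
        return V[i][j]

    return mf(n, W)

print(mf_knapsack([2, 1, 3, 2], [12, 10, 20, 15], 5))   # 37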


For the knapsack instance above, only 11 out of 20 nontrivial values (i.e., not those in row 0 or in column 0) are computed by the memory function method. Just one nontrivial entry, V(1, 2), is retrieved rather than being recomputed.

REVIEW QUESTION ON 4.2

1. What is the knapsack problem?

2. What is the significance of the subproblems in the dynamic programming solution for the knapsack problem?
3. What are some real-world applications of the knapsack problem?

LECTURE 27:

4.3: Warshall’s and Floyd’s Algorithms


Warshall’s and Floyd’s Algorithms: Warshall’s algorithm for computing the transitive closure (there
is a path between any two nodes) of a directed graph and Floyd’s algorithm for the all-pairs shortest-
paths problem. These algorithms are based on dynamic programming.

4.3.1. Warshall’s Algorithm (All-Pairs Path Existence Problem)


A directed graph (or digraph) is a graph, i.e., a set of vertices connected by edges, where the edges have a direction associated with them.
The adjacency matrix A = {aij} of a directed graph is the boolean matrix that has 1 in its ith row and jth column if and only if there is a directed edge from the ith vertex to the jth vertex.
The transitive closure of a directed graph with n vertices can be defined as the n × n boolean matrix T = {tij}, in which the element in the ith row and the jth column is 1 if there exists a nontrivial path (i.e., a directed path of positive length) from the ith vertex to the jth vertex; otherwise, tij is 0.

FIGURE 4.3.1 (a) Digraph. (b) Its adjacency matrix. (c) Its transitive closure.
The transitive closure of a digraph can be generated with the help of depth-first search or breadth-first search: performing either traversal starting at the ith vertex identifies all vertices reachable from it and hence yields the ith row of the transitive closure; doing so from every vertex yields the entire matrix.
Warshall’s algorithm constructs the transitive closure through a series of n × n boolean matrices: R(0), .
. . , R(k−1), R(k), . . . R(n) .
The element rij(k) in the ith row and jth column of matrix R(k) (i, j = 1, 2, . . . , n, k = 0, 1, . . . , n) is
equal to 1 if and only if there exists a directed path of a positive length from the ith vertex to the jth
vertex with each intermediate vertex, if any, numbered not higher than k.

Steps to compute R(0), . . . , R(k−1), R(k), . . . , R(n):
 The series starts with R(0), which does not allow any intermediate vertices in its paths; hence, R(0) is nothing other than the adjacency matrix of the digraph.
 R(1) contains the information about paths that can use the first vertex as intermediate; it may contain more 1's than R(0).
 In general, each subsequent matrix in the series has one more vertex to use as intermediate for its paths than its predecessor.
 The last matrix in the series, R(n), reflects paths that can use all n vertices of the digraph as intermediate and hence is nothing other than the digraph's transitive closure.

FIGURE 4.3.1(d) Rule for changing zeros in Warshall’s algorithm.

All the elements of each matrix R(k) are computed from its immediate predecessor R(k−1). Let rij(k), the element in the ith row and jth column of matrix R(k), be equal to 1. This means that there exists a path from the ith vertex vi to the jth vertex vj with each intermediate vertex numbered not higher than k. If none of the intermediate vertices is numbered k, such a path exists with intermediate vertices numbered not higher than k − 1 (hence, rij(k−1) = 1). Otherwise, the path can be split at vertex vk: the first part means that there exists a path from vi to vk with each intermediate vertex numbered not higher than k − 1 (hence, rik(k−1) = 1), and the second part means that there exists a path from vk to vj with each intermediate vertex numbered not higher than k − 1 (hence, rkj(k−1) = 1).
Thus, we have the following formula generating the elements of matrix R(k) from the elements of matrix R(k−1):

rij(k) = rij(k−1) or (rik(k−1) and rkj(k−1)).


Warshall’s Algorithm (pseudocode and analysis):

Application of Warshall’s algorithm to the digraph shown:

Note: New 1’s are in bold

REVIEW QUESTION ON 4.3.1


1. How do you find the transitive closure using Warshall's algorithm?
2. Apply Warshall's algorithm to find the transitive closure of the digraph defined by the following adjacency matrix:
0 1 0 0
0 0 1 0
0 0 0 1
0 0 0 0
3. What is the time complexity of Warshall's algorithm?
4. What are the advantages of Warshall's algorithm?


LECTURE 28:

4.3.2. Floyd’s Algorithms:

Given a weighted connected graph (undirected or directed), the all-pairs shortest- paths problem asks
to find the distances—i.e., the lengths of the shortest paths— from each vertex to all other vertices. This
is one of several variations of the problem involving shortest paths in graphs.
It is convenient to record the lengths of shortest paths in an n × n matrix D called the distance matrix:
the element dij in the ith row and the jth column of this matrix indicates the length of the shortest path
from the ith vertex to the jth vertex.
For an example,

We can generate the distance matrix with an algorithm that is very similar to Warshall's algorithm; it is called Floyd's algorithm.
Floyd's algorithm computes the distance matrix of a weighted graph with n vertices through a series of n × n matrices:
D(0), . . . , D(k−1), D(k), . . . , D(n)
The element dij(k) in the ith row and the jth column of matrix D(k) (i, j = 1, 2, . . . , n, k = 0, 1,. . . , n) is
equal to the length of the shortest path among all paths from the ith vertex to the jth vertex with each
intermediate vertex, if any, numbered not higher than k.

Steps to compute D(0), . . . , D(k−1), D(k), . . . , D(n)


 The series starts with D(0), which does not allow any intermediate vertices in its paths; hence, D(0) is simply the weight matrix of the graph.
 As in Warshall's algorithm, we can compute all the elements of each matrix D(k) from its immediate predecessor D(k−1).
 The last matrix in the series, D(n), contains the lengths of the shortest paths among all paths that can use all n vertices as intermediate and hence is nothing other than the distance matrix.

Let dij(k) be the element in the ith row and the jth column of matrix D(k). This means that dij(k) is equal to the length of the shortest path among all paths from the ith vertex vi to the jth vertex vj with their intermediate vertices numbered not higher than k.

Fig. 4.3.2 Underlying idea of Floyd’s algorithm.


The length of the shortest path can be computed by the following recurrence:

dij(k) = min{dij(k−1), dik(k−1) + dkj(k−1)} for k ≥ 1, dij(0) = wij,

where wij is the weight of the edge from the ith vertex to the jth vertex (taken to be ∞ if there is no such edge).
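A Python sketch of Floyd's algorithm (an assumption, computing D(k) in place from D(k−1); missing edges are represented by float('inf')). Like Warshall's algorithm, it runs in Θ(n³) time.

def floyd(W):
    # W: n x n weight matrix, 0 on the diagonal, float('inf') for absent edges.
    n = len(W)
    D = [row[:] for row in W]          # D(0) is the weight matrix
    for k in range(n):                 # allow vertex k as an intermediate
        for i in range(n):
            for j in range(n):
                D[i][j] = min(D[i][j], D[i][k] + D[k][j])
    return D                           # D(n), the distance matrix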

Application of Floyd’s algorithm to the digraph shown:

Note: Updated elements are shown in bold

REVIEW QUESTION ON 4.3.2


1. What is Floyd's algorithm used for?
2. What is the complexity of Floyd's algorithm?
3. Which type of graph does Floyd's algorithm work on?
4. Solve the all-pairs shortest-path problem for the digraph with the following weight matrix:

0 2 ∞ 1 8
6 0 3 2 ∞
∞ ∞ 0 4 ∞
∞ ∞ 2 0 3
3 ∞ ∞ ∞ 0

LECTURE 29:

THE GREEDY METHOD:

The greedy method is a straightforward design technique applicable to a variety of problems. The greedy approach suggests constructing a solution through a sequence of steps, each expanding a partially constructed solution obtained so far, until a complete solution to the problem is reached. On each step the choice made must be:
 feasible, i.e., it has to satisfy the problem's constraints;
 locally optimal, i.e., it has to be the best local choice among all feasible choices available on that step;
 irrevocable, i.e., once made, it cannot be changed on subsequent steps of the algorithm.

As a rule, greedy algorithms are both intuitively appealing and simple. Given an optimization problem, it is usually easy to figure out how to proceed in a greedy manner, possibly after considering a few small instances of the problem. What is usually more difficult is to prove that a greedy algorithm yields an optimal solution (when it does).
The first and most common way to prove that a greedy algorithm is optimal is by mathematical induction.
The second way to prove optimality of a greedy algorithm is to show that on each step it does at least as well as any other algorithm could in advancing toward the problem's goal.

Example: find the minimum number of moves needed for a chess knight to go from one corner of a
100 × 100 board to the diagonally opposite corner. (The knight’s moves are L-shaped jumps: two
squares horizontally or vertically followed by one square in the perpendicular direction.)
A greedy solution is clear here: jump as close to the goal as possible on each move. Thus, if
its start and finish squares are (1,1) and (100, 100), respectively, a sequence of 66 moves such as (1,
1) − (3, 2) − (4, 4) − . . . − (97, 97) − (99, 98) − (100, 100) solves the problem (The number k of two-
move advances can be obtained from the equation 1+ 3k = 100).
Why is this a minimum-move solution? Because if we measure the distance to the goal by the Manhattan distance, which is the sum of the difference between the row numbers and the difference between the column numbers of the two squares in question, the greedy algorithm decreases it by 3 on each move, the largest decrease a knight's move allows. Since the initial Manhattan distance is 99 + 99 = 198, no sequence of fewer than 198/3 = 66 moves can reach the goal.
The third way is simply to show that the final result obtained by a greedy algorithm is optimal
based on the algorithm’s output rather than the way it operates.

Example: Consider the problem of placing the maximum number of chips on an 8 × 8 board so that no two chips are placed on the same square or on adjacent squares, vertically, horizontally, or diagonally.

(a) Placement of 16 chips on non-adjacent squares. (b) Partition of the board proving impossibility of placing more than 16 chips.

The partition in (b) divides the board into sixteen 2 × 2 squares. It is impossible to place more than one chip in each of these squares, which implies that the total number of nonadjacent chips on the board cannot exceed 16.

Greedy Technique algorithms are:


 Prim’s algorithm
 Kruskal's Algorithm
 Dijkstra's Algorithm
 Huffman Trees

Two classic algorithms for the minimum spanning tree problem are Prim's algorithm and Kruskal's algorithm. They solve the same problem by applying the greedy approach in two different ways, and both of them always yield an optimal solution. Another classic algorithm, Dijkstra's algorithm, uses the greedy technique to solve the shortest-path problem in a weighted graph. Huffman codes are an important data-compression method that can be interpreted as an application of the greedy technique.

4.4. Prim’s Algorithm:

The minimum spanning tree problem is the problem of finding a minimum spanning tree for a given
weighted connected graph.

Fig 4.4.1. Graph and its spanning trees, with T1 being the minimum spanning tree.

If we were to try constructing a minimum spanning tree by exhaustive search, we would face two serious
obstacles.
 First, the number of spanning trees grows exponentially with the graph size (at least for dense
graphs).
 Second, generating all spanning trees for a given graph is not easy; in fact, it is more difficult
than finding a minimum spanning tree for a weighted graph.

Prim’s algorithm constructs a minimum spanning tree through a sequence of expanding subtrees.
The initial subtree in such a sequence consists of a single vertex selected arbitrarily from the set V of
the graph’s vertices. On each iteration, the algorithm expands the current tree in the greedy manner by
simply attaching to it the nearest vertex not in that tree. The algorithm stops after all the graph’s vertices
have been included in the tree being constructed

If a graph is represented by its adjacency lists and the priority queue is implemented as a min-heap, the running time of the algorithm is O(|E| log |V|) in a connected graph, where |V| − 1 ≤ |E|.
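The pseudocode is not reproduced in these notes; the Python sketch below makes the assumptions that the graph is a dict of adjacency lists, that vertex labels are comparable (so heap ties resolve), and that stale heap entries are skipped instead of performing a decrease-key operation. A connected graph is assumed.

import heapq

def prim(graph, start):
    # graph: dict mapping each vertex to a list of (neighbour, weight) pairs.
    visited = {start}                        # vertices already in the tree
    mst_edges = []
    fringe = [(w, start, v) for v, w in graph[start]]
    heapq.heapify(fringe)
    while fringe and len(visited) < len(graph):
        w, u, v = heapq.heappop(fringe)      # nearest fringe vertex
        if v in visited:
            continue                         # stale entry; skip it
        visited.add(v)
        mst_edges.append((u, v, w))
        for x, wx in graph[v]:
            if x not in visited:
                heapq.heappush(fringe, (wx, v, x))
    return mst_edges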

Application of Prim’s algorithm to the digraph shown:

Note: The parenthesized labels of a vertex in the middle column indicate the nearest tree vertex and
edge weight; selected vertices and edges are in bold.

REVIEW QUESTION ON 4.4
1. How do you solve questions on a greedy algorithm?
2. Which technique is used in Prim's algorithm?
3. What is the time complexity of Prim's algorithm?
4. How do you solve a spanning tree problem?
5. Does Prim's algorithm always work correctly on graphs with negative edge weights?
LECTURE 30:

4.5.Kruskal’s Algorithm
Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected graph
G= {V, E} as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is the
smallest. the algorithm constructs a minimum spanning tree as an expanding sequence of subgraphs
that are always acyclic but are not necessarily connected on the intermediate stages of the algorithm.
The algorithm begins by sorting the graph’s edges in nondecreasing order of their weights.
Then, starting with the empty subgraph, it scans this sorted list, adding the next edge on the list to
the current subgraph if such an inclusion does not create a cycle and simply skipping the edge
otherwise. Kruskal’s algorithm looks at a minimum spanning tree of a weighted connected
graphG = (V, E) as an acyclic subgraph with |V| − 1 edges for which the sum of the edge weights is
the smallest.

The initial forest consists of |V| trivial trees, each comprising a single vertex of the graph. The final forest consists of a single tree, which is a minimum spanning tree of the graph. On each iteration, the algorithm takes the next edge (u, v) from the sorted list of the graph's edges, finds the trees containing the vertices u and v, and, if these trees are not the same, unites them in a larger tree by adding the edge (u, v). Fortunately, there are efficient algorithms for doing so, including the crucial check for whether two vertices belong to the same tree. They are called union-find algorithms. With an efficient
union-find algorithm, the running time of Kruskal’s algorithm will be O(|E| log |E|).
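A Python sketch of Kruskal's algorithm (the edge-list representation and the bare-bones quick union, without union by size or path compression, are choices of this sketch; see Section 4.5.1 for the better union-find variants):

def kruskal(n, edges):
    # n vertices labelled 0..n-1; edges is a list of (weight, u, v) triples.
    parent = list(range(n))

    def find(x):                        # follow parent pointers to the root
        while parent[x] != x:
            x = parent[x]
        return x

    mst = []
    for w, u, v in sorted(edges):       # nondecreasing order of weights
        ru, rv = find(u), find(v)
        if ru != rv:                    # different trees: the edge adds no cycle
            parent[rv] = ru             # union the two trees
            mst.append((u, v, w))
    return mst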
Application of Kruskal’s algorithm to the digraph shown:

Note: Selected edges are shown in bold.


4.5.1. Disjoint Subsets and Union-Find Algorithms

Kruskal’s algorithm is one of a number of applications that require a dynamic partition of some n
element set S into a collection of disjoint subsets S1, S2, . . . , Sk . After being initialized as a collection
of n one-element subsets, each containing a different element of S, the collection is subjected to a
sequence of intermixed union and find operations. (Note that the number of union operations in any
such sequence must be bounded above by n − 1 because each union increases a subset’s size at least
by 1 and there are only n elements in the entire set S.) Thus, we are Greedy Technique dealing here
with an abstract data type of a collection of disjoint subsets of a finite set with the following
operations:

 makeset(x) creates a one-element set {x}. It is assumed that this operation can be applied to
each of the elements of set S only once.

 find(x) returns a subset containing x.

 union(x, y) constructs the union of the disjoint subsets Sx and Sy containing x and y,
respectively, and adds it to the collection to replace Sx and Sy , which are deleted from it.

For example, let S = {1, 2, 3, 4, 5, 6}. Then makeset(i) creates the set {i} and applying this operation
six times initializes the structure to the collection of six singleton sets:
{1}, {2}, {3}, {4}, {5}, {6}.

Performing union(1, 4) and union(5, 2) yields {1, 4}, {5, 2}, {3}, {6},

and, if followed by union(4, 5) and then by union(3, 6), we end up with the disjoint subsets
{1, 4, 5, 2}, {3, 6}.

There are two principal alternatives for implementing this data structure. The first one, called the
quick find, optimizes the time efficiency of the find operation; the second one, called the quick
union, optimizes the union operation.

The quick find uses an array indexed by the elements of the underlying set S; the array’s values
indicate the representatives of the subsets containing those elements. Each subset is implemented as
a linked list whose header contains the pointers to the first and last elements of the list along with the
number of elements in the list, for an example:


Linked-list representation of subsets {1, 4, 5, 2} and {3, 6} obtained by quick find after performing
union(1, 4), union(5, 2), union(4, 5), and union(3, 6).
The lists of size 0 are considered deleted from the collection.

Under this scheme, the implementation of makeset(x) requires assigning the corresponding element in the representative array to x and initializing the corresponding linked list to a single node with the x value. The time efficiency of this operation is obviously in Θ(1), and hence the initialization of n singleton subsets is in Θ(n). The efficiency of find(x) is also in Θ(1): all we need to do is to retrieve the x's representative in the representative array. Executing union(x, y) takes longer.
A straightforward solution would simply append the y's list to the end of the x's list, update the information about their representative for all the elements in the y list, and then delete the y's list from the collection. It is easy to verify, however, that with this algorithm the sequence of union operations

union(2, 1), union(3, 2), . . . , union(i + 1, i), . . . , union(n, n − 1)

runs in Θ(n²) time, which is slow compared with several known alternatives.

A simple way to improve the overall efficiency of a sequence of union operations is to always append
the shorter of the two lists to the longer one, with ties broken arbitrarily.

The size of each list is assumed to be available by, say, storing the number of elements in the list's header. This modification is called the union by size. Though it does not improve the worst-case efficiency of a single application of the union operation (it is still in Θ(n)), the worst-case running time of any legitimate sequence of union-by-size operations turns out to be in O(n log n).

Proof: Under union by size, an element's representative changes only when the element's list is appended to a list that is at least as long; hence, each time an element's representative is updated, the element ends up in a list at least twice the size of the one it was in before. Since a list cannot contain more than n elements, each element's representative can be updated at most log2 n times.
Therefore, the total number of possible updates of the representatives for all n elements in S will not exceed n log2 n.
Thus, for union by size, the time efficiency of a sequence of at most n − 1 unions and m finds is in O(n log n + m).

The quick union, the second principal alternative for implementing disjoint subsets, represents each subset by a rooted tree. The nodes of the tree contain the subset's elements (one per node), with the root's element considered the subset's representative.
The tree's edges are directed from children to their parents, as shown in Figure 4.5.1. In addition, a mapping of the set elements to their tree nodes (implemented, say, as an array of pointers) is maintained.


Figure 4.5.1: (a) Forest representation of subsets {1, 4, 5, 2} and {3, 6} used by quick
union. (b) Result of union(5, 6)

For this implementation, makeset(x) requires the creation of a single-node tree, which is a Θ(1) operation; hence, the initialization of n singleton subsets is in Θ(n). A union(x, y) is implemented by attaching the root of the y's tree to the root of the x's tree (and deleting the y's tree from the collection by making the pointer to its root null). The time efficiency of this operation is clearly Θ(1). A find(x) is performed by following the pointer chain from the node containing x to the tree's root, whose element is returned as the subset's representative. Accordingly, the time efficiency of a single find operation is in O(n) because a tree representing a subset can degenerate into a linked list with n nodes.

Better efficiency can be obtained by combining either variety of quick union with path compression. This modification makes every node encountered during the execution of a find operation point to the tree's root.
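A Python sketch combining quick union, union by size, and path compression (the class name and dict representation are illustrative assumptions):

class DisjointSets:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # makeset for every element
        self.size = {x: 1 for x in elements}

    def find(self, x):
        root = x
        while self.parent[root] != root:         # follow pointers to the root
            root = self.parent[root]
        while self.parent[x] != root:            # path compression: make every
            self.parent[x], x = root, self.parent[x]   # visited node point to the root
        return root

    def union(self, x, y):
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return
        if self.size[rx] < self.size[ry]:        # attach the smaller tree
            rx, ry = ry, rx                      # under the root of the larger one
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]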

REVIEW QUESTION ON 4.5


1. What technique does Kruskal's algorithm use?
2. Which data structure is used in Kruskal’s algorithm?
3. What is the time complexity of Kruskal’s algorithm?
4. Why is it important to maintain disjoint subsets in some algorithms?
5. Give an example of a real-world application of the Union-Find data structure
6. What is the maximum number of times an element's representative can be updated in a sequence
of union-by-size operations?
7. How does the structure of the trees in Union-Find affect the performance of the operations?


LECTURE 31:

4.6. Dijkstra’s Algorithm


 Dijkstra’s Algorithm solves the single-source shortest-paths problem.


 For a given vertex called the source in a weighted connected graph, find shortest paths to all its
other vertices.
 The single-source shortest-paths problem asks for a family of paths, each leading from the source
to a different vertex in the graph, though some paths may, of course, have edges in common.
 There are several well-known algorithms for finding shortest paths, including Floyd's algorithm. Here, we consider the best-known algorithm for the single-source shortest-paths problem, called Dijkstra's algorithm. This algorithm is applicable to undirected and directed graphs with nonnegative weights only.

 Dijkstra’s algorithm finds the shortest paths to a graph’s vertices in order of their distance from
a given source.
 First, it finds the shortest path from the source to a vertex nearest to it, then to a second nearest,
and so on. These vertices, the source, and the edges of the shortest paths leading to them from
the source form a subtree Ti of the given graph.
 Since all the edge weights are nonnegative, the next vertex nearest to the source can be found among the vertices adjacent to the vertices of Ti. The set of vertices adjacent to the vertices in Ti can be referred to as "fringe vertices"; they are the candidates from which Dijkstra's algorithm selects the next vertex nearest to the source.

To identify the ith nearest vertex, the algorithm computes, for every fringe vertex u, the sum of the distance to the nearest tree vertex v (given by the weight of the edge (v, u)) and the length dv of the shortest path from the source to v (previously determined by the algorithm) and then selects the vertex with the smallest such sum.

To facilitate the algorithm’s operations, we label each vertex with two labels. The numeric label d
indicates the length of the shortest path from the source to this vertex found by the algorithm so far;
when a vertex is added to the tree, d indicates the length of the shortest path from the source to that

vertex. The other label indicates the name of the next-to-last vertex on such a path, i.e., the parent
of the vertex in the tree being constructed. (It can be left unspecified for the source s and vertices
that are adjacent to none of the current tree vertices.) With such labeling, finding the next nearest
vertex u∗ becomes a simple task of finding a fringe vertex with the smallest d value. Ties can be
broken arbitrarily.

After we have identified a vertex u∗ to be added to the tree, we need to perform two operations:
 Move u∗ from the fringe to the set of tree vertices.
 For each remaining fringe vertex u that is connected to u∗ by an edge of weight w(u∗, u) such that du∗ + w(u∗, u) < du, update the labels of u by u∗ and du∗ + w(u∗, u), respectively.
The time efficiency of Dijkstra’s algorithm depends on the data structures used for
implementing the priority queue and for representing an input graph itself. It is in Θ (|V |2) for graphs
represented by their weight matrix and the priority queue implemented as an unordered array. For
graphs represented by their adjacency lists and the priority queue implemented as a min- heap, it is
in O(|E| log |V |).
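A Python sketch of the adjacency-list, min-heap version (the dict representation and the skipping of stale heap entries instead of a decrease-key operation are choices of this sketch):

import heapq

def dijkstra(graph, source):
    # graph: dict mapping each vertex to a list of (neighbour, weight) pairs;
    # all weights must be nonnegative.
    dist = {v: float("inf") for v in graph}
    dist[source] = 0
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)      # next nearest vertex
        if d > dist[u]:
            continue                    # stale entry; skip it
        for v, w in graph[u]:
            if d + w < dist[v]:         # a shorter path to v via u
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist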
Application of Dijkstra’s Kruskal’s algorithm to the digraph shown


Note: The next closest vertex is shown in bold


The shortest paths (identified by following nonnumeric labels backward from a destination
vertex in the left column to the source) and their lengths (given by numeric labels of the tree vertices)
are as follows:

REVIEW QUESTION ON 4.6
1. What is the primary purpose of Dijkstra's algorithm?
2. Does Dijkstra's algorithm work on graphs with negative weights?
3. What type of graph is required for Dijkstra's algorithm (weighted/unweighted)?
4. What is the initial distance value assigned to the starting node in Dijkstra's algorithm?
5. Does Dijkstra's algorithm guarantee the shortest path in all cases?

LECTURE 32:

4.7. Huffman Trees and Codes:


We want to encode a text that comprises symbols from some n-symbol alphabet by assigning to each of the text's symbols some sequence of bits called the codeword. For example, we can use a fixed-length encoding that assigns to each symbol a bit string of the same length m (m ≥ log2 n). This is exactly what the standard ASCII code does.
Variable-length encoding, which assigns codewords of different lengths to different
symbols, introduces a problem that fixed-length encoding does not have. Namely, how can we tell
how many bits of an encoded text represent the first (or, more generally, the ith) symbol? To avoid
this complication, we can limit ourselves to the so-called prefix-free (or simply prefix) codes.
In a prefix code, no codeword is a prefix of another codeword. Hence, with such an encoding, we can simply scan a bit string until we get the first group of bits that is a codeword for some symbol, replace these bits by this symbol, and repeat this operation until the bit string's end is reached.

Huffman’s algorithm
Step 1: Initialize n one-node trees and label them with the characters of the alphabet. Record the
frequency of each character in its tree’s root to indicate the tree’s weight. (More generally the weight of
a tree will be equal to the sum of the frequencies in the tree’s leaves)

Step 2: Repeat the following operation until a single tree is obtained: find the two trees with the smallest weights, make them the left and right subtrees of a new tree, and record the sum of their weights in the root of the new tree as its weight.


A tree constructed by the above algorithm is called a Huffman tree. It defines, in the manner described above, a Huffman code.

EXAMPLE1 Consider the five-symbol alphabet {A, B, C, D, _} with the following occurrence
frequencies in a text made up of these symbols:
symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
The Huffman tree construction for this input is shown in Figure 4.7

FIG. 4.7 Example of constructing a Huffman coding tree.


The resulting codewords are as follows:

symbol A B C D _
frequency 0.35 0.1 0.2 0.2 0.15
codeword 11 100 00 01 101

Hence, DAD is encoded as 011101, and 10011011011101 is decoded as BAD_AD. With the occurrence frequencies given and the codeword lengths obtained, the average number of bits per symbol in this code is
2 · 0.35 + 3 · 0.1 + 2 · 0.2 + 2 · 0.2 + 3 · 0.15 = 2.25.
Had we used a fixed-length encoding for the same alphabet, we would have to use at least 3 bits per symbol. Thus, for this toy example, Huffman's code achieves a compression ratio (a standard measure of a compression algorithm's effectiveness) of (3 − 2.25)/3 · 100% = 25%. In other words, Huffman's encoding of the text will use 25% less memory than its fixed-length encoding.
Running time is O(n log n), as each priority queue operation takes time O( log n).
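A Python sketch of Huffman's algorithm using heapq as the priority queue (the counter used as a tiebreaker is an implementation choice; since ties between equal weights can be broken differently, the codewords may differ from those in Fig. 4.7, but any Huffman code has the same, optimal average length):

import heapq

def huffman_codes(frequencies):
    # frequencies: dict mapping symbol -> weight; returns symbol -> codeword.
    heap, counter = [], 0
    for sym, f in frequencies.items():
        heap.append((f, counter, sym))           # a leaf carries its symbol
        counter += 1
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)        # two smallest-weight trees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    codes = {}
    def walk(node, code):
        if isinstance(node, tuple):              # internal node: recurse
            walk(node[0], code + "0")
            walk(node[1], code + "1")
        else:
            codes[node] = code or "0"            # one-symbol alphabet edge case
    walk(heap[0][2], "")
    return codes

print(huffman_codes({"A": 0.35, "B": 0.1, "C": 0.2, "D": 0.2, "_": 0.15}))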

Applications of Huffman’s encoding


 Huffman’s encoding is a variable length encoding, so that number of bits used are lesserthan
fixed length encoding.
 Huffman’s encoding is very useful for file compression.
 Huffman’s code is used in transmission of data in an encoded format.
 Huffman’s encoding is used in decision trees and game playing.

REVIEW QUESTION ON 4.7

1. What is Huffman coding used for?


2. What kind of tree is used in Huffman coding?
3. In Huffman coding, what is the first step in creating the code?
4. Does Huffman coding produce a prefix-free code?
5. Can two different sets of data have the same Huffman tree?


QUESTION BANK-MODULE 4

1. Briefly explain the concept of dynamic programming with an example.


2. Solve the instance 5, 1, 2, 10, 6 of the coin-row problem.
3. Apply the bottom-up dynamic programming algorithm to the following instance of the knapsack
problem: capacity W = 6.
Item weight value
1 3 $25
2 2 $20
3 1 $15
4 4 $40
5 5 $50
b. How many different optimal subsets does the instance of part (a) have?
c. In general, how can we use the table generated by the dynamic programming algorithm to
tell whether there is more than one optimal subset for the knapsack problem’s instance?
4. Solve the following instance of 0/1 knapsack problem using dynamic programming.
Knapsack capacity is W=5 and n=4

5. Solve the Knapsack instance n=3, {w1, w2, w3} = {1, 2, 2} and {p1, p2, p3} ={18, 16, 6} and
M=4 by dynamic programming.
6. Apply bottom up dynamic programming algorithm for the following instance of the
knapsack problem. Knapsack capacity = 10.

7. Define transitive closure. Write Warshall's algorithm to compute transitive closure. Find its efficiency.

8. Generate transitive closure of the graph given below.


9. Trace the following graph using Warshall's algorithm to find its transitive closure.

10. Define transitive closure of a directed graph. Find the transitive closure matrix for the graph
whose adjacency matrix is given.

11. What is dynamic programming? Explain how you would solve the all-pairs shortest-path problem using dynamic programming.
12. Apply Floyd's algorithm to find the all-pairs shortest paths for the graph given below.

13. Apply Floyd's algorithm to find the all-pairs shortest paths for the graph given below.

14. Define minimum cost spanning tree. Write Prim's algorithm to find minimum cost spanning tree.
15. Define MST. Write Prim's algorithm to construct minimum cost spanning tree.

16. Write Kruskal's algorithm to construct an MST. Show that the time efficiency is O(|E| log |E|).

17. Apply Prim's and Kruskal's algorithms to the following graph to get an MST. Show the intermediate steps.


18. Obtain minimum cost spanning tree for the graph whose weight matrix is given below.

19. Apply Prim's and Kruskal's algorithms to find the MST. Show the intermediate steps.

20. Write Dijkstra's algorithm to find single-source shortest paths. OR Write an algorithm to find single-source shortest paths.

21. Apply Dijkstra's algorithm to find single-source shortest paths for the graph given below. Consider node 6 as the source.

22. Apply Dijkstra's algorithm to find single-source shortest paths for the graph given below. The source vertex is 5.

23. Explain Huffman coding algorithm. With an example show the construction of Huffman tree
and generate the Huffman code using this tree.
24. Construct the Huffman code for the following data.

Also: i) encode DAD and ADD; ii) decode 10011011011101.


25. Construct a Huffman code for the following data:

Encode ABACABAD using the code. Decode 100010111001010
