Outline of Kruskal's Algorithm
Kruskal's algorithm for computing the minimum spanning tree is directly based on
the generic MST algorithm. It builds the MST as a forest. Initially, each vertex is in
its own tree in the forest. Then the algorithm considers each edge in turn, in order of
increasing weight. If an edge (u, v) connects two different trees, then (u, v) is added
to the set of edges of the MST, and the two trees connected by the edge (u, v) are
merged into a single tree. On the other hand, if an edge (u, v) connects two vertices
in the same tree, then edge (u, v) is discarded.
MAKE_SET (v)
Create a new set whose only member is pointed to by v. For this operation, v must
not already be in any other set.
FIND_SET (v)
Returns a pointer to the representative of the set containing v.
UNION (u, v)
Unites the dynamic sets that contain u and v into a new set that is the union of these
two sets.
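The three disjoint-set operations above can be sketched in Python as follows. This is a minimal illustrative implementation (the class name DisjointSet, the rank field, and the dictionary representation are choices made here, not taken from the text); it uses the union-by-rank and path-compression heuristics discussed in the analysis below.

```python
class DisjointSet:
    """Disjoint-set forest with union by rank and path compression."""

    def __init__(self):
        self.parent = {}
        self.rank = {}

    def make_set(self, v):
        # v must not already belong to any other set.
        self.parent[v] = v
        self.rank[v] = 0

    def find_set(self, v):
        # Path compression: point v directly at the root of its tree.
        if self.parent[v] != v:
            self.parent[v] = self.find_set(self.parent[v])
        return self.parent[v]

    def union(self, u, v):
        # Union by rank: attach the shorter tree under the taller one.
        ru, rv = self.find_set(u), self.find_set(v)
        if ru == rv:
            return
        if self.rank[ru] < self.rank[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru
        if self.rank[ru] == self.rank[rv]:
            self.rank[ru] += 1
```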
MST_KRUSKAL (G, w)
1. A ← { } // A will ultimately contain the
edges of the MST
2. for each vertex v in V[G]
3. do MAKE_SET (v)
4. sort the edges of E by nondecreasing weight w
5. for each edge (u, v) in E, in order of nondecreasing weight
6. do if FIND_SET (u) ≠ FIND_SET (v)
7. then A ← A ∪ {(u, v)}
8. UNION (u, v)
9. return A
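The pseudocode above can be sketched in Python as follows. This is an illustrative version, not a definitive implementation: the edge representation (weight, u, v) is an assumption, and a simple parent dictionary with path compression stands in for the disjoint-set operations.

```python
def mst_kruskal(vertices, edges):
    """Kruskal's algorithm. edges is a list of (weight, u, v) tuples.
    Returns the list A of MST edges."""
    parent = {v: v for v in vertices}      # MAKE_SET for every vertex

    def find_set(v):                       # with path compression
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    A = []
    for w, u, v in sorted(edges):          # nondecreasing weight order
        ru, rv = find_set(u), find_set(v)
        if ru != rv:                       # (u, v) connects two trees
            A.append((u, v))
            parent[ru] = rv                # UNION the two trees
    return A
```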
Analysis
The for-loop in lines 5-8 performs O(|E|) UNION and FIND_SET operations. When
the disjoint-set forest implementation with the weighted-union and path-compression
heuristics is used, these O(|E|) UNION and FIND_SET operations in the for-loop of
lines 5-8 have worst-case time O(|E| lg* |E|). Kruskal's algorithm also requires
sorting the edges of E in line 4. We know that the fastest comparison-based sorting
algorithm needs Ω(|E| lg |E|) time to sort all the edges, so the time for sorting the
edges of G in line 4 dominates the time for the UNION and FIND_SET operations in
the for-loop of lines 5-8. Therefore, when a comparison-based sorting algorithm is
used to perform the sort in line 4, the running time of Kruskal's algorithm is
O(|E| lg |E|).
If the edge weights are integers ranging from 1 to a constant W, we can instead use
COUNTING_SORT, which runs in O(W + E) = O(E) time. Note
that O(W + E) = O(E) because W is a constant. In that case, the asymptotic
bound of Kruskal's algorithm is O(E α(E, V)), the cost of the disjoint-set operations.
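The counting sort of line 4 over small integer weights can be sketched as follows (the function name and the bucket layout are illustrative assumptions). Each edge is dropped into the bucket indexed by its weight, and the buckets are read out in order.

```python
def sort_edges_by_weight(edges, W):
    """Counting sort of (weight, u, v) edges with integer weights in
    1..W. Runs in O(W + E) time, i.e., O(E) when W is a constant."""
    buckets = [[] for _ in range(W + 1)]   # one bucket per weight value
    for w, u, v in edges:
        buckets[w].append((w, u, v))
    # Concatenating the buckets in index order yields the edges
    # in nondecreasing weight order.
    return [e for bucket in buckets for e in bucket]
```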
MST_KRUSKAL (G)
1. for each vertex v in V[G] do
2. define set S(v) ← {v}
3. initialize a priority queue Q that contains all
edges of G, using the weights as keys
4. A ← { } // A will ultimately contain
the edges of the MST
5. while A has fewer than n-1 edges do
6. extract the minimum-weight edge (u, v) from Q;
let S(v) be the set containing v and S(u) the set containing u
7. if S(v) ≠ S(u) then
add edge (u, v) to A
merge S(v) and S(u) into
one set, i.e., UNION
8. return A
Example
The following steps trace Kruskal's algorithm on a sample graph (not shown):
Step 1. Edge (g, h) is the shortest. Either vertex g or vertex h could be the
representative; let's choose vertex g arbitrarily.
Step 2. Edge (c, i) creates a second tree. Choose vertex c as the representative of
this tree.
Step 3. Edge (f, g) is the next shortest edge. Add this edge and choose vertex g as
the representative.
Step 4. Add edge (c, f) and merge the two trees. Vertex c is chosen as the representative.
Step 5. Edge (g, i) is the next cheapest, but adding this edge would create a cycle,
since vertex c is already the representative of both endpoints. Discard it.
Prim's Algorithm
Like Kruskal's algorithm, Prim's algorithm is based on the generic MST algorithm. The
main idea of Prim's algorithm is similar to that of Dijkstra's algorithm for finding
shortest paths in a given graph. Prim's algorithm has the property that the edges in the
set A always form a single tree. We begin with some vertex v in a given graph G = (V,
E), defining the initial set of vertices A. Then, in each iteration, we choose a
minimum-weight edge (u, v) connecting a vertex v in the set A to a vertex u outside
of A. Vertex u is then brought into A. This process is repeated until a spanning
tree is formed. As in Kruskal's algorithm, the important fact about MSTs is that
we always choose the smallest-weight edge joining a vertex inside A to one
outside A. The implication of this fact is that the algorithm adds only edges that are
safe for A; therefore, when the algorithm terminates, the edges in A form an MST.
Algorithm
MST_PRIM (G, w, r)
1. Q ← V[G]
2. for each u in Q do
3. key [u] ← ∞
4. key [r] ← 0
5. π[r] ← NIL
6. while Q is not empty do
7. u ← EXTRACT_MIN (Q)
8. for each v in Adj[u] do
9. if v is in Q and w(u, v) < key [v]
10. then π[v] ← u
11. key [v] ← w(u, v)
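The pseudocode above can be sketched in Python using the standard heapq module as the priority queue. Note one deviation, flagged here because heapq has no DECREASE_KEY operation: a vertex is re-pushed with its new key and stale queue entries are skipped on extraction ("lazy deletion"). The adjacency representation is an assumption.

```python
import heapq

def mst_prim(adj, r):
    """Prim's algorithm from root r. adj maps each vertex to a list of
    (weight, neighbor) pairs. Returns pi, mapping each vertex to its
    MST parent (None for the root)."""
    key = {v: float("inf") for v in adj}
    pi = {v: None for v in adj}
    key[r] = 0
    in_tree = set()
    Q = [(0, r)]
    while Q:
        k, u = heapq.heappop(Q)            # EXTRACT_MIN
        if u in in_tree:
            continue                       # stale entry: skip it
        in_tree.add(u)
        for w, v in adj[u]:
            if v not in in_tree and w < key[v]:
                pi[v] = u                  # record the lighter edge
                key[v] = w
                heapq.heappush(Q, (w, v))  # stands in for DECREASE_KEY
    return pi
```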
Analysis
The performance of Prim's algorithm depends on how we choose to implement the
priority queue Q.
Definitions: Sparse graphs are those for which |E| is much less than |V|², i.e.,
|E| << |V|²; we prefer the adjacency-list representation of the graph in this case. On
the other hand, dense graphs are those for which |E| is close to |V|²; in this case,
we prefer the adjacency-matrix representation.
The time taken by Prim's algorithm is determined by the speed of the queue
operations. With the queue implemented as a Fibonacci heap, it takes O(E +
V lg V) time. Since the keys in the priority queue are edge weights, it might be
possible to implement the queue even more efficiently when there are restrictions on
the possible edge weights. If the edge weights are integers ranging from 1 to some
constant w, we can speed up the algorithm by implementing the queue as an
array Q[0 .. w+1] (using slot w+1 for key = ∞), where each slot holds a
doubly linked list of vertices with that weight as their key. Then EXTRACT_MIN
takes only O(w) = O(1) time (just scan for the first nonempty slot), and
DECREASE_KEY takes only O(1) time (just remove the vertex from the list it is in
and insert it at the front of the list indexed by the new key). This gives a total running
time of O(E), which is the best possible asymptotic time (since Ω(E) edges must be
processed).
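The array-of-lists queue described above can be sketched as follows. This is an illustrative version: the class name is an assumption, and a Python set stands in for each doubly linked list, since both support O(1) insertion and removal.

```python
class BucketQueue:
    """Priority queue for integer keys in 1..w, with slot w+1 reserved
    for key = infinity. EXTRACT_MIN scans at most w+2 slots, which is
    O(1) when w is a constant; DECREASE_KEY is O(1)."""

    def __init__(self, w):
        self.w = w
        self.slots = [set() for _ in range(w + 2)]
        self.key = {}

    def insert(self, v):
        # A newly inserted vertex starts with key = infinity (slot w+1).
        self.key[v] = self.w + 1
        self.slots[self.w + 1].add(v)

    def decrease_key(self, v, k):
        # O(1): unlink v from its current slot, relink at the new key.
        self.slots[self.key[v]].discard(v)
        self.key[v] = k
        self.slots[k].add(v)

    def extract_min(self):
        # Scan for the first nonempty slot and remove one vertex from it.
        for slot in self.slots:
            if slot:
                return slot.pop()
        return None                        # queue is empty
```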
However, if the edge weights are integers ranging from 1 to |V|, the array
implementation does not help. The operation DECREASE_KEY would still run in
constant time, but EXTRACT_MIN would now take O(V) time,
for a total running time of O(E + V²). Therefore, we are better off sticking with
the Fibonacci-heap implementation, which takes O(E + V lg V) time. To get any
advantage out of the integer weights, we would have to use data structures we have
not studied in CLR.
Instead of a heap structure, we can use an array to store the key of each node:
1. A ← V[G] // array
2. for each vertex u in A do
3. key[u] ← ∞
4. key[r] ← 0
5. π[r] ← NIL
6. while array A is not empty do
7. scan over A to find the node u with the smallest
key, and remove it from array A
8. for each vertex v in Adj[u]
9. if v is in A and w[u, v] < key[v] then
10. π[v] ← u
11. key[v] ← w[u, v]
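The array-based pseudocode above can be sketched in Python as follows (the adjacency representation is an assumption; a plain list plays the role of the array A):

```python
def mst_prim_array(adj, r):
    """Prim's algorithm with a plain array instead of a heap.
    adj maps each vertex to a list of (weight, neighbor) pairs.
    Returns pi, mapping each vertex to its MST parent."""
    key = {v: float("inf") for v in adj}   # lines 1-5: initialization
    pi = {v: None for v in adj}
    key[r] = 0
    A = list(adj)                          # the array of pending vertices
    while A:                               # lines 6-11
        u = min(A, key=lambda x: key[x])   # line 7: O(V) scan of A
        A.remove(u)
        for w, v in adj[u]:                # lines 8-11: O(deg[u]) work
            if v in A and w < key[v]:
                pi[v] = u
                key[v] = w
    return pi
```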
Analysis
The for-loop (lines 8-11) takes O(deg[u]) time, since lines 10 and 11 take constant
time. Because it scans the whole array A, line 7 takes O(V) time, and lines 1-5
clearly take O(V) time. Therefore, the while-loop (lines 6-11) needs
O(∑u (V + deg[u])) = O(V² + E)
= O(V²)