Design and Analysis
[Figure: an input graph on vertices 1, 2, 3, 4]
Then BFS from vertex 1 produces the tree on the left if vertex 2 is added before
vertex 4, but it produces the tree on the right if vertex 4 is added before vertex 2:
[Figure: two BFS trees rooted at vertex 1. On the left, vertices 2 and 4 are children
of vertex 1 and vertex 3 is a child of vertex 2; on the right, vertices 4 and 2 are
children of vertex 1 and vertex 3 is a child of vertex 4.]
(d) [2 points] Recall that in the interval scheduling problem, we are given a list of start
times si and finish times fi for all i ∈ {1, 2, . . . , n}, and our goal is to find a largest
possible subset of the intervals [si , fi ] such that no two intervals overlap. Consider
the following greedy algorithm for interval scheduling:
Sort the intervals by starting times
While there are any remaining intervals
Add an interval with the latest starting time to the schedule
Delete all conflicting intervals
Endwhile
True or false: This algorithm outputs an optimal schedule.
Solution: True. This is equivalent to the algorithm discussed in class (schedule in
increasing order of finishing time) if we reverse the direction of time, and this reversal
does not affect the maximum number of intervals that can be scheduled.
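The reversed-time greedy can be sketched in Python. This is a minimal illustration, not code from the text; the function name latest_start_schedule is my own, and intervals are assumed to be (start, finish) pairs:

```python
def latest_start_schedule(intervals):
    """Greedy interval scheduling: repeatedly take the remaining
    interval with the latest start time, then discard all intervals
    that conflict with it."""
    # Scan intervals from latest start to earliest start.
    remaining = sorted(intervals, key=lambda iv: iv[0], reverse=True)
    schedule = []
    last_chosen_start = float("inf")  # start time of the last interval taken
    for s, f in remaining:
        # A later-scanned interval has start <= last_chosen_start, so it
        # conflicts with the last chosen interval iff it finishes after
        # that interval starts.
        if f <= last_chosen_start:
            schedule.append((s, f))
            last_chosen_start = s
    return schedule
```

Reading the sorted list latest-first means "delete all conflicting intervals" reduces to a single comparison against the most recently chosen interval, mirroring the earliest-finish-time implementation under time reversal.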
We can compute ci for every i in time O(m + n) since for every vertex we consider all of
its neighbors, and do a constant amount of additional work. Once we have computed ci
for all i, the final answer is simply the maximum of ci over all i, which can be computed in time O(n). Overall,
the running time of the algorithm is O(m + n).
3. Borůvka’s algorithm.
In 1926, Otakar Borůvka published a method for constructing an efficient electrical net-
work for the country of Moravia. This minimum spanning tree algorithm predates both
the Kruskal and Prim algorithms. Given a connected, undirected input graph G with
positive edge weights w such that no two edges have the same weight, it works as follows:
Borůvka(G, w):
Let F be a forest with the same vertices as G, but no edges
While F has more than one component
Let S be an empty set
For each component C of F
Let {u, v} be the lowest-weight edge with u in C and v not in C
Add edge {u, v} to S
Endfor
Add the edges in S to the forest F
Endwhile
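The pseudocode above can be sketched in Python using a union-find structure to track components. This is an illustrative sketch, not the text's implementation; it assumes vertices are numbered 0 to n-1 and edges are given as (weight, u, v) triples with distinct weights:

```python
def boruvka(n, edges):
    """Borůvka's MST algorithm: each round, every component selects its
    lowest-weight outgoing edge, and all selected edges are added at once."""
    parent = list(range(n))  # union-find forest over the vertices

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    mst = []
    num_components = n
    while num_components > 1:
        # Lowest-weight edge leaving each component, keyed by its root.
        best = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue  # edge is internal to a component
            if ru not in best or w < best[ru][0]:
                best[ru] = (w, u, v)
            if rv not in best or w < best[rv][0]:
                best[rv] = (w, u, v)
        for w, u, v in best.values():
            ru, rv = find(u), find(v)
            if ru != rv:          # two components may pick the same edge
                parent[ru] = rv   # merge the components
                mst.append((w, u, v))
                num_components -= 1
    return mst
```

The check ru != rv inside the merge loop is exactly the observation from the proof in part (a): when components C and C′ each select the same edge, it is added only once.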
In both parts of this problem, you should assume that G is a connected, undirected graph
and that no two edges of G have the same weight.
(a) [9 points] Prove that Borůvka’s algorithm outputs a minimum spanning tree of G.
(You are not asked to analyze its running time.)
Solution: First we show that every edge added by the algorithm must be part of
any minimum spanning tree. Recall the lemma we used to show this for the Kruskal
and Prim algorithms: the lowest-weight edge crossing any nontrivial cut in G must
be part of any minimum spanning tree of G. The edge {u, v} added by the algorithm
is the lowest-weight edge crossing the cut (C, V \ C), so it satisfies the conditions of
the lemma, and hence is part of any minimum spanning tree.
It remains to show that the algorithm outputs a spanning tree. Adding a single edge
between two components of a forest does not create a cycle. It is possible that when
we consider component C, we add some edge joining it to another component C′,
and when we consider C′, we add some edge joining it to C. However, in this case the
two edges must be the same, since in both cases we choose the lowest-weight edge.
Thus the algorithm never creates a cycle. Finally, the algorithm does not terminate
until it has a single component, so the final output must be a spanning tree.
(b) [1 point] True or false: Borůvka’s algorithm and Prim’s algorithm always output the
same minimum spanning tree.
Solution: True: A graph with distinct edge weights has a unique minimum spanning
tree.
4. Scheduling to maximize profit. [10 points]
Suppose you are given n jobs, where job i has deadline di (a positive integer) and profit
pi (a positive real number) for all i ∈ {1, 2, . . . , n}. Each job takes unit time, so it may be
scheduled in any interval [si , si + 1] for some nonnegative integer si provided si + 1 ≤ di
(i.e., the job is finished by its deadline) and no other job is scheduled during the same
interval. Your goal is to schedule a subset of the jobs to maximize the total profit, which
is the sum of pi over all scheduled jobs i. Design a greedy algorithm for this problem.
Prove that your algorithm finds an optimal schedule, and analyze its running time.
Solution: Algorithm: Sort the jobs in order of decreasing profit and go through the jobs
in this order, scheduling each job at the latest possible time (if any such time is available).
Running time: This algorithm can be implemented in time O(n²) since sorting takes time
O(n log n), and then we run through all n jobs, performing O(n) operations to check the
possible times at which they might be scheduled. (In fact, you can get running time
O(n log n) with a more careful implementation, but you are not asked to optimize the
algorithm.)
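The O(n²) version of this greedy can be sketched in Python. This is an illustrative sketch (the name schedule_max_profit is my own); jobs are assumed to be (profit, deadline) pairs, and slot t stands for the unit interval [t, t+1]:

```python
def schedule_max_profit(jobs):
    """Greedy job scheduling: consider jobs in decreasing order of
    profit and place each at the latest free unit-time slot that
    finishes by its deadline."""
    jobs = sorted(jobs, reverse=True)     # decreasing profit
    max_deadline = max(d for _, d in jobs)
    slot_free = [True] * max_deadline     # slot t covers [t, t+1]
    total = 0.0
    for profit, deadline in jobs:
        # Latest slot t satisfying t + 1 <= deadline; O(n) scan per job.
        for t in range(min(deadline, max_deadline) - 1, -1, -1):
            if slot_free[t]:
                slot_free[t] = False
                total += profit
                break                     # job scheduled; move on
    return total
```

A disjoint-set structure over the slots would replace the inner scan with near-constant-time lookups, giving the O(n log n) bound mentioned above.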
Correctness: We prove optimality of the greedy schedule with an exchange argument. Let
(g1, . . . , gℓ) be the jobs scheduled by the greedy algorithm in order of decreasing profit,
where job gi is scheduled at time ti . Let (b1 , . . . , bm ) be the jobs in some optimal schedule,
again in order of decreasing profit, where job bi is scheduled at time ui . Let j be the first
index at which the two schedules differ, i.e., the smallest index so that either gj ≠ bj or
(gj = bj with tj ≠ uj). We consider these two cases separately:
If gj = bj with tj ≠ uj, then uj < tj, since the greedy algorithm schedules jobs at the latest
possible time. Now adjust the optimal schedule by changing the time of bj from uj to tj ,
and if uk = tj for some k, changing the time of bk from uk to uj . The resulting schedule
is still valid because bj is moved to a time that is still before the deadline (namely, the
time used in the greedy schedule) and bk is moved to an earlier time. It includes the same
jobs, so it has the same total profit and hence is still optimal.
If gj ≠ bj, then the profit of gj is at least that of bj (otherwise the greedy algorithm
would have added bj next), and job gj does not appear in the optimal schedule (since
the jobs are sorted in decreasing order of profit, and we can assume that ties are broken
in some consistent way). Then we add gj to the optimal schedule at time tj , deleting
any job previously scheduled at time tj if such a job exists in the optimal schedule. This
schedule is still valid because tj is before the deadline for gj (by the definition of the
greedy algorithm). It is still optimal because, if there was a job previously in the optimal
schedule at time tj , then it must have been one of the jobs bj , . . . , bm , and these jobs all
have profit no more than that of gj .
In either case, we adjusted the schedule so that gj = bj and tj = uj . Continuing in this
way, we ensure that the first ℓ jobs of the optimal schedule are identical to those of the
greedy schedule. If there were any remaining jobs in the optimal schedule (i.e., if m > ℓ)
then there would be at least one more job that the greedy algorithm could add to increase
its profit. Since this is not the case, we must have m = ℓ, and we have transformed
the optimal schedule into the greedy one without decreasing its profit. Thus the greedy
schedule is optimal.
5. Maximum ordered ratio. [10 points]
Suppose you are given as input a sequence of numbers a1 , a2 , . . . , an with n ≥ 2. Your
goal is to find the largest ratio between two of these numbers where the numerator occurs
after the denominator in the sequence. In other words, you would like to compute
max{ ai/aj : i, j ∈ {1, 2, . . . , n} with i > j }.
Describe a divide-and-conquer algorithm to solve this problem. Prove its correctness and
analyze its running time. For full credit, your algorithm should run in time at most
O(n log n).
Solution: A natural divide-and-conquer algorithm proceeds as follows:
MOR(a1 , . . . , an ):
If n = 2, return a2 /a1
If n = 3, return max{a2 /a1 , a3 /a2 , a3 /a1 }
Let ℓ = MOR(a1, . . . , a⌊n/2⌋)
Let r = MOR(a⌊n/2⌋+1, . . . , an)
Let u = min{a1, . . . , a⌊n/2⌋}
Let v = max{a⌊n/2⌋+1, . . . , an}
Return max{ℓ, r, v/u}
(Note that either n = 2 or n = 3 may occur as a base case when we recursively split the
list in half, although this is a relatively minor detail.)
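The pseudocode translates directly into Python. This is an illustrative sketch (the function name mor is taken from the pseudocode); it assumes, as in the base case a2/a1, that the inputs are positive so the divisions are well-defined:

```python
def mor(a):
    """Maximum ordered ratio a[i]/a[j] over indices i > j (0-indexed),
    by divide and conquer."""
    n = len(a)
    if n == 2:
        return a[1] / a[0]
    if n == 3:
        return max(a[1] / a[0], a[2] / a[1], a[2] / a[0])
    mid = n // 2
    left = mor(a[:mid])       # best ratio within the first half
    right = mor(a[mid:])      # best ratio within the second half
    # Best ratio crossing the split: largest numerator in the second
    # half over smallest denominator in the first half.
    return max(left, right, max(a[mid:]) / min(a[:mid]))
```

Since n ≥ 4 whenever we recurse, both halves have length at least 2, so the recursion always bottoms out in one of the two base cases.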
Clearly the maximum ordered ratio of the input sequence is either the largest such ratio in
the first half, the largest such ratio in the second half, or the ratio of the largest number
in the second half to the smallest number in the first half. This is precisely what the
algorithm computes, so it is correct.
To determine the running time, observe that the min and max can both be computed in
linear time, so we have the recurrence T (n) = 2T (n/2) + O(n) for the running time T (n),
and we know that this has solution T (n) = O(n log n).
In fact, it is possible to reduce the running time to O(n) by a slightly more clever approach
(say, using dynamic programming).
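One such linear-time approach, sketched here as an illustration (the name mor_linear is my own), sweeps left to right while maintaining the minimum value seen so far; the best ratio ending at each position uses that minimum as its denominator:

```python
def mor_linear(a):
    """O(n) maximum ordered ratio: for each position, the best
    denominator occurring earlier is simply the prefix minimum."""
    best = a[1] / a[0]
    min_so_far = a[0]
    for x in a[1:]:
        best = max(best, x / min_so_far)  # best ratio ending here
        min_so_far = min(min_so_far, x)   # extend the prefix minimum
    return best
```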