hw09 Solution PDF

The document provides solutions to five problems on dynamic programming: a modified rod-cutting problem with a fixed cost per cut, the longest monotonically increasing subsequence in O(n^2) time and in O(n lg n) time, neatly printing a paragraph on lines of a fixed character width (minimizing the sum of the cubes of the trailing spaces), and edit distance with an application to DNA alignment.

Uploaded by

siddharth1k

Fundamental Algorithms

CSCI-GA.1170-001/Summer 2016
Solution to Homework 9
Problem 1 (CLRS 15.1-3). (1 point) Consider a modification of the rod-cutting problem in
which, in addition to a price $p_i$ for each rod, each cut incurs a fixed cost of $c$. The revenue
associated with a solution is now the sum of the prices of the pieces minus the costs of making
the cuts. Give a dynamic-programming algorithm to solve this modified problem.
Solution: We can modify the BOTTOM-UP-CUT-ROD algorithm from section 15.1 as follows:
BOTTOM-UP-CUT-ROD(p, n, c)
1  let r[0..n] be a new array
2  r[0] = 0
3  for j = 1 to n
4      q = -∞
5      for i = 1 to j - 1
6          q = max(q, p[i] + r[j - i] - c)
7      r[j] = max(q, p[j])
8  return r[n]
We need to account for the cost $c$ on every iteration of the loop in lines 5-6 except the last one, when
$i = j$ (no cuts). We make the loop run to $j - 1$ instead of $j$, make sure $c$ is subtracted from the
candidate revenue in line 6, then pick the greater of the current best revenue $q$ and $p[j]$ (no cuts)
in line 7.
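The modified procedure can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `cut_rod_with_cost` is hypothetical, and the price list is assumed to be indexed so that `p[i]` is the price of a piece of length `i`, with `p[0]` unused.

```python
def cut_rod_with_cost(p, n, c):
    """Best revenue for a rod of length n when every cut costs c.

    p[i] is the price of a piece of length i (p[0] is unused).
    """
    r = [0] * (n + 1)  # r[j] = best revenue for a rod of length j
    for j in range(1, n + 1):
        q = p[j]  # selling the rod of length j uncut incurs no cut cost
        for i in range(1, j):
            # First piece of length i, one cut, then the best for the rest.
            q = max(q, p[i] + r[j - i] - c)
        r[j] = q
    return r[n]
```

Starting `q` at `p[j]` folds lines 4 and 7 of the pseudocode into one step; the result is the same because the uncut option is compared against every candidate either way.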
Problem 2 (CLRS 15.4-5). (1 point) Give an $O(n^2)$-time algorithm to find the longest monotonically increasing subsequence of a sequence of n numbers.
Solution: We observe that the longest monotonically increasing subsequence (LIS) of a sequence
S is a longest common subsequence (LCS) of S and a sorted copy of S. For a sequence of length n, sorting
can be done in $O(n \lg n)$ time, and finding the LCS in $O(n^2)$ time, so the total is $O(n^2)$. (With repeated
values this yields a nondecreasing subsequence; if a strictly increasing one is required, duplicates should be
removed from the sorted copy first.) Utilizing the LCS-LENGTH
and PRINT-LCS procedures from section 15.4, we can find the LIS as follows:
LIS(S)
    S' = SORT(S)
    c, b = LCS-LENGTH(S, S')
    PRINT-LCS(b, S, S.length, S'.length)
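A self-contained Python sketch of the same reduction is given below. The name `lis_via_lcs` is hypothetical, and the LCS table and walk-back are written inline rather than calling LCS-LENGTH and PRINT-LCS, so treat it as one possible rendering rather than the procedures from section 15.4.

```python
def lis_via_lcs(s):
    """Longest nondecreasing subsequence of s, found as LCS(s, sorted(s))."""
    t = sorted(s)
    n = len(s)
    # c[i][j] = length of an LCS of s[:i] and t[:j]
    c = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            if s[i - 1] == t[j - 1]:
                c[i][j] = c[i - 1][j - 1] + 1
            else:
                c[i][j] = max(c[i - 1][j], c[i][j - 1])
    # Walk back from the bottom-right corner to recover one LCS.
    out = []
    i = j = n
    while i > 0 and j > 0:
        if s[i - 1] == t[j - 1]:
            out.append(s[i - 1])
            i -= 1
            j -= 1
        elif c[i - 1][j] >= c[i][j - 1]:
            i -= 1
        else:
            j -= 1
    return out[::-1]
```

For a strictly increasing result one would, under the same assumptions, replace `sorted(s)` with `sorted(set(s))` and size the second dimension of the table accordingly.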
Problem 3 (CLRS 15.4-6). (2 points) Give an $O(n \lg n)$-time algorithm to find the longest
monotonically increasing subsequence of a sequence of n numbers. (Hint: Observe that the
last element of a candidate subsequence of length $i$ is at least as large as the last element of
a candidate subsequence of length $i - 1$. Maintain candidate subsequences by linking them
through the input sequence.)

Solution: We build our algorithm around two key insights:

• We can keep track of candidate subsequences of X using an array M such that M[j] = k
  indicates that X[k] is the smallest value for which there is a monotonically increasing
  subsequence of length j ending with X[k]. For example, for X = ⟨2, 3, 13, 11, 5, 7⟩ (indexed from 0), we
  have M = ⟨NIL, 0, 1, 4, 5⟩.
• With L representing the length of the longest increasing subsequence found so far, the sequence
  X[M[1]], X[M[2]], ..., X[M[L]] is monotonically increasing. This allows us to
  search in this sequence in $O(\lg n)$ time.

Our algorithm makes a single pass through X, maintaining array M by locating the largest
$j \le L$ such that $X[M[j]] \le X[i]$ for the current i. We also maintain array P, in which
P[k] = q indicates that X[q] is the predecessor of X[k] in the longest monotonically increasing
subsequence ending with X[k]; we use this array to reconstruct the solution. The following pseudocode is based on pseudocode from https://en.wikipedia.org/wiki/Longest_increasing_subsequence#Efficient_algorithms:
COMPUTE-LIS(X, n)
    // X is indexed from 0, so it occupies X[0..n - 1].
    let P[0..n - 1] be a new array
    let M[0..n] be a new array
    M[0] = NIL
    L = 0
    for i = 0 to n - 1
        // Binary search for the largest j ≤ L with X[M[j]] ≤ X[i].
        lo = 1
        hi = L
        while lo ≤ hi
            mid = lo + ⌊(hi - lo)/2⌋
            if X[M[mid]] ≤ X[i]
                lo = mid + 1
            else
                hi = mid - 1
        newL = lo
        P[i] = M[newL - 1]
        M[newL] = i
        L = max(L, newL)
    return P, M, L

PRINT-LIS(X, P, M, L)
    let S[0..L - 1] be a new array
    k = M[L]
    for i = L - 1 downto 0
        S[i] = X[k]
        k = P[k]
    print S
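The two procedures above can be combined into a short Python sketch. It swaps the hand-written binary search for the standard library's `bisect_right`, keeps a parallel list of tail values so that `bisect_right` has something sorted to search, and uses `None` in place of NIL. The function name and the `tails` list are illustrative assumptions and not part of the pseudocode.

```python
from bisect import bisect_right


def longest_nondecreasing_subsequence(x):
    """O(n lg n) longest nondecreasing subsequence with reconstruction."""
    m = []      # m[j - 1] = index of the smallest tail among length-j candidates
    tails = []  # tails[j - 1] = x[m[j - 1]]; always sorted, so bisect applies
    p = [None] * len(x)  # p[i] = index of the predecessor of x[i], or None
    for i, v in enumerate(x):
        # j = number of tails <= v, i.e. the longest candidate v can extend.
        j = bisect_right(tails, v)
        p[i] = m[j - 1] if j > 0 else None
        if j == len(tails):
            tails.append(v)
            m.append(i)
        else:
            tails[j] = v
            m[j] = i
    # Follow predecessor links back from the tail of the longest candidate.
    out = []
    k = m[-1] if m else None
    while k is not None:
        out.append(x[k])
        k = p[k]
    return out[::-1]
```

On the example X = ⟨2, 3, 13, 11, 5, 7⟩ the list `m` ends as `[0, 1, 4, 5]`, which matches M[1..4] above.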
Problem 4 (CLRS 15-4). (3 points) Consider the problem of neatly printing a paragraph
with a monospaced font (all characters having the same width) on a printer. The input text
is a sequence of n words of lengths $l_1, l_2, \ldots, l_n$, measured in characters. We want to print
this paragraph neatly on a number of lines that hold a maximum of M characters each. Our
criterion of "neatness" is as follows. If a given line contains words i through j, where $i \le j$,
and we leave exactly one space between words, the number of extra space characters at the
end of the line is $M - j + i - \sum_{k=i}^{j} l_k$, which must be nonnegative so that the words fit on the
line. We wish to minimize the sum, over all lines except the last, of the cubes of the numbers
of extra space characters at the ends of lines. Give a dynamic-programming algorithm to print
a paragraph of n words neatly on a printer. Analyze the running time and space requirements
of your algorithm.
Solution: If we define $\mathit{extras}[i, j] = M - j + i - \sum_{k=i}^{j} l_k$ to be the number of extra space
characters at the end of the line containing words i through j, we can define the following line
cost function for the line containing words i through j:

$$
\mathit{lc}[i, j] =
\begin{cases}
\infty & \text{if } \mathit{extras}[i, j] < 0, \\
0 & \text{if } j = n \text{ and } \mathit{extras}[i, j] \ge 0, \\
(\mathit{extras}[i, j])^3 & \text{otherwise.}
\end{cases}
$$

• Negative extras[i, j] indicates that words i through j don't fit; this should never occur
  in a correct solution and so we assign this case an infinite cost.
• We don't need to minimize extras[i, j] in the last line, so any non-negative value is an
  acceptable solution.
• The remaining case specifies the cost function according to the problem statement.
We note that the problem exhibits optimal substructure: if an arrangement of words 1, ..., j
with the last line containing words i, ..., j is optimal, the preceding lines contain an optimal
arrangement of words 1, ..., i - 1.
Let us define c[j] to be the cost of an optimal arrangement of words 1, ..., j. Following the
optimal substructure argument above, c[j] = c[i - 1] + lc[i, j] when the last line of that arrangement
starts at word i. Enumerating all possible choices
for i (the first word on the last line for a given subproblem) gives the following recursive
definition for c[j]:

$$
c[j] =
\begin{cases}
0 & \text{if } j = 0, \\
\min_{1 \le i \le j} \left( c[i - 1] + \mathit{lc}[i, j] \right) & \text{otherwise.}
\end{cases}
$$

To be able to reconstruct the actual solution, we also record the arrangement in array p such
that p[j] = k indicates that c[j] ended up picking c[k - 1] + lc[k, j] for the optimal solution.
This way, the last line of the final arrangement contains words p[n], ..., n, the line before last
contains words p[p[n] - 1], ..., p[n] - 1, and so on.
Implementing each of the steps above as a separate procedure gives:

COMPUTE-EXTRAS(l, n, M)
    let extras[1..n, 1..n] be a new array
    for i = 1 to n
        // One-word line, so just width minus word length.
        extras[i, i] = M - l[i]
        for j = i + 1 to n
            // Previous minus new word length minus space between words.
            extras[i, j] = extras[i, j - 1] - l[j] - 1
    return extras
COMPUTE-LINE-COST(extras, n)
    let lc[1..n, 1..n] be a new array
    for i = 1 to n
        for j = i to n
            if extras[i, j] < 0
                // Words don't fit.
                lc[i, j] = ∞
            elseif j = n and extras[i, j] ≥ 0
                // Last line and at least zero trailing spaces.
                lc[i, j] = 0
            else
                // Normal cost function.
                lc[i, j] = (extras[i, j])^3
    return lc
COMPUTE-COST(lc, n)
    let c[0..n] and p[1..n] be new arrays
    c[0] = 0
    for j = 1 to n
        c[j] = ∞
        for i = 1 to j
            if c[i - 1] + lc[i, j] < c[j]
                c[j] = c[i - 1] + lc[i, j]
                p[j] = i
    return c, p
PRINT-LINES(p, j)
    i = p[j]
    if i = 1
        k = 1
    else
        k = PRINT-LINES(p, i - 1) + 1
    // Print words i through j on line k.
    PRINT(k, i, j)
    return k

PRINT-PARAGRAPH(l, n, M)
    extras = COMPUTE-EXTRAS(l, n, M)
    lc = COMPUTE-LINE-COST(extras, n)
    c, p = COMPUTE-COST(lc, n)
    PRINT-LINES(p, n)
The algorithm runs in $\Theta(n^2)$ time and requires $\Theta(n^2)$ space. Both characteristics can be improved to $\Theta(nM)$ by observing that at most $\lceil M/2 \rceil$ words can fit on a line (each word being at
least one character long, plus spaces between words), and only computing and storing extras
and lc for $j - i + 1 \le \lceil M/2 \rceil$.
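As a rough illustration, the whole pipeline can be condensed into one Python function. This sketch departs from the pseudocode in two labeled ways: it keeps a running `extras` value instead of the `extras` and `lc` tables, and it stops extending a line as soon as the words no longer fit, which is one way to get the reduced work described above. The name `neat_lines` and the returned list of (first word, last word) pairs are assumptions made for the example.

```python
import math


def neat_lines(lengths, M):
    """Return (cost, lines); lines holds 1-based inclusive (i, j) word ranges."""
    n = len(lengths)
    c = [0] + [math.inf] * n  # c[j] = best cost of arranging words 1..j
    p = [0] * (n + 1)         # p[j] = first word of the last line for c[j]
    for j in range(1, n + 1):
        extras = M + 1        # so a one-word line ends up with M - l[j]
        for i in range(j, 0, -1):
            # Prepending word i costs its length plus one separating space.
            extras -= lengths[i - 1] + 1
            if extras < 0:
                break         # words i..j do not fit, and no smaller i can
            lc = 0 if j == n else extras ** 3
            if c[i - 1] + lc < c[j]:
                c[j], p[j] = c[i - 1] + lc, i
    if math.isinf(c[n]):
        raise ValueError("some word is longer than the line width")
    # Follow p backwards from the last word to list the lines.
    lines = []
    j = n
    while j > 0:
        lines.append((p[j], j))
        j = p[j] - 1
    return c[n], lines[::-1]
```

For word lengths 3, 2, 2, 5 and M = 6, the candidates are easy to check by hand: putting words 2 and 3 together leaves 3 and 1 trailing spaces on the first two lines, for a cost of 27 + 1 = 28, while putting words 1 and 2 together forces word 3 onto its own line at a cost of 64.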
Problem 5 (CLRS 15-5). (4 points) See CLRS 15-5 for full problem statement.
(a) Given two sequences x[1..m] and y[1..n] and a set of transformation-operation costs, the
edit distance from x to y is the cost of the least expensive operation sequence that transforms x to y. Describe a dynamic-programming algorithm that finds the edit distance
from x to y and prints an optimal operation sequence. Analyze the running time and
space requirements of your algorithm.
Solution:
Let us define $X_i = x[1..i]$ and $Y_j = y[1..j]$ to be prefixes of sequences x and y, $X_i \to Y_j$
to be the problem of determining the cost of the least expensive operation sequence that
transforms $X_i$ to $Y_j$, and $c[i, j]$ to be that cost.
We observe that the problem exhibits optimal substructure: an optimal solution to $X_p \to Y_q$
includes optimal solutions to the smaller subproblems $X_i \to Y_j$ (with $i \le p$ and $j \le q$) that it passes through.
We now consider different possibilities for the last operation in the optimal solution to
$X_i \to Y_j$:
• Copy. Then $x[i] = y[j]$, the remaining subproblem is $X_{i-1} \to Y_{j-1}$ and $c[i, j] = c[i-1, j-1] + cost(copy)$.
• Replace. Then $x[i] \ne y[j]$, the remaining subproblem is $X_{i-1} \to Y_{j-1}$ and $c[i, j] = c[i-1, j-1] + cost(replace)$.
• Delete. Then we have no restrictions on $x[i]$ and $y[j]$, the remaining subproblem
  is $X_{i-1} \to Y_j$ and $c[i, j] = c[i-1, j] + cost(delete)$.
• Insert. Then we have no restrictions on $x[i]$ and $y[j]$, the remaining subproblem is
  $X_i \to Y_{j-1}$ and $c[i, j] = c[i, j-1] + cost(insert)$.
• Twiddle. Then $x[i] = y[j-1]$ and $x[i-1] = y[j]$ for $i, j \ge 2$, the remaining
  subproblem is $X_{i-2} \to Y_{j-2}$ and $c[i, j] = c[i-2, j-2] + cost(twiddle)$.
• Kill. This must be the final operation, so the current problem must be $X_m \to Y_n$.
  We can kill the string starting from any $0 \le k < m$ and so $c[i, j] = c[m, n] = \min_{0 \le k < m} (c[k, n]) + cost(kill)$.

We can now provide the following recursive definition for c[i, j]:

$$
c[i, j] = \min
\begin{cases}
c[i-1, j-1] + cost(copy) & \text{if } x[i] = y[j], \\
c[i-1, j-1] + cost(replace) & \text{if } x[i] \ne y[j], \\
c[i-1, j] + cost(delete) & \text{in all cases,} \\
c[i, j-1] + cost(insert) & \text{in all cases,} \\
c[i-2, j-2] + cost(twiddle) & \text{if } i, j \ge 2,\ x[i] = y[j-1],\ x[i-1] = y[j], \\
\min_{0 \le k < m} (c[k, n]) + cost(kill) & \text{if } i = m,\ j = n.
\end{cases}
$$


In addition, we note that $X_0 \to Y_j$ (conversion from an empty string) can be viewed as
a sequence of j inserts, and $X_i \to Y_0$ (conversion to an empty string) as a sequence of i
deletes.
We can now convert the definition above into the following dynamic programming algorithm for computing edit distance and an optimal operation sequence:

COMPUTE-EDIT-DISTANCE(x, y, m, n)
    let c[0..m, 0..n] be a new array
    let op[0..m, 0..n] be a new array
    // Converting to an empty string.
    for i = 0 to m
        c[i, 0] = i · COST(DELETE)
        op[i, 0] = DELETE
    // Converting from an empty string.
    for j = 0 to n
        c[0, j] = j · COST(INSERT)
        op[0, j] = INSERT
    // All cases other than KILL.
    for i = 1 to m
        for j = 1 to n
            c[i, j] = ∞
            if x[i] = y[j]
                c[i, j] = c[i - 1, j - 1] + COST(COPY)
                op[i, j] = COPY
            if x[i] ≠ y[j] and c[i - 1, j - 1] + COST(REPLACE) < c[i, j]
                c[i, j] = c[i - 1, j - 1] + COST(REPLACE)
                op[i, j] = REPLACE + y[j]
            if c[i - 1, j] + COST(DELETE) < c[i, j]
                c[i, j] = c[i - 1, j] + COST(DELETE)
                op[i, j] = DELETE
            if c[i, j - 1] + COST(INSERT) < c[i, j]
                c[i, j] = c[i, j - 1] + COST(INSERT)
                op[i, j] = INSERT
            if i ≥ 2 and j ≥ 2 and x[i] = y[j - 1] and x[i - 1] = y[j]
                    and c[i - 2, j - 2] + COST(TWIDDLE) < c[i, j]
                c[i, j] = c[i - 2, j - 2] + COST(TWIDDLE)
                op[i, j] = TWIDDLE
    // KILL.
    for i = 0 to m - 1
        if c[i, n] + COST(KILL) < c[m, n]
            c[m, n] = c[i, n] + COST(KILL)
            op[m, n] = KILL + i
    return c, op

The algorithm fills an $m \times n$ table, spending constant time on each cell, so the running
time and space are both $\Theta(mn)$.

We can reconstruct the actual sequence of operations using the following procedure:
PRINT-OPERATIONS(op, i, j)
    if i = 0 and j = 0
        return
    if op[i, j] = COPY or op[i, j] = REPLACE
        i' = i - 1
        j' = j - 1
    elseif op[i, j] = DELETE
        i' = i - 1
        j' = j
    elseif op[i, j] = INSERT
        i' = i
        j' = j - 1
    elseif op[i, j] = TWIDDLE
        i' = i - 2
        j' = j - 2
    else // KILL
        i' = GET-KILL-INDEX(op[i, j])
        j' = j
    PRINT-OPERATIONS(op, i', j')
    PRINT(op[i, j])
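The table-filling and reconstruction steps can be sketched together in Python. Several choices here are assumptions made for the example and do not come from the pseudocode: costs are passed as a dictionary keyed by operation name, each `op` cell stores the operation together with the cell it came from (which stands in for GET-KILL-INDEX), and the operations are returned as a list instead of being printed.

```python
import math


def edit_distance(x, y, cost):
    """Return (cost, operations) for transforming x into y."""
    m, n = len(x), len(y)
    c = [[math.inf] * (n + 1) for _ in range(m + 1)]
    op = [[None] * (n + 1) for _ in range(m + 1)]  # (name, prev_i, prev_j)
    c[0][0] = 0
    for i in range(1, m + 1):  # converting to an empty string
        c[i][0], op[i][0] = i * cost["delete"], ("delete", i - 1, 0)
    for j in range(1, n + 1):  # converting from an empty string
        c[0][j], op[0][j] = j * cost["insert"], ("insert", 0, j - 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            # Strings are 0-indexed, so x[i] in the text is x[i - 1] here.
            diag = "copy" if x[i - 1] == y[j - 1] else "replace"
            cands = [
                (c[i - 1][j - 1] + cost[diag], (diag, i - 1, j - 1)),
                (c[i - 1][j] + cost["delete"], ("delete", i - 1, j)),
                (c[i][j - 1] + cost["insert"], ("insert", i, j - 1)),
            ]
            if i >= 2 and j >= 2 and x[i - 1] == y[j - 2] and x[i - 2] == y[j - 1]:
                cands.append((c[i - 2][j - 2] + cost["twiddle"],
                              ("twiddle", i - 2, j - 2)))
            c[i][j], op[i][j] = min(cands, key=lambda t: t[0])
    # Kill can only be the final operation, so it only affects cell (m, n).
    for i in range(m):
        if c[i][n] + cost["kill"] < c[m][n]:
            c[m][n], op[m][n] = c[i][n] + cost["kill"], ("kill", i, n)
    # Walk the stored predecessor cells back to (0, 0).
    ops = []
    i, j = m, n
    while (i, j) != (0, 0):
        name, i, j = op[i][j]
        ops.append(name)
    return c[m][n], ops[::-1]
```

With copy at cost 0, replace, delete, and insert at cost 1, and twiddle and kill set to `math.inf`, this reduces to the ordinary Levenshtein distance, which gives a convenient sanity check.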

(b) Explain how to cast the problem of finding an optimal DNA alignment as an edit distance
problem using a subset of the transformation operations copy, replace, delete, insert,
twiddle, and kill.
The DNA alignment problem can be reduced to the edit distance problem by taking the
following operation costs:

• COST(COPY) = -1, to account for the case where x'[j] = y'[j] and neither is a space.
• COST(REPLACE) = +1, to account for the case where x'[j] ≠ y'[j] and neither is a space.
• COST(DELETE) = +2 and COST(INSERT) = +2, to account for the case where x'[j] or y'[j] is a space.
• COST(TWIDDLE) = ∞ and COST(KILL) = ∞. In other words, these operations are not permitted.

Then, the negative of the cost minimized by COMPUTE-EDIT-DISTANCE is the score to
maximize in the DNA alignment problem.
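A minimal Python sketch of this reduction follows. It hard-codes the four permitted costs, leaves out twiddle and kill entirely (equivalent to giving them infinite cost), and returns only the score; the function name `alignment_score` is hypothetical.

```python
def alignment_score(x, y):
    """Best DNA alignment score of x and y as a negated edit distance."""
    COPY, REPLACE, GAP = -1, 1, 2  # GAP covers both delete and insert
    m, n = len(x), len(y)
    c = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        c[i][0] = i * GAP  # x[:i] aligned entirely against spaces
    for j in range(1, n + 1):
        c[0][j] = j * GAP  # y[:j] aligned entirely against spaces
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            diag = c[i - 1][j - 1] + (COPY if x[i - 1] == y[j - 1] else REPLACE)
            c[i][j] = min(diag, c[i - 1][j] + GAP, c[i][j - 1] + GAP)
    # Minimizing the cost is the same as maximizing the alignment score.
    return -c[m][n]
```

Small cases make the sign convention visible: two equal single characters score +1, two different ones score -1, and a character aligned against a space scores -2.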
