Chapter 6
Sometimes greedy algorithms do not provide correct solutions to the problem we are trying to solve, and we are forced to resort to other approaches to designing algorithms. One such approach is dynamic programming. In dynamic programming, one implicitly explores the space of all possible solutions by carefully decomposing the problem into a series of subproblems, and then building up correct solutions to larger and larger subproblems. We can think of dynamic programming as a balance between brute-force search and the divide-and-conquer approach.
6.1 Weighted Interval Scheduling: A Recursive Procedure
We have seen a greedy algorithm that produces an optimal solution to the Interval Scheduling Problem, where the goal is to accept as large a set of nonoverlapping intervals as possible. The Weighted Interval Scheduling Problem is a strictly more general version, in which each interval has a certain value (or weight), and we want to accept a set of maximum value. However, the greedy algorithm that worked before (repeatedly choosing the interval that ends earliest) is no longer optimal in this more general setting.
6.1.1 Designing a Recursive Algorithm
We have n requests labelled 1, …, n, with each request i specifying a start time s_i and a finish time f_i. Each interval i now also has a value, or weight v_i. Two intervals are compatible if they do not overlap. The goal of our current problem is to select a subset of mutually compatible intervals, so as to maximize the sum of the values of the selected intervals. Let’s suppose that the requests are sorted in order of nondecreasing finish time. We’ll say a request i comes before a request j if i < j. We define p(j), for an interval j, to be the largest index i < j such that intervals i and j are disjoint (note: p(j) = 0 if no request i < j is disjoint from j).
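Since the requests are sorted by finish time, each value p(j) can be found with a binary search over the finish times. Here is a minimal Python sketch of that computation (the function name and the 1-indexed list layout are my own choices, not from the text):

import bisect

def compute_p(starts, finishes):
    # starts[j] and finishes[j] describe request j; index 0 is a dummy entry.
    # Requests are sorted so that finishes[1] <= finishes[2] <= ...
    # p[j] is the largest i < j with finishes[i] <= starts[j], or 0 if none.
    n = len(starts) - 1
    p = [0] * (n + 1)
    for j in range(1, n + 1):
        # bisect_right finds the rightmost position among indices 1..j-1
        # whose finish time is at most starts[j]
        p[j] = bisect.bisect_right(finishes, starts[j], 1, j) - 1
    return p

Sorting plus these n binary searches takes O(n log n) time in total.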
Now, consider an optimal solution O to this problem. We can say, with certainty, that the interval n (the last one) either belongs to O, or it doesn't. If n belongs to O, then clearly no interval indexed strictly between p(n) and n can belong to O. On the other hand, if n does not belong to O, then O is simply the optimal solution to the problem consisting of the requests {1, …, n-1}.
Thus, for any value of j between 1 and n, let O_j denote the optimal solution to the problem consisting of the requests {1, …, j}, and let OPT(j) denote the value of the solution. Based on our reasoning earlier, we can say that:
(6.1) OPT(j) = max(v_j + OPT(p(j)), OPT(j - 1))
As such, a request j belongs to an optimal solution to the set {1, 2, …, j} if and only if v_j + OPT(p(j)) ≥ OPT(j - 1).
We therefore have a very simple algorithm that solves the problem for us:
Compute-Opt (j):
If j = 0 then
Return 0
Else
Return max(v_j + Compute-Opt(p(j)), Compute-Opt(j-1))
Endif
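As a sketch, the same recursion in Python (with v and p as 1-indexed lists, index 0 unused):

def compute_opt(j, v, p):
    # Value of an optimal solution for the requests {1, ..., j}.
    if j == 0:
        return 0
    return max(v[j] + compute_opt(p[j], v, p),  # case: j belongs to O
               compute_opt(j - 1, v, p))        # case: j does not belong to O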
Unfortunately, this algorithm takes exponential time: the tree of recursive calls grows rapidly, and the total number of calls made to Compute-Opt on an instance can grow like the Fibonacci numbers, which increase exponentially.
6.1.2 Memoizing the Recursion
A fundamental observation, which forms the second crucial component of a dynamic programming solution, is that our recursive algorithm Compute-Opt is really only solving n+1 different subproblems: Compute-Opt(0), Compute-Opt(1), …, Compute-Opt(n). The fact that it runs in exponential time is simply due to the redundancy in the number of times it issues each of these calls. We could store the value of Compute-Opt in a globally accessible place the first time we compute it, and then simply use this precomputed value in place of all future recursive calls, drastically improving the running time. This technique is called memoization.
The memoized version of our algorithm is as follows:
M-Compute-Opt(j):
If j = 0
then Return 0
Else if M[j] is not empty
then Return M[j]
Else
Define M[j] = max(v_j + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
Return M[j]
Endif
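A Python sketch of the memoized version, seeding the cache with M[0] = 0 so the base case is handled by the cache itself:

def m_compute_opt(j, v, p, M):
    # M caches subproblem values; entries start as None, except M[0] = 0.
    if M[j] is None:
        M[j] = max(v[j] + m_compute_opt(p[j], v, p, M),
                   m_compute_opt(j - 1, v, p, M))
    return M[j]

# usage: M = [0] + [None] * n, then m_compute_opt(n, v, p, M)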
Since we only ever need to compute the value of M[j] for a particular value of j once, we do at most n such computations. Every other step in the algorithm is O(1). Therefore, the M-Compute-Opt algorithm runs in O(n) time (assuming that the input intervals are sorted by their finish times).
6.1.3 Computing the Solution in Addition to Its Value
So far, we have simply computed the value of an optimal solution. It would be easy to extend M-Compute-Opt so as to keep track of an optimal solution in addition to its value: we can trace back through the array M to find the set of intervals in an optimal solution. The following algorithm does exactly that:
Find-Solution(j):
If j = 0 then
Output nothing
Else
If v_j + M[p(j)] ≥ M[j-1] then
Output j together with the result of Find-Solution(p(j))
Else
Output the result of Find-Solution(j-1)
Endif
Endif
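In Python, assuming M was filled in as in the sketch above (so M[0] = 0 and every entry is computed), the traceback could read:

def find_solution(j, v, p, M):
    # Indices of an optimal set of intervals for the requests {1, ..., j}.
    if j == 0:
        return []
    if v[j] + M[p[j]] >= M[j - 1]:
        return find_solution(p[j], v, p, M) + [j]  # j is in the solution
    return find_solution(j - 1, v, p, M)           # j is not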
Since Find-Solution calls itself recursively only on strictly smaller values, it makes a total of O(n) recursive calls; and since it spends constant time per call, the overall running time of the algorithm is O(n).
6.1.4 Comments
This section was pretty interesting. I liked how they introduced a novel concept by straight up diving into an example and explaining things using the example. I had no trouble following the section in class, but even then it was really well-explained in the book. 9/10.
6.2 Principles of Dynamic Programming: Memoization or Iteration over Subproblems
In the previous section, we developed a polynomial-time solution to the Weighted Interval Scheduling Problem by first designing an exponential-time recursive algorithm and then converting it (by memoization) to an efficient recursive algorithm. We now formulate an essentially equivalent version of the algorithm that more explicitly captures the essence of the dynamic programming technique, and that will serve as a general template for the algorithms we develop in later sections.
6.2.1 The Iterative Algorithm
We can directly compute the entries in the array M by an iterative algorithm, rather than using memoized recursion. We just start with M[0] = 0 and keep incrementing j; each time we need to determine a value M[j], the answer is provided by (6.1). The algorithm is as follows:
Iterative-Compute-Opt:
M[0] = 0
For j = 1, 2, ..., n
M[j] = max(v_j + M[p(j)], M[j-1])
Endfor
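The same loop in Python, as a sketch (1-indexed v and p as before):

def iterative_compute_opt(v, p):
    n = len(v) - 1
    M = [0] * (n + 1)  # M[0] = 0
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
    return M

The resulting array M can be handed directly to the Find-Solution traceback from Section 6.1.3.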
The running time of Iterative-Compute-Opt is O(n), since it runs for n iterations and spends constant time in each iteration.
6.2.2 Comments
TODO later
6.3 Segmented Least Squares: Multi-way Choices
In the previous sections, we developed a recurrence based on a fundamentally binary choice: either the interval n belonged to an optimal solution or it didn’t. In the problem we consider in this section, the recurrence will involve what might be called “multi-way choices”: at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution.
6.3.1 The Problem
Suppose our data consists of a set P of n points in the plane, denoted (x_1, y_1), …, (x_n, y_n); and suppose x_1 < x_2 < … < x_n. Given a line L defined by the equation y = ax + b, we say that the error of L with respect to P is the sum of its squared “distances” to the points in P: Error(L, P) = ∑(i=1 → n)(y_i - ax_i - b)².
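The line minimizing this error has the usual closed-form least-squares solution, so the minimum error of a segment can be computed directly. The helper below is my own sketch (not from the text); the formulas for a and b are the standard least-squares fit:

def line_error(points):
    # Minimum of Error(L, P) over all lines L, for points = [(x, y), ...].
    n = len(points)
    sx = sum(x for x, _ in points)
    sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points)
    sxy = sum(x * y for x, y in points)
    denom = n * sxx - sx * sx  # zero only for a single point, since the x_i are distinct
    a = (n * sxy - sx * sy) / denom if denom else 0.0
    b = (sy - a * sx) / n
    return sum((y - a * x - b) ** 2 for x, y in points)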
We will use p_i to denote the point (x_i, y_i). We must first partition P into some number of segments. Each segment is a subset of P that represents a contiguous set of x-coordinates; that is, it is a subset of the form {p_i, p_{i+1}, …, p_{j-1}, p_j} for some indices i ≤ j. Then, for each segment S in our partition of P, we compute the line minimizing the error with respect to the points in S.
The penalty of a partition is defined to be a sum of the following terms:
- The number of segments into which we partition P, multiplied by a fixed constant C > 0.
- For each segment, the error value of the optimal line through that segment.
Our goal in the Segmented Least Squares Problem is to find a partition of minimum penalty.
6.3.2 Designing the Algorithm
Suppose we let OPT(i) denote the value of the optimum solution for the points p_1, …, p_i, and we let e_{i,j} denote the minimum error of any line with respect to p_i, p_{i+1}, …, p_j. (Note: we define OPT(0) = 0.) Then it follows that if the last segment of the optimal partition is p_i, …, p_n, then the value of the optimal solution is OPT(n) = e_{i,n} + C + OPT(i - 1).
And, for the subproblem on the points p_1, …, p_j, OPT(j) = min{1 ≤ i ≤ j} [e_{i,j} + C + OPT(i - 1)] (6.7).
We can then come up with the following algorithm:
Segmented-Least-Squares(n):
Array M[0 ... n]
Set M[0]= 0
For all pairs i ≤ j
Compute the least squares error e_{i,j} for the segment p_i, ..., p_j
Endfor
For j = 1, 2, ..., n
Use the recurrence in (6.7) to compute M[j]
Endfor
Return M[n]
Find-Segments(j):
If j = 0 then
Output nothing
Else
Find an i that minimizes e_{i,j} + C + M[i-1]
Output the segment {p_i, ..., p_j} and the result of Find-Segments(i-1)
Endif
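Putting the pieces together in Python, as a sketch that reuses the line_error helper from Section 6.3.1 and keeps the text's 1-based indexing:

def segmented_least_squares(points, C):
    # points: [(x, y), ...] sorted by x-coordinate; C: fixed cost per segment.
    # Returns the minimum penalty and the segments as (i, j) index pairs.
    n = len(points)
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            e[i][j] = line_error(points[i - 1:j])  # error of segment p_i, ..., p_j
    M = [0.0] * (n + 1)
    for j in range(1, n + 1):
        M[j] = min(e[i][j] + C + M[i - 1] for i in range(1, j + 1))
    # Find-Segments, done iteratively: trace back through M.
    segments = []
    j = n
    while j > 0:
        i = min(range(1, j + 1), key=lambda k: e[k][j] + C + M[k - 1])
        segments.append((i, j))
        j = i - 1
    segments.reverse()
    return M[n], segments

Filling the e table this way recomputes each segment's sums from scratch, which matches the O(n^3) accounting below; precomputing prefix sums of x, y, x², and xy would bring the table down to O(n^2) total, though that refinement is not needed for the analysis here.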
Finally, we consider the running time of Segmented-Least-Squares. First we need to compute the values of all the least-squares errors e_{i,j}. To perform a simple accounting of the running time for this, we note that there are O(n^2) pairs (i, j) for which this computation is needed; and for each pair (i, j), we can use the formula given at the beginning of this section to compute e_{i,j} in O(n) time. Thus the total running time to compute all e_{i,j} values is O(n^3).
Following this, the main loop of Segmented-Least-Squares has n iterations, for values j = 1, …, n. For each value of j, we have to determine the minimum in the recurrence (6.7) to fill in the array entry M[j]; this takes O(n) time for each j, for a total of O(n^2). Find-Segments likewise makes at most n recursive calls, each performing an O(n) scan for a minimizing i, so the traceback also runs in O(n^2) time.
