Chapter 6 - Dynamic Programming

Section 6.1

  • Summary & Motivations: Section 6.1 is Weighted Interval Scheduling: A Recursive Procedure. In this section, we revisit the weighted interval scheduling problem from Section 1.2. Whereas before we wanted to pick as many non-overlapping intervals as possible, now each interval has a value, and we want to maximize the total value of the intervals we select. The solution to this problem rests on the observation that each interval either is or is not in the optimal solution. If it is, the optimal solution's value equals that interval's value plus the optimal value of the subproblem over the intervals compatible with it (those that end before it starts). If the interval is not in the optimal solution, the optimal solution's value equals the optimal value over all the preceding intervals (NOT just the compatible ones — every earlier interval is still in play, we have simply dropped this one).
  • About the Algorithms: In order to solve this problem, we first need a few definitions. We say that p(j) is the largest index i < j such that interval i is compatible with interval j. Further, we say that OPT(j) is the value of the optimal solution to requests 1 through j. Considering the choice between whether an interval is or is not included, mentioned in the Summary & Motivations section, OPT(j) = max(v_j + OPT(p(j)), OPT(j-1)). We see here the idea of subproblems: each optimal solution is built from the optimal solutions to smaller subproblems. Implemented as plain recursion, this recurrence yields an algorithm with exponential runtime in the worst case, because the same subproblems are recomputed over and over. Memoization fixes that, since the recursion only ever needs the n+1 distinct values OPT(0), …, OPT(n). Assuming the input intervals are sorted by finish time, the memoized algorithm is O(n). We find the specific set of intervals in the optimal solution by backtracking through the memoization array, using the observation that an interval j is included if and only if v_j + OPT(p(j)) >= OPT(j-1). (A sketch of the memoized algorithm and the backtracking step appears after this list.)
  • My Questions: So far, it's not completely clear to me what the differences are between dynamic programming problems and divide and conquer problems.
  • Second Time Around: I didn't understand the concept of p(j) completely the first time in class, but it makes sense now.
  • Note to Self: When an algorithm's implementation does not have an obvious or explicit upper bound on the number of calls, look for a measure of progress (p 257).
  • Readability: 8 - I thought this was pretty easy to read.
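
A minimal Python sketch of the memoized recursion and backtracking (my own code, not the book's): intervals are (start, finish, value) tuples assumed sorted by finish time, indices are 0-based rather than the book's 1-based, and p(j) is precomputed with a binary search over the finish times.

    import bisect

    def weighted_interval_scheduling(intervals):
        # intervals: list of (start, finish, value), sorted by finish time.
        n = len(intervals)
        finishes = [f for (_, f, _) in intervals]
        # p[j] = largest index i < j with finish_i <= start_j, or -1 if none.
        p = [bisect.bisect_right(finishes, s) - 1 for (s, _, _) in intervals]

        memo = {}
        def opt(j):                       # OPT(j) from the recurrence
            if j < 0:
                return 0                  # empty problem
            if j not in memo:
                v = intervals[j][2]
                memo[j] = max(v + opt(p[j]), opt(j - 1))  # include j, or drop it
            return memo[j]

        best = opt(n - 1)
        # Backtrack: j is in the solution iff v_j + OPT(p(j)) >= OPT(j-1).
        chosen, j = [], n - 1
        while j >= 0:
            if intervals[j][2] + opt(p[j]) >= opt(j - 1):
                chosen.append(j)
                j = p[j]
            else:
                j -= 1
        return best, chosen[::-1]

This uses plain Python recursion, so for very large n the iterative fill from Section 6.2 would be the safer formulation.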

Section 6.2

  • Summary & Motivations: Section 6.2 is Principles of Dynamic Programming: Memoization or Iteration over Subproblems. As the title suggests, this short subsection is about iteration versus recursion with memoization for dynamic programming problems. We conclude in this subsection that all dynamic programming problems (or at least the ones explored in this chapter) can be represented either as a memoized recursive algorithm or as an iterative algorithm that builds up solutions to subproblems in order, but we choose to henceforth talk about the iterative formulations.
  • About the Algorithms: This time, instead of memoizing and recursing, we simply start with M[0] = 0 and fill in the array iteratively using the same rule as before: M[j] = max(v_j + M[p(j)], M[j-1]). We still use Find-Solution from Section 6.1 to recover the set of intervals in the optimal solution, just as we did when the algorithm was written recursively. This algorithm is “clearly” O(n) (p 259). (A sketch of the iterative fill appears after this list.)
  • My Questions: How do we know when we should use a dynamic programming approach to solving a problem? Just when we can think of ways to break the problem into subproblems, and the subproblems satisfy the rules on page 260?
  • Second Time Around: The iterative approach makes concrete sense the second time around.
  • Note to Self: Sometimes it might be easier to start with a recursive approach, but at least for the problems in this chapter (homework!), there's also an iterative way to write the algorithm.
  • Readability: 9 - this section was short and sweet.
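
A minimal sketch of the iterative fill in Python (mine, not the book's), assuming v and p are 1-based lists with a dummy entry at index 0 to match the book's indexing:

    def compute_opt(v, p):
        # v[j] = value of interval j, p[j] = p(j); v[0] and p[0] are unused.
        n = len(v) - 1
        M = [0] * (n + 1)                 # M[0] = 0 is the empty problem
        for j in range(1, n + 1):
            M[j] = max(v[j] + M[p[j]], M[j - 1])
        return M                          # M[n] is the optimal value

Find-Solution then backtracks through M exactly as in the memoized version.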

Section 6.3

  • Summary & Motivations: Section 6.3 is Segmented Least Squares: Multi-way Choices. In this section, we take a page out of the statistics book. Given a set of data points ordered by their x-coordinates, we want to partition them into contiguous segments and find the best-fit lines (plural!) that minimize the total error plus the total cost, where each line used carries a fixed cost (which penalizes using too many segments).
  • About the Algorithms: The subproblems for this algorithm are the prefixes p_1, …, p_j of the point set; the multi-way choice is which segment ends at the last point. (6.6) says that if the last segment of the optimal solution is p_i, …, p_n, then the value of the optimal solution is OPT(n) = e_{i,n} + C + OPT(i-1) (p 265). In other words, the value of the optimal solution is the least-squares error of fitting a single line to points i through n, plus the cost C of one line, plus the optimal solution for points 1 through i-1. From (6.6) we get the recurrence OPT(j) = min over 1 <= i <= j of (e_{i,j} + C + OPT(i-1)). The algorithm, then, involves a memoization array M, set to all zeros initially. First, for all pairs i <= j, we compute the least-squares error e_{i,j} for the segment p_i, …, p_j. Then we use the recurrence to fill in M[j] for j = 1, …, n. Computing all the least-squares errors takes O(n^3), and filling in M takes O(n^2). Thus, the algorithm is O(n^3) in total, or O(n^2) once the errors are calculated. (A sketch of the fill and backtracking appears after this list.)
  • My Questions: Not a question, but I wish the authors had not left it to the reader as an exercise to improve the runtime to O(n^2) in total.
  • Second Time Around: Now that we've talked about the knapsack problem in class too, the concept of building up an optimal solution when you do dynamic programming problems iteratively is clearer to me: what's the optimal solution for input 1 alone, for inputs 1 and 2, for inputs 1 through 3, and so on.
  • Note to Self: O(n^3) is still “polynomial”…but that doesn't mean I like it.
  • Readability: 7 - a little more math than the other sections in this chapter so far, but still not bad.
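
A minimal Python sketch of the O(n^2) portion (my own code, with hypothetical names), assuming the errors e_{i,j} have already been precomputed into a table err with 1-based indices:

    def segmented_least_squares(err, C, n):
        # err[i][j] = least-squares error of one line through p_i..p_j (precomputed)
        # C = cost per segment. Returns OPT(n) and the chosen segments.
        M = [0.0] * (n + 1)               # M[0] = 0 is the empty problem
        for j in range(1, n + 1):
            # Multi-way choice: try every possible start i of the segment ending at j.
            M[j] = min(err[i][j] + C + M[i - 1] for i in range(1, j + 1))
        # Backtrack to recover the segments of the optimal solution.
        segments, j = [], n
        while j > 0:
            i = min(range(1, j + 1), key=lambda k: err[k][j] + C + M[k - 1])
            segments.append((i, j))
            j = i - 1
        return M[n], segments[::-1]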