6.1 Weighted Interval Scheduling
Summary
Designing a Recursive Algorithm
- No natural greedy algorithm is known for this problem
- n requests labeled 1…n with each request i specifying a start time si and finish time fi; each interval i also has a value (or weight) vi; two intervals are compatible if they don’t overlap
- goal: select a subset of mutually compatible intervals to maximize the sum of the values of the selected intervals
- p(j): for an interval j, it is the largest index i<j such that intervals i and j are disjoint (i is the rightmost interval that ends before j begins)
- optimal solution Oj: either j is in Oj, in which case OPT(j) = vj + OPT(p(j)), or j is not in Oj in which case OPT(j) = OPT(j-1)
- Therefore, OPT(j) = max(vj+OPT(p(j)), OPT(j-1))
- n belongs to the optimal solution if and only if the first of the options above is at least as good as the second
- Request j belongs to an optimal solution on the set {1, 2, …, j} if and only if vj+OPT(p(j)) >= OPT(j-1)
- Recursive algorithm to compute OPT(n) assuming we have already sorted the requests by finish time and computed the values of p(j) for each j
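Since the requests are sorted by finish time, all of the p(j) values can be found with one binary search per interval. A minimal Python sketch (the name compute_p and its 1-indexed conventions are my own, not from the text):

```python
import bisect

def compute_p(starts, finishes):
    """Given intervals sorted by finish time, return p[1..n], where p[j]
    is the largest index i < j with finishes[i] <= starts[j] (0 if none)."""
    n = len(starts)
    p = [0] * (n + 1)  # 1-indexed; p[0] is unused
    for j in range(1, n + 1):
        # count of intervals among the first j-1 that finish by the start of j
        p[j] = bisect.bisect_right(finishes, starts[j - 1], 0, j - 1)
    return p
```

Because each of the n intervals does one binary search over the sorted finish times, this takes O(n log n) overall.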
Compute Opt Algorithm
Compute-Opt(j):
If j=0:
Return 0
Else:
Return max(vj + Compute-Opt(p(j)), Compute-Opt(j-1))
Endif
- Compute-OPT(j) correctly computes OPT(j) for each j = 1, 2, …, n
- The above algorithm for Compute-Opt as written would take exponential time in the worst case
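The recursion above translates directly into Python; a sketch assuming v and p are 1-indexed lists with dummy entries at index 0 (my own conventions, not from the text):

```python
def compute_opt(j, v, p):
    # Compute-Opt(j): OPT(0) = 0; otherwise take the better of including
    # request j (value v[j] plus OPT(p(j))) or skipping it (OPT(j-1)).
    if j == 0:
        return 0
    return max(v[j] + compute_opt(p[j], v, p),
               compute_opt(j - 1, v, p))
```

Without memoization, the two recursive calls can blow up into exponentially many repeated subproblem evaluations.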
Memoizing the Recursion
- The fact that the algorithm as written takes exponential time is due to the redundancy of its recursive calls: the same subproblems are solved over and over
- Solution: store the value of Compute-Opt in a globally accessible place the first time we compute it and then simply use this precomputed value in place of all future recursive calls
- Memoization: the technique of saving values that have already been computed
- Memoized Compute-Opt would make use of an array M[0…n]; M[j] will start with the value “empty” but will hold the value of Compute-Opt(j) as soon as it is first determined
Compute Opt with Memoization
M-Compute-Opt(j):
If j=0:
Return 0
Elif M[j] is not empty:
Return M[j]
Else:
Define M[j] = max(vj + M-Compute-Opt(p(j)), M-Compute-Opt(j-1))
Return M[j]
Endif
Analyzing the Memoized Version
- The runtime of M-Compute-Opt(n) is O(n) (assuming the input intervals are sorted by their finish time)
- Since M has only n+1 entries, it follows that there can be at most O(n) calls to M-Compute-Opt.
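A Python sketch of the memoized version, using a dict as the cache M (same illustrative 1-indexed conventions for v and p as before):

```python
def m_compute_opt(j, v, p, M):
    # M caches OPT values already computed, so each subproblem is solved
    # at most once; later calls return the stored value immediately.
    if j == 0:
        return 0
    if j in M:
        return M[j]
    M[j] = max(v[j] + m_compute_opt(p[j], v, p, M),
               m_compute_opt(j - 1, v, p, M))
    return M[j]
```

Each index j is computed once and cached, which is what brings the runtime down to O(n).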
Computing a Solution in Addition to Its Value
- Extend the memoized solution to keep track of an optimal solution in addition to its value: we could maintain an additional array S so that S[i] contains an optimal set of intervals among {1, 2, …, i}
- Avoid runtime blow-up by not explicitly computing S but rather by recovering the optimal solution from values saved in the array M after the optimum value has been computed
Retroactively Computing Optimal Solution
Find-Solution(j):
If j=0:
Output nothing
Else:
If vj+M[p(j)] >= M[j-1]:
Output j together with the result of Find-Solution(p(j))
Else:
Output the result of Find-Solution(j-1)
Endif
Endif
- Total of O(n) recursive calls
- Given the array M of the optimal values of the sub-problems, Find-Solution returns an optimal solution in O(n) time
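Find-Solution in Python, returning the chosen request indices in increasing order; M here is assumed to be a 0-indexed list with M[j] = OPT(j) already filled in (these conventions are mine):

```python
def find_solution(j, v, p, M):
    # Walk back through M: request j is in an optimal solution for {1,...,j}
    # exactly when v[j] + OPT(p(j)) >= OPT(j-1).
    if j == 0:
        return []
    if v[j] + M[p[j]] >= M[j - 1]:
        return find_solution(p[j], v, p, M) + [j]
    return find_solution(j - 1, v, p, M)
```

Each recursive call strictly decreases j, so there are at most O(n) calls in total.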
Additional Information
On a scale of 1 to 10, the readability of this section was an 8. The authors did a really good job of explaining memoization and why it is necessary for this problem. They also did a great job introducing dynamic programming because they started with a problem that is semi-familiar to us.
6.2 Principles of Dynamic Programming
Summary
Designing the Algorithm
- Basic principles of dynamic programming: iterating over subproblems rather than computing solutions recursively
- Can directly compute the entries in M by an iterative algorithm rather than memoized recursion
Weighted Intervals Iterative Solution
Iterative-Compute-Opt:
M[0] = 0
For j = 1, 2, …, n:
M[j] = max(vj+M[p(j)], M[j-1])
Endfor
Analyzing the Algorithm
- Iterative-Compute-Opt is O(n) since it explicitly runs for n iterations and spends constant time in each
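Iterative-Compute-Opt in Python (same illustrative 1-indexed conventions for v and p as above); it fills M left to right, so M[p[j]] and M[j-1] are always available when needed:

```python
def iterative_compute_opt(v, p):
    n = len(v) - 1          # v and p are 1-indexed with dummy entries at 0
    M = [0] * (n + 1)       # M[0] = 0
    for j in range(1, n + 1):
        M[j] = max(v[j] + M[p[j]], M[j - 1])
    return M
```

The returned array M can then be handed to the traceback routine from 6.1 to recover an optimal set of intervals.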
A Basic Outline of Dynamic Programming
- To set about developing an algorithm based on dynamic programming, one needs a collection of subproblems derived from the original problem that satisfies a few basic properties
- Polynomial number of subproblems
- Solution to original problem easily computed from the solutions to subproblems
- Natural ordering on subproblems from “smallest” to “largest”, together with an easy-to-compute recurrence that allows one to determine the solution to a subproblem from the solutions to some number of smaller subproblems
- Never clear that a collection of subproblems will be useful until one finds a recurrence linking them together; but it can be difficult to think about recurrences in the absence of the “smallest” subproblems that they build on
Additional Information
On a scale of 1 to 10, the readability of this section was a 9. This was pretty much an informational section. We merely learned an iterative version of the solution to the problem discussed in the section before, which uses loops so it is easy to comprehend. Also, the rest of the section was just explaining the principles of dynamic programming, so it was heavily definition based.
6.3 Segmented Least Squares
Summary
The Problem
- Suppose our data consists of a set P of n points in the plane denoted (x1, y1) … (xn, yn); suppose x1<x2<…<xn
- Given a line L defined by the equation y = ax + b, we say that the error of L with respect to P is the sum of its squared “distances” to the points in P
- A natural goal is to find the line with minimum error
- Rather than seeking a single line of best fit, we are allowed to pass an arbitrary set of lines through the points and seek a set of lines that minimizes the error
- Need a problem formulation that requires us to fit the points well, using as few lines as possible
- This is called the Segmented Least Squares Problem
- Change detection: given a sequence of data points, we want to identify a few points in the sequence at which a discrete change occurs
- Must first partition P into some number of segments (a segment is a subset of P that represents a contiguous set of x-coordinates)
- For each segment S in P, we compute the line minimizing the error with respect to the points in S
- Penalty of a partition:
- Number of segments into which we partition P, times a fixed, given multiplier c>0
- For each segment, the error value of the optimal line through that segment
- Goal: find partition of minimum penalty
- Exponentially many possible partitions of P
- Dynamic programming used to find a partition of minimum penalty in time polynomial in n
Designing the Algorithm
- Last point pn belongs to a single segment in the optimal partition and that segment begins at some earlier point pi
- If we knew the identity of the last segment pi, …, pn then we could remove those points from consideration and recursively solve the problem on the remaining points p1, …, pi-1
- OPT(i) denotes the optimum solution for the points p1, …, pi, and ei,j denotes the minimum error of any line with respect to pi, pi+1, …, pj
- If the last segment of the optimal partition is pi,…,pn then the value of the optimal solution is OPT(n) = ei,n + c + OPT(i-1)
- For the subproblem on the points p1,…,pj, OPT(j) = min over 1<=i<=j of (ei,j + c + OPT(i-1)), and the segment pi,…,pj is used in an optimal solution for the subproblem if and only if the minimum is obtained using index i
Segmented Least Squares Algorithm
Segmented-Least-Squares(n):
Array M[0..n]
Set M[0] = 0
For all pairs i<=j:
Compute the least squares error ei,j for the segment pi,…,pj
Endfor
For j=1, 2, …, n:
Use the recurrence M[j] = min over 1<=i<=j of (ei,j + c + M[i-1]) to compute M[j]
Endfor
Return M[n]
- Trace back through array M to find the optimum partition
Retroactively Compute the Optimal Solution
Find-Sequence(j):
If j=0:
Output nothing
Else:
Find an i that minimizes ei,j + c + M[i-1]
Output the segment {pi,…,pj} and the result of Find-Sequence(i-1)
Endif
Analyzing the Algorithm
- Segmented-Least-Squares: there are O(n^2) pairs (i, j) for which this computation is needed; for each pair (i, j) we can compute ei,j in O(n) time
- Total runtime to compute all ei,j values is O(n^3), but it is possible to do this in O(n^2) time
- The algorithm then has n iterations; each determines the minimum in the recurrence OPT(j) = min(ei,j + c + OPT(i-1)), which takes O(n) time
- Total runtime of O(n^2) for this step
- Therefore, algorithm has overall runtime of O(n^2)
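Putting the pieces together, a Python sketch of the whole algorithm, using the standard closed-form simple linear regression for each ei,j and a back-pointer array to recover the segments. All names are my own, and I use the straightforward O(n^3) error precomputation rather than the O(n^2) refinement mentioned above:

```python
def segmented_least_squares(xs, ys, c):
    n = len(xs)
    # e[i][j]: least-squares error of the best single line through points i..j
    e = [[0.0] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(i, n + 1):
            pts = list(zip(xs[i - 1:j], ys[i - 1:j]))
            m = len(pts)
            sx = sum(x for x, _ in pts)
            sy = sum(y for _, y in pts)
            sxx = sum(x * x for x, _ in pts)
            sxy = sum(x * y for x, y in pts)
            denom = m * sxx - sx * sx
            if denom == 0:           # a single point: a flat line through it is exact
                a, b = 0.0, sy / m
            else:                    # closed-form least-squares slope and intercept
                a = (m * sxy - sx * sy) / denom
                b = (sy - a * sx) / m
            e[i][j] = sum((y - a * x - b) ** 2 for x, y in pts)
    # dynamic program: M[j] = min over 1<=i<=j of e[i][j] + c + M[i-1]
    M = [0.0] * (n + 1)
    back = [0] * (n + 1)
    for j in range(1, n + 1):
        best_i = min(range(1, j + 1), key=lambda i: e[i][j] + c + M[i - 1])
        back[j] = best_i
        M[j] = e[best_i][j] + c + M[best_i - 1]
    # trace back through the back-pointers to recover the optimal partition
    segments = []
    j = n
    while j > 0:
        segments.append((back[j], j))
        j = back[j] - 1
    segments.reverse()
    return M[n], segments
```

On six points that lie exactly on two lines, a small penalty constant makes the algorithm split them into exactly those two segments, each with zero fitting error.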
Additional Information
One part of this section that made more sense after going to class was the actual formulation of the problem. This section had pretty much two pages worth of information on how we construct the exact problem that we are going to use. However, the slides in class made it short and sweet: we want to minimize E + cL, where E is the error, c is the cost constant, and L is the number of lines.
On a scale of 1 to 10, the readability of this section was a 7. Some of the explanations seemed a little circular and long-winded. I preferred the layout of the problem to be short and sweet like the slides did in class. Besides that though, the actual algorithms were fairly easy to comprehend and analyze for runtime.
