Chapter 6.3: Segmented Least Squares

In previous problems, we've developed recurrence relations based on a binary choice, either n belonged to the optimal solution or it didn't. This problem, however, involves multiway choices; i.e. at each step, we have a polynomial number of possibilities to consider for the structure of the optimal solution.

A good example problem is one where we try to find the line(s) of best fit for a set of points. We are given a set of points P={(x_1,y_1), (x_2,y_2),…, (x_n,y_n)} with x_1<x_2<…<x_n. We will use p_i to denote the point (x_i,y_i). We must first partition P into some number of segments. The penalty of a partition is the sum of the following terms:

  1. The number of segments into which we partition P, times a fixed, given multiplier C>0
  2. For each segment, the error value of the optimal line through that segment

Our goal is to find a partition of minimum penalty.

Algorithm

In this problem, it is important to remember that the last point p_n belongs to a single segment in the optimal partition, and that segment begins at some earlier point p_i. Knowing that the last segment is p_i,…,p_n, we can remove those points from consideration and solve the problem on the remaining points p_1,…p_i-1. So the algorithm looks like this:

Segmented-Least-Squares(n):

(end for)

Find-Segments(j):

Computing all e_i,j values is O(n^3) running time, and the remaining problem runs in O(n^2) time.

I would rate this section a 7. I understood it, but I feel like they could have done a better job setting up the problem. I think the motivation for the problem was explained better in class than in this section.