Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2014:journals:emily:entryeight [2014/03/25 20:19] – [Chapter 6.2] hardyecourses:cs211:winter2014:journals:emily:entryeight [2014/03/26 03:04] (current) – [Chapter 6.4] hardye
Line 108: Line 108:
 ====== Chapter 6.3 ====== ====== Chapter 6.3 ======
  
 +**Segmented Least Squares: Multi-way Choices**
 +
 +In the previous section at each step the subproblem belonged to the solution or it didn't. In this section the recurrence involves "multi-way choices". This means at each step there are polynomial possibilities to consider for the structure of the optimal solution.
 +
 +The problem we are trying to solve is finding the line that best fits data plotted on x and y axes. The data we are given is a set of points (x, y) that are in ascending order of x. We assume each point has a unique x value. The error of a line L, with respect to the set of points P is the sum of its squared distances to the points in P. We want to find the line with minimum error, but instead of using just one line, we seek a set of lines through the points that minimized the error using as few lines as possible. This is known as the segmented least squares problem. We have to detect change in the points; identify where the change occurs from one line to another.
 +
 +We partition the set of points P into subsets that has a segment S. For each subset we compute the line of minimum error. The penalty for each subset is: the number of segments we partition P * a constant C > 0 and the error value of the optimal line through each segment. We want to find a partition of minimum penalty. 
 +
 +**Designing the Algorithm**
 +
 +We are seeking to partition n objects. We want a polynomial number of subproblems that yield a solution and we should build up these solutions using recurrence.
 +
 +The past point p<sub>n</sub> belongs to a segment in the optimal partition that begins at some point before p<sub>n</sub> at p<sub>i</sub>. Let opt(i) equal the optimum solution for points p<sub>1</sub> through p<sub>i</sub>. e<sub>i,j</sub> represents the minimum error of any line from points p<sub>i</sub> to p<sub>j</sub>
 +
 +**6.1** if the last segment of the optimal partition is p<sub>i</sub>,...,p<sub>n</sub> , then the value of the optimal solution is opt(n) = e<sub>i,n</sub> + C + opt(i-1).
 +
 +To find the best way to produce a segment we find the minimum 
 +
 +**6.7** for the subproblem on the points p<sub>1</sub>,...,p<sub>j</sub>, opt(j) = min(1=<i=<j)(e<sub>i,j</sub> + C + opt(i-1).
 +
 +    Segmented-Least-Squares(n)
 +      Array M[0...n]
 +      Set M[0] = 0
 +      For all pairs i =< j
 +        compute the least squares error e(i,j) for the segment p(i)...p(j)
 +      Endfor
 +      For j = 1,2,...n
 +        M[j] = min(1=<i=<j)(e(i,j) + C + opt(i-1)
 +      Endfor
 +      Return M[n]
 +      
 +This algorithms correctness is proved by induction. To find the points that are in a segment and not the value that is stored in the array, we use an algorithm like in the previous section
 +
 +    Find-Segments(j)
 +      If j=0 then
 +        Output nothing
 +      Else
 +        Find an i that minimizes e(i,j) + C + M[i-1] 
 +        Output the segment p(i)...p(j) and the result of Find-Segments(i-1)
 +      Endif
 +
 +**Analysis:**
 +
 +We perform the value of the least squares errors for n<sup>2</sup> pairs that we compute at O(n) times, so the runtime of computing e<sub>i,j</sub> values is O(n<sup>3</sup>). Once we have all of the error values, for each value we have to determine the minimum recurrence that takes O(n) for each j, so this is O(n<sup>2</sup>). 
 +
 +I thought this section was harder to read than the previous two because it was a lot more generalized and assumed you knew more about dynamic programming. I thought this problem was pretty interesting though because I like statistics! Readability: 7/10.
 ====== Chapter 6.4 ====== ====== Chapter 6.4 ======
    
 +**Subset Sums and Knapsacks: Adding a Variable**
 +
 +In this section we solve a problem that has duration and deadline but not have an interval by which they need to be done. The problem we are considering is a machine that processes a set of requests 1 through n where we can only use the machine from 0 to W and each job has a process time w<sub>i</sub>. We want to choose the jobs that use the machine as much as possible up to the time W. We can make this problem more general into the knapsack problem where each request has a value and a weight and we want to chose a subset of maximum total value that does not exceed the weight W of the knapsack. We cannot use a greedy approach to solve this problem! 
 +
 +**The Algorithm:**
 +
 +One way we could try the algorithm is to consider the first i requests but this does not work. We need more subproblems. To find opt(n) we need the value of opt(n-1) and the best solution of subset n-1 and total allowed weight W-w<sub>n</sub>. Opt(i,w) will be the value of the optimal solution of a subset 1 through i with maximum allowed weight w. If n is not in the solution then opt(n,W) = opt(n-1,W). If n is in the solution then opt(n,W) = w<sub>n</sub> + opt(n-1, W-w<sub>n</sub>). When W is less than w<sub>n</sub> then n cannot be in the solution. 
 +
 +    Subset-Sum(n,W)
 +      Array M[0..n, 0..w]
 +      initialize M[0,w] = 0 for each w = 0, 1,...W
 +      for i = 1,2,...,n
 +        for w = 0,...W
 +          M[i,w] = opt(i,w) = max(opt(i-1, w),w(i) + opt(i-1,w-w(i))
 +        Endfor
 +      Endfor
 +      Return M[n, W]
 +      
 +We prove the algorithms correctness by induction. Instead of a single array, we use a matrix of subproblems to build up the values. We compute each of the values in M in constant time so the runtime is the number of entries in M which is O(nW). It is a polynomial function of n and W, not just n. This is //pseudo-polynomial//. To get the set of items and not just the value we trace back through the array as we did in previous problems. This can be found in O(n) time. 
 +
 +This section was really interesting to me and seems like something really practical and would be used a lot. I thought like the previous section it cut out a lot of the in depth descriptions but it was a little more easy to read because I understood it better with more experience. Readability 8/10
courses/cs211/winter2014/journals/emily/entryeight.1395778764.txt.gz · Last modified: 2014/03/25 20:19 by hardye
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0