Student Journals

This is an old revision of the document!

Section 6.4 - Subset Sums and Knapsacks: Adding a Variable In this section we look at the Knapsack problem where there are n items and each is given a weight wi, and each request i has a value vi and weight wi. Thus the problem is to make the knapsack as full as possible (with max capcity of W). This problem is similar to the Weighted-Interval Problem because as in W-I-P, the key to the problem is that the item is either included in the sack or not. For this problem, we don't include item n when the nth term is too big (W<wn), so OPT(n,W) = OPT(n-1, W). But if we do include n, then OPT(n,W) = wn + OPT(n-1, W - wn), where W - wn is the new W. So as before, the algorithm is designed to build up a table of all OPT(i,w) values while computing each of them only once. Therefore, after building up a table of solutions M, and each of the values M[i,w] have been computed in O(1) time by using the previous values the algorithm Subset-Sum(n, W) runs in O(nW) time. This runtime is known as pseudo-polynomial. To recover an optimal set S of items, an algorithm can trace back through the array M by a procedure similar to those we devloped in the previous secitons: Given a table M of the optimal values of the subproblems, the optimal set S can be found in O(n) time.

Section 6.5 - RNA Secondary Structure: Dynamic Programming over Intervals In this section the book looks at the problem of RNA secondary structure prediction. In this problem, a single stranded RNA molecule where the molecule can be viewed as a sequence of n symbols (bases) drawn form the alphabet {A, C, G, U}. Let B = b1b2…bn be a single stranded RNA molecule, where each bi is an element of the alphabet. Thus, we say that a secondary structure on B is a set of pairs S = {(i, j)}, where i, j are elements of {1, 2, …., n}, that satisfies the following conditions: 1. the ends of each pair in S are separated by at least four intervening bases; that is, if (i, j) E S, then i<j-4, 2. the elements of any pair in S consist of either {A, U} or {C, G}, 3. S is a matching: no base appears in more than one pair, and 4. if (i, j) and (k, l) are two pairs in S, then we cannot have i < k< j< l. For design purposes we describe OPT(i,j) denotes the maximum number of base pairs in a secondary structure on bibi+1…bj, and we know that OPT(i, j) = 0 whenever i≥j-4. Now we knot that in the optimal secondary structure on bibi+1….bj, either j is not involved in a pair; or j pairs with t for some t<j-4. If we rearrange this we can say that OPT(i, j) = max(OPT(i, j-1), max(1+OPT(i, t-1) + OPT(t+1, j-1))), where the max is taken over t such that bt and bj are an allowable base pair. Once we knwo this, we can describe the rest of the algorithm as: 1- initialize OPT(i,j) = 0 where i≥j-4. Next, go into a for loop of k=5…n-1, and then go into a nested for loop for i = 1….n-k. Within the two for loops the algorithm sets j = i+k and computes OPT(i,j). Once it has done this for all of n then it returns OPT(i, n). It is easy to bound the running time of this problem since there are O(n-squared) subproblems to solve, and evaluating the recurrence of OPT(i,j) takes time O(n) for each. THus the running time is O(n-cubed).

Section 6.6 - Sequence Alignment In this section we look at the problem of how one should define similarity between two strings. To answer this problem we first need two strings to “line up”.