Chapter 5
- Divide and Conquer: class of algorithmic techniques in which one breaks the input into several parts, solves the problem in each part recursively, and then combines the solutions to these subproblems into an overall solution
- Recurrence Relation: bounds the running time recursively in terms of the running time on smaller instances
- The divide-and-conquer strategy may reduce the running time from the brute-force polynomial to a lower-degree polynomial.
5.1 A First Recurrence: The Mergesort Algorithm
- Mergesort: sorts a given list of numbers by first dividing them into two equal halves, sorting each half separately by recursion, and then combining the results of these recursive calls using the linear-time algorithm for merging sorted lists
- Base Case: when the input has been reduced to size 2, T(n) is a constant.
- Stop the recursion and sort the two elements directly
- For some constant c, T(n) ≤ 2T(n/2) + cn when n > 2, and T(2) ≤ c
- To gain an explicit bound, we need to solve the recurrence relation so that T appears only on the left-hand side of the inequality, not the right-hand side as well.
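A minimal Python sketch of the mergesort just described (the function name and the use of list slicing are my own choices, not from the text): divide into halves, recurse on each, then merge the two sorted halves in linear time.

```python
def mergesort(a):
    # Base case: lists of size 0 or 1 are already sorted.
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    left = mergesort(a[:mid])
    right = mergesort(a[mid:])
    # Linear-time merge of the two sorted halves.
    merged = []
    i = j = 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    merged.extend(left[i:])
    merged.extend(right[j:])
    return merged
```

The merge step does a constant amount of work per element, which is where the cn term in the recurrence comes from.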
Approaches to Solving Recurrences
- “unroll” the recursion, accounting for the running time across the first few levels and identifying a pattern that can be continued as the recursion expands; summing the running times over all levels gives the total running time
- Start with a guess, substitute in the recurrence relation, check if it works; Use an argument by induction on n to formally justify this approach
Unrolling the MergeSort Recurrence
- Analyzing the first few levels: single problem of size n
- level 0: takes at most cn plus the time spent in all subsequent recursive calls
- level 1: two problems of size n/2; each takes at most cn/2 time
- level 2: four problems of size n/4; each takes at most cn/4 time
- Identifying a pattern: at level j, the number of subproblems has doubled j times, giving 2^j of them; each has shrunk in size by a factor of two j times, so each has size n/2^j; each level thus contributes at most 2^j(cn/2^j) = cn to the total running time.
- Summing over all levels of recursion: we've found that the recurrence in (5.1) has the property that the same upper bound of cn applies to total amount of work performed at each level. Number of levels: logn. Total running time = O(nlogn)
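The unrolling argument can be sanity-checked numerically. A small sketch (assuming c = 1 and n a power of two, both my choices) evaluates the recurrence with equality and compares it against the cn·log2(n) bound:

```python
from math import log2

def T(n, c=1):
    # Recurrence T(n) <= 2T(n/2) + cn with T(2) <= c, taken with
    # equality for n a power of two (hypothetical c = 1).
    if n <= 2:
        return c
    return 2 * T(n // 2, c) + c * n

# The unrolling argument predicts T(n) <= cn * log2(n).
for n in [4, 8, 16, 1024]:
    assert T(n) <= n * log2(n)
```

The bound is not tight at the bottom level (T(2) = c rather than 2c), which is why the computed values sit slightly below cn·log2(n).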
Substituting a Solution into the Mergesort Recurrence
- Belief: T(n) ≤ cnlogn for all n ≥ 2
- want to check if this is true
- Plug it into the recurrence:
- for n=2: true, since T(2) ≤ c ≤ 2c = cnlogn
- By induction: assume T(m) ≤ cmlogm for all values of m less than n, and we want to establish this for T(n).
- T(n) ≤ 2T(n/2) + cn ≤ 2c(n/2)log(n/2) + cn = cn(logn − 1) + cn = cnlogn, using log(n/2) = logn − 1
An Approach Using Partial Substitution
- One guesses the overall form of the solution without pinning down the exact values of all the constants and other parameters at the outset.
- Suppose we believe that T(n) = O(nlogn)
- First write T(n) ≤ kn log_b n for some constant k and base b
- Try out one level of the induction as follows
- T(n) ≤ 2T(n/2) + cn ≤ 2k(n/2)log_b(n/2) + cn
- Choose 2 as the base, so log(n/2) = logn − 1 simplifies this to:
- T(n) ≤ (knlogn) − kn + cn
- If k is at least as large as c, then:
- T(n) ≤ (knlogn) − kn + cn ≤ knlogn
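The final inequality, that knlogn − kn + cn ≤ knlogn whenever k ≥ c, is easy to check numerically. A small sketch, assuming c = 1 and powers of two (both my choices):

```python
from math import log2

c = 1.0  # hypothetical constant from the recurrence
for k in (c, 2 * c):          # any k >= c should work
    for n in [4, 8, 16, 1024]:
        # One level of the recurrence, using the guessed bound on T(n/2).
        lhs = 2 * k * (n / 2) * log2(n / 2) + c * n
        # The guessed bound itself.
        rhs = k * n * log2(n)
        assert lhs <= rhs + 1e-9
```

With k = c exactly, the two sides coincide, which is why the choice k = c is the smallest constant that closes the induction.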
Personal Thoughts
Using mergesort as the example was helpful, as I am pretty familiar with that algorithm at this point. We went over it in class which helped me follow along with the step-by-step process of coming up with an appropriate recurrence relation. Still though, this material is a little difficult for me and I know I'll need more practice with its application before I really understand it.
Readability: 6.0 Interesting: 6.0
5.2 Further Recurrence Relations
- Divide-and-conquer algorithms that create recursive calls on q subproblems of size n/2 each and then combine in O(n) time
- For some constant c, T(n) ≤ qT(n/2) + cn when n > 2, and T(2) ≤ c.
The Case of q>2 Subproblems
- Unrolling in the case q>2:
- Analyzing the first few levels (example for the case q=3): at the first level of recursion, a single problem of size n takes at most cn time plus the time spent in all subsequent recursive calls; at the next level, there are q problems, each of size n/2, and each takes at most cn/2 time, for a total of at most (q/2)cn, again plus the time in subsequent recursive calls…
- Identifying a pattern: at level j, there are q^j distinct instances, each of size n/2^j. Thus the total work at level j is at most q^j(cn/2^j) = (q/2)^j · cn.
- Summing over all levels of recursion: logn levels of recursion → geometric sum. Any function T(.) satisfying “for some constant c, T(n) ≤ qT(n/2) + cn when n > 2, and T(2) ≤ c” with q > 2 is bounded by O(n^(log2 q))
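The q > 2 bound can be checked numerically for a concrete instance. A sketch for q = 3 (assuming c = 1, with the constant 2 in the bound an empirical choice of mine, not from the text):

```python
from math import log2

def T(n, q=3, c=1):
    # Recurrence T(n) <= qT(n/2) + cn with T(2) <= c, taken with
    # equality for n a power of two (hypothetical q = 3, c = 1).
    if n <= 2:
        return c
    return q * T(n // 2, q, c) + c * n

# The unrolling argument bounds T(n) by O(n^(log2 q)); log2(3) ~ 1.585.
for n in [4, 8, 16, 1024]:
    assert T(n) <= 2 * n ** log2(3)
```

The exponent log2(3) shows up because the number of subproblems triples while their size only halves, so the bottom level of the recursion dominates.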
- Applying Partial Substitution:
- For q > 2, guess that the solution has the form T(n) ≤ kn^d for some constants k > 0 and d > 1.
- T(n) ≤ qT(n/2) + cn
- ≤ qk(n/2)^d + cn = (q/2^d)kn^d + cn
- choose d so that q/2^d = 1 → d = log2 q
- get rid of the cn term → change the form of our guess for T(n) so as to explicitly subtract it off
- Handle base case and choose k: choose k large enough so that the formula is a valid upper bound for the case n=2
The Case of One Subproblem
- Consider case of q=1
- Unrolling the recurrence:
- Analyzing the first few levels: first, a single problem of size n that takes at most cn time; next, one problem of size n/2 which contributes cn/2; next, one problem of size n/4 which contributes cn/4
- Identifying a pattern: at level j, the single instance has size n/2^j and contributes cn/2^j to the running time
- Summing over all levels of recursion: there are logn levels of recursion, and the total work is bounded by the geometric sum cn(1 + 1/2 + 1/4 + …), so T(n) ≤ 2cn = O(n)
- Any function T(.) satisfying “for some constant c, T(n) ≤ qT(n/2) + cn when n > 2, and T(2) ≤ c” with q = 1 is bounded by O(n)
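The q = 1 case can also be checked numerically; a sketch assuming c = 1 and powers of two (my choices):

```python
def T(n, c=1):
    # q = 1: one subproblem of half the size.
    # T(n) <= T(n/2) + cn with T(2) <= c, taken with equality.
    if n <= 2:
        return c
    return T(n // 2, c) + c * n

# The geometric sum cn(1 + 1/2 + 1/4 + ...) is at most 2cn, so T(n) = O(n).
for n in [4, 8, 16, 1024]:
    assert T(n) <= 2 * n
```

The computed values approach 2cn from below, matching the observation that fully half the total work happens at the top level.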
- Geometric series with a decaying exponent: fully half the work performed by the algorithm is being done at the top level of the recursion
- The Effect of Parameter q: when q=1, the resulting running time is linear; when q=2, it's O(nlogn); when q>2, it's a polynomial bound with an exponent larger than 1 that grows with q
- When q=1, the total running time is dominated by the top level, whereas when q>2, it's dominated by the work done on constant-size subproblems at the bottom of the recursion
A Related Recurrence: T(n) ≤ 2T(n/2) + O(n^2)
- Divide the input into two pieces of equal size, solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending quadratic time for the initial division and final recombining.
- For some constant c, T(n) ≤ 2T(n/2) + cn^2 when n > 2, and T(2) ≤ c
- first reaction is to guess that the solution will be T(n)=O(n^2logn)
- true but we can also show a stronger upper bound
- Unrolling:
- Analyzing the first few levels: first, a single problem of size n that takes at most cn^2 time plus the time spent in all subsequent recursive calls; next, we have two problems, each of size n/2. Each takes at most c(n/2)^2 = cn^2/4 time, for a total of at most cn^2/2
- Identifying a pattern: at level j, there are 2^j subproblems, each of size n/2^j, so level j contributes at most 2^j · c(n/2^j)^2 = cn^2/2^j
- Summing over all levels of recursion: we've arrived at almost the same sum that we had for q=1 in the previous recurrence: the total is O(n^2)
- The initial guess of O(n^2 logn) overestimated because of how quickly the n^2 term decreases as n is halved: the work per level shrinks geometrically, so the top level dominates.
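This stronger O(n^2) bound can be checked the same way as the earlier recurrences; a sketch assuming c = 1 and powers of two (my choices):

```python
def T(n, c=1):
    # T(n) <= 2T(n/2) + cn^2 with T(2) <= c, taken with equality.
    if n <= 2:
        return c
    return 2 * T(n // 2, c) + c * n * n

# The level sums cn^2, cn^2/2, cn^2/4, ... form a converging geometric
# series, so the total is O(n^2) rather than O(n^2 logn).
for n in [4, 8, 16, 1024]:
    assert T(n) <= 2 * n * n
```

As with q = 1, the top-level term dominates: the total stays within a factor of 2 of the cn^2 spent at the first level.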
