Table of Contents
Overview
definition:P234 Divide and conquer refers to a class of algorithmic techniques in which one breaks the input into several parts, solves the problem ifi each part recursively, and then combines the solutions to these subproblems into an overall solution
important concept
recurrence relation that bounds the running time recursively in terms of the running time on smaller instance.
5.1 A First Recurrence: The Mergesort Algorithm
Summary Divide-and-conquer process
- Break up problem into several parts
- Solve each part recursively
- Combine solutions to sub-problems into overall solution
Mergesort recurrence relation: T(n) < 2T(n/2) + cn
Solving Recurrences: two methods
unroll the recurrence:
- Look for patterns in runtime at each level
- Sum up running times over all levels
Substitute guess solution into recurrence:
* Check that it works
- Induction on n
interesting/readable: 8/8
5.2 Further Recurrence Relations
summary
A more general class of algorithms is obtained by considering divideand- conquer algorithms that create recursive calls on q subproblems of size n/2 each and then combine the results in O(n) time.
recurrence relation: T(n) < qT(n/2) + cn
then unroll the recurrence:
At an arbitrary levelj, we have qJ distinct instances, each of size n/2]. Thus the total work performed at level j is qJ(cn/2^j) = cn*(q/2)^j.
More specifically, cn/2^i is the size of the subproblem, and we have for each level q subproblems so q^j problems. I think of this as a j-level nested for loops.
T(n) ≤ Σj=0,logn cn*(q/2)^j [cn*(q/2)^logn-cn]/(logn-1) = cn*[(q/2)^logn-1]/(logn-1)
another handy identity for any a > 1 and b > 1, we have a^logb = b^loga
therefore, we have (cn/logn-1)*[n^log(q/2)-1] which is some constant k for the first part k*n^log2q
aha.
Use recurrences to analyze the run time of divide and conquer algorithms
- Number of sub problems
- Size of sub problems
- Number of times divided (number of levels)
- Cost of merging problems
interesting/readable: 9/9
5.3 Counting Inversions
summary: The Problem Movie site tries to match your movie preferences with others You rank n movies Movie site consults database to find people with similar tastes
ollaborative filtering
We need a way to Comparing Rankings: need a similarity metric: how to measure how similar two peopl's preferences are?
specifically, count the number of inversions between two rankings kind of reminds of the earth movers distance in the research, which is the amount of work to shift a histogram into another.\
Brute Force Solution Look at every pair (i,j) and determine if they are an inversion obviously runs in N^2
D/C algo:
• Divide: separate list into two pieces • Conquer: recursively count inversions in each half • Combine: count inversions where ai and aj are in different halves, and return sum of three quantities
Combine: count blue-green inversions Assume each half is sorted Count inversions where ai and aj are in different halves Merge two sorted halves into sorted whole
Recurrence Relation: T(n) ≤ T(n/2) + T(n/2) + O(n) T(n) –> nlogn
implementation:
Sort-and-Count(L) if list L has one element return 0 and the list L Divide the list into two halves A and B (iA, A) = Sort-and-Count(A) (iB, B) = Sort-and-Count(B) (i, L) = Merge-and-Count(A, B) return i = iA + iB + i and the sorted list L
interesting/readable: 7/7
5.4 Finding the closest Pair of Points
Closest pair. Given n points in the plane, find a pair with smallest Euclidean distance between them. Brute force N^2
simplified case: if all points are on one line, then we only need to compare in one dimension. so sort the pioints in either dimension: nlogn iterate through it and find the closest pair: n total nlogn
D/C:
Divide: draw vertical line L so that roughly ½n points on each side Combine: find closest pair with one point in each side
better combine: Find closest pair with one point in each side, assuming that distance < δ where δ = min(left_min_dist, right_min_dist)
we only need to consider points within the strip enclosed by δ.
Sort remaining points by y-coordinate. Scan points in y-order and compare distance between each point and next 7 neighbors. If any of these distances is less than δ, update δ.
O(nlog2n) logn levels and every level is nlogn if we sort every time
could go down to nlogn if we sort only once and return the sorted lists each time.
interesting/readbale: 7/7
5.5 Integer multiplication
summary: Problem: the multiplication of two integers If we use brute force, it is an N2 algorithm. We want to do better than that. Additions and subtractions are cheaper, substituting those for multiplications and divisions can sometimes help the complexity. We do this by doing divide and conquer.
Attempt 1 : represent the x and y in binary form, multiplication produces 4 instances 4 way branching T(n)= 4T(n/2) + cn does not help. ended up with O(N2)
if we could get q down tp 3, we have some savings.
Matrix multiplications: Divide and conquer in MM: devide into submatrices and combine the results decimal wars:
Best known. O(n2.376) [Coppersmith-Winograd, 1987] But really large constant • Conjecture. O(n2+ε) for any ε > 0. • Caveat. Theoretical improvements to Strassen are progressively less practical.
Interesting/reabale: 9/9