Entry Seven
Chapter 5.2
Further Recurrence Relations
This section focuses on solving recurrence relations more generally. We look at divide-and-conquer algorithms that make q recursive calls on subproblems of size n/2. This gives the recurrence relation T(n) ≤ qT(n/2) + cn.
When q > 2…
- level one: a single problem of size n, taking at most cn time plus the time spent in the recursive calls
- level two: q problems of size n/2, each taking at most cn/2 time, for a total of (q/2)cn
- next level: q^2 problems of size n/4, for a total of (q^2/4)cn = (q/2)^2 cn
- the pattern: level j has q^j problems of size n/2^j, so the total work at level j is (q/2)^j cn
- there are log₂ n levels of recursion
- the total work per level is increasing
- summing the work over all the levels gives a geometric series with ratio r = q/2 > 1
- we apply the formula for a geometric sum; since the ratio is greater than 1, the last (deepest) level dominates the total
(5.4) Any function T(·) satisfying T(n) ≤ qT(n/2) + cn with q > 2 is bounded by O(n^(log₂ q)).
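To make the level-by-level bookkeeping concrete, here is a small Python sketch (my own check, not from the book) that adds up the work of each level of the recursion tree and compares it to n^(log₂ q). Under these assumptions the ratio settles near a constant (roughly q/(q-2) for c = 1), which is what (5.4) predicts.

import math

def recursion_tree_work(n, q, c=1.0):
    # Level j of the recursion tree has q**j subproblems of size n/2**j,
    # each costing at most c*(n/2**j), so the level contributes (q/2)**j * c * n.
    levels = int(math.log2(n))              # about log2(n) levels of recursion
    return sum((q / 2) ** j * c * n for j in range(levels + 1))

# For q > 2 the work per level grows geometrically, so the deepest level
# dominates and the total behaves like a constant times n**log2(q).
q = 3
for n in (2 ** 10, 2 ** 14, 2 ** 18):
    print(n, recursion_tree_work(n, q) / n ** math.log2(q))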
When q = 1…
- level one: a single problem of size n, which takes cn time plus the time spent in subsequent recursive calls
- level two: one problem of size n/2, taking cn/2 time
- next level: one problem of size n/4, taking cn/4 time
- the total work per level is decreasing
- the pattern: level j has a single problem of size n/2^j with running time cn/2^j
- there are log₂ n levels of recursion
- summing over the levels gives a decreasing geometric series, cn(1 + 1/2 + 1/4 + ...) ≤ 2cn
(5.5) Any function T(·) satisfying T(n) ≤ qT(n/2) + cn with q = 1 is bounded by O(n).
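The same style of check for q = 1 (again just a sketch of my own): the per-level costs shrink geometrically, and the total never exceeds 2cn.

import math

n, c = 1 << 20, 1.0
# With q = 1, level j is a single subproblem of size n/2**j costing c*n/2**j,
# so the total work is a decreasing geometric series bounded by 2*c*n.
total = sum(c * n / 2 ** j for j in range(int(math.log2(n)) + 1))
print(total, 2 * c * n)   # the total stays just below 2*c*n, hence O(n)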
Another recurrence: T(n) ≤ 2T(n/2) + cn^2. Here level j has 2^j problems of size n/2^j, each costing c(n/2^j)^2, so the level total is cn^2/2^j; this is a decreasing geometric series, and the recurrence solves to O(n^2).
Chapter 5.3
Counting Inversions
In this section we apply divide and conquer to the problem of counting inversions; the algorithm is similar to merge sort. The motivating application is matching preferences, e.g. a website like Netflix comparing your rankings of a set of n movies with another person's. We label the movies 1 to n in our order, write down the other person's ranking in that order, and count how many pairs are out of order (inverted). Complete agreement means the other ranking is in ascending order; complete disagreement means it is in descending order. For example, if the other person's ranking written in our order is 2, 4, 1, 3, 5, then the inverted pairs are (2,1), (4,1), (4,3), so there are three inversions.
The Algorithm:
- checking every pair of numbers to see if they are inverted would take O(n^2) time (see the small baseline sketch after this list)
- our goal is O(n log n)
- divide the set of numbers into two halves of size n/2, count the inversions within each half separately, then count the inversions between the halves
- to count the inversions between the halves efficiently, we recursively sort each half
- after recursively sorting and counting the inversions in each half, apply Merge-and-Count
- Merge-and-Count should produce a single sorted list that is the union of the two halves
- it should also report the number of inversions between the two lists
- keep two pointers, one to the current element of each list
- the idea is to repeatedly append the smaller of the two current elements to the output list; whenever the element from the second half is the smaller one, it is inverted with the current element of the first half and with everything remaining after it
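As a baseline for the first bullet above, a direct check of every pair can be written in a few lines of Python (a sketch of my own, not the book's); it runs in O(n^2) time, which is exactly what the divide-and-conquer algorithm improves on:

def count_inversions_brute(ranking):
    # O(n^2) baseline: test every pair (i, j) with i < j directly
    return sum(1
               for i in range(len(ranking))
               for j in range(i + 1, len(ranking))
               if ranking[i] > ranking[j])

print(count_inversions_brute([2, 4, 1, 3, 5]))   # 3 inversions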
Merge-and-Count(A, B)
  Maintain a Current pointer into each list, initialized to point to the front elements
  Maintain a variable Count for the number of inversions, initialized to 0
  While both lists are nonempty:
    Let a_i and b_j be the elements pointed to by the Current pointers
    Append the smaller of these two to the output list
    If b_j is the smaller element then
      Increment Count by the number of elements remaining in A
    Endif
    Advance the Current pointer in the list from which the smaller element was selected
  Endwhile
  Once one list is empty, append the remainder of the other list to the output
  Return Count and the merged list
Analysis:
Each iteration of the while loop takes constant, O(1), time. The while loop executes at most n times in total, so the running time of Merge-and-Count is O(n).
Merge-and-Count is used inside another recursive procedure, Sort-and-Count:
Sort-and-Count(L)
  If the list has one element then
    there are no inversions
  Else
    Divide the list into two halves:
      A contains the first ⌈n/2⌉ elements
      B contains the remaining ⌊n/2⌋ elements
    (inversionsA, A) = Sort-and-Count(A)
    (inversionsB, B) = Sort-and-Count(B)
    (inversionsAB, L) = Merge-and-Count(A, B)
  Endif
  Return inversions = inversionsA + inversionsB + inversionsAB, and the sorted list L
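A runnable Python version of the two routines above, as a sketch (the function names are mine, not the book's):

def merge_and_count(a, b):
    # Merge two sorted lists and count the inversions between them.
    merged, count = [], 0
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            # b[j] is smaller than every element remaining in a,
            # so it forms an inversion with each of them.
            count += len(a) - i
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])
    merged.extend(b[j:])
    return count, merged

def sort_and_count(lst):
    # Return (number of inversions in lst, sorted copy of lst).
    if len(lst) <= 1:
        return 0, list(lst)
    mid = len(lst) // 2
    inv_a, a = sort_and_count(lst[:mid])
    inv_b, b = sort_and_count(lst[mid:])
    inv_ab, merged = merge_and_count(a, b)
    return inv_a + inv_b + inv_ab, merged

print(sort_and_count([2, 4, 1, 3, 5]))   # (3, [1, 2, 3, 4, 5])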
The recurrence for Sort-and-Count is T(n) ≤ 2T(n/2) + cn, the same recurrence as merge sort, which solves to O(n log n) time.
This section was very short and easy to read. No problems here. The pictures shown on the slides in class were especially helpful in understanding this section. Readability: 9/10
Chapter 5.4
Finding the Closest Pair of Points
The Problem:
We are given a plane with a set of n points. Our goal is to find the pair of points that is closest together with a runtime of O(n log n).
The Algorithm:
We are given the set of points P, where each point p_i has coordinates (x_i, y_i), and d(p_i, p_j) denotes the distance between points p_i and p_j. Our first step is to divide the points into two halves and find the smallest distance within each half. Then we "combine" the halves and look for a pair with one point in each half whose distance is even smaller. Recursion!!!
We sort the points by their x-coordinates into one list and by their y-coordinates into another list. Q is the set of points in the left half of the plane and R is the set of points in the right half. For each half we build its points sorted by x and by y (extracted from the already-sorted lists), which takes O(n) time. We then find the closest pair of points in Q and the closest pair of points in R.
Merging the solutions is the hard part. Let z denote the smaller of the minimum distances found in Q and R. If no pair of points (q, r) with q in Q and r in R has distance less than z, then we have found our solution. If there is such a pair (q, r) with a smaller distance, then both points must lie within distance z of L (the line that separates Q and R). This lets us restrict the set of points we have to look at: call this set S, and let Sy be S sorted by y-coordinate.
(5.10) If s, s' in S have the property that d(s, s') < z, then s and s' are within 15 positions of each other in the sorted list Sy.
Proof idea: partition the strip around L into boxes of side z/2. Any two points in the same box are within distance (z/2)·√2 = z·(√2/2) < z; since both would lie on the same side of L, this would contradict z being the minimum distance within Q or within R, so each box contains at most one point of S. Each row of the strip holds only 4 boxes, so if s and s' were 16 or more positions apart in Sy, at least three full rows of boxes would lie between them and their distance would be at least 3z/2 > z, again a contradiction.
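Here is a compact Python sketch of the whole algorithm under some simplifying assumptions of my own (at least two points, all points distinct, stored as (x, y) tuples); the inner strip loop is where (5.10) gets used, since each point only needs to be compared with the next 15 points of Sy.

import math

def dist(p, q):
    return math.hypot(p[0] - q[0], p[1] - q[1])

def closest_pair(points):
    # Sort once by x and once by y, then recurse (O(n log n) overall).
    px = sorted(points)
    py = sorted(points, key=lambda p: (p[1], p[0]))
    return _closest(px, py)

def _closest(px, py):
    n = len(px)
    if n <= 3:
        # Brute force on tiny inputs.
        pairs = [(dist(px[i], px[j]), (px[i], px[j]))
                 for i in range(n) for j in range(i + 1, n)]
        return min(pairs)
    mid = n // 2
    mid_x = px[mid][0]
    q_set = set(px[:mid])
    qy = [p for p in py if p in q_set]       # left half, sorted by y
    ry = [p for p in py if p not in q_set]   # right half, sorted by y
    z, pair = min(_closest(px[:mid], qy), _closest(px[mid:], ry))
    # Only points within z of the dividing line can beat z; scan them in
    # y order and, by (5.10), compare each with at most the next 15.
    strip = [p for p in py if abs(p[0] - mid_x) < z]
    for i, s in enumerate(strip):
        for t in strip[i + 1:i + 16]:
            d = dist(s, t)
            if d < z:
                z, pair = d, (s, t)
    return z, pair

print(closest_pair([(0, 0), (5, 4), (3, 1), (9, 6), (2, 7), (3, 2)]))
# -> (1.0, ((3, 1), (3, 2)))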
Analysis:
The book proves by induction that the algorithm correctly outputs a closest pair of points in P. The running time is O(n log n): the initial sorting of the points by x- and y-coordinate takes O(n log n) time, and the rest of the algorithm satisfies T(n) ≤ 2T(n/2) + cn, which is O(n log n), so the whole algorithm is bounded by O(n log n).
This section got really confusing when we were considering the subset S with all of the boxes and the distances. I'm still kind of confused about why the points have to be within 15 positions of each other. Overall readability: 8/10