Mergesort sorts a list by dividing it in half, sorting each half separately using recursion, and combining the two sorted lists. This is a template for many divide-and-conquer algorithms:
"Divide the input into two pieces of equal size; solve the two subproblems on these pieces separately by recursion; and then combine the two results into an overall solution, spending only linear time for the initial division and final recombining."
These types of algorithms need a base case. We then need to denote T(n) as the worst-case running time for a given problem of this type. We then break this running time down as the algorithm divides and conquers and we end up with a recurrence relation. There are two basic ways to solve a recurrence: 1.) “unroll” the recursion; and 2.) guess and check.
Looking at Mergesort as an example, we can see how the process of unrolling takes place: we must identify a pattern in the running time across the first few levels, sum the running times over all levels, and arrive at a total running time. We can see that the first level of Mergesort has a problem of size n and takes at most cn time plus all the time spent in remaining recursive calls, the next level has two problems of size n/2, taking at most cn/2, two times, so cn, etc. To identify a pattern, we see than at level j, there are now cn/2j problems 2j times, which is simply cn. We then sum over all levels of recursion and see it takes log2n times to half the input to bring it from size n to size 2 (our base case). Since we are doing cn work at every level, the total running time is O(nlogn).
This section is a little bit challenging to understand for me. I have a hard time with the concept of calculating the recurrence relation. I guess more practice might help. I would rate my understanding at a solid 6 right now.
We will now look at problems that create recursive calls on pieces of the input of size n/2 and then sum the results in linear time, O(n). If T(n) denotes the running time of an algorithm following this pattern, its recurrence relation is T(n) ≤ qT(n/2) + cn, for some constant c. We will look at the case when there are q > 2 subproblems. After unrolling the recurrence relation, so analyzing the first few levels, finding a pattern, and summing the results, we can see that any function T(•) which satisfies the pattern stated previously with q > 2 is bounded by O(nlog2(q)). If q = 1, rather than q > 2, the function will instead be bounded by O(n).
Another recurrence relation is based on the following structure:
"Divide the input into two pieces of equal size; solve the two subproblems separately by recursion; and them combine the two results into an overall solution, spending quadratic time for the initial division and final recombining."
This algorithm has the following recurrence: T(n) ≤ 2T(n/2) + cn2. The running time in this case is O(n2).
We can apply the divide-and-conquer method to many problems, including counting the number of inversions, which compares rankings and looks at those which are “out of order”. Given a list of numbers A, we say that two indices i and j, where i < j, form an inversion if ai and aj are out of order (ai > aj). The Merge-and-Count Algorithm can be used to find the number of inversions given a list which has been divided in half and sorted.Here is the algorithm:
Merge-and-Count(A, B)
Maintain a Current pointer into each list, initialized to point to the front elements
Maintain a variable Count for the number of inversions, initialized to 0
While both lists nonempty:
Let ai and bj be the elements pointed to by the Current pointer
Append the smaller of these two to the output list
If bj is the smaller element then
Increment Count by the number of elements remaining in A
Endif
Advance the Current pointer in the list from which the smaller element was selected
Endwhile
Once one list is empty, append the remainder of the other list to the output
Return Count and the merged list
Merge-and-Count takes O(n) time. I understand this section very well since we went over it in class very thoroughly. It helps to run through an interactive example. I would rate my understanding an 8.