Chapter 5.3: Counting Inversions
Consider comparing two rankings of the same set of n movies. A natural way to quantify this notion is by counting the number of inversions. We say that two indices i < j form an inversion if a(i) > a(j), that is, if the two elements a(i) and a(j) are “out of order.”
Designing/Analyzing the Algorithm
We can count the number of inversions in O(nlogn) time since there can be a quadratic number of inversions and therefore there must be such an algorithm to compute the total number without ever looking at each inversion individually. We split the list into two pieces and first count the number of inversions of each of these two halves. Then we count the number of inversions (a(i), a(j)), where the two numbers belong to different halves.
Now the hard part. Merge-and-Count is the process we use once we have recursively sorted the first and second halves of the list and counted the inversions in each. We now have two sorted lists A and B, and we want to produce a single sorted list C from their union, while also counting the number of pairs (a,b) with a in A, b in B, and a > b. Merging is very similar to mergesort, however we also need to count the inversions. Every time the element a(i) is appended to C, no new inversions are encountered, since a(i) is smaller than everything left in list B, and it comes before all of them . On the other hand, if b(j) is appended to list C, then it is smaller than all the remaining items in A, and it comes after all of them, so we increase our count of the number of inversions by the number of elements remaining in A.
Here is the algorithm in full:
Merge-and-Count(A,B) Maintain a //Current// pointer into each list, initialized to point to the front elements Maintain a variable //Count// for the number of inversions = 0 While both lists are nonempty: Let a(i) and j(j) be the elements pointed to by the //Current// pointer Append the smaller of these two two the output list If b(j) is the smaller element then Increment //Count// by the number of elements remaining in A Advance the //Current// pointer in the list from which the smaller element was selected Once one list is empty, append the remainder of the other list to the output Return //Count// and the merged list
This runtime is O(n).
The whole algorithm is as follows:
Sort-and-Count(L) If the list has one element then there are no inversions Else Divide the list into two halves: A contains the first [n/2] elements B contains the remaining [n/2] elements (r(a), A) = Sort-and-Count(A) (r(b), B) = Sort-and-Count(B) (r, L) = Merge-and-Count(A,B) Return r = r(a) + r(b) + r, and the sorted list L
Since merge-and-count takes O(n) the running time of the full Sort-and-Count takes O(nlogn) due to its sorting.
This is a very easily understood chapter and a very useful algorithm. 10/10.