Table of Contents
Chapter 2: Basics of Algorithm Analysis
2.1: Computational Tractability
When considering computational algorithms, two crucial resources are the algorithm's use of memory space and its running time. In many situations we will find that a balance must be struck between the two.
This section focuses on the running-time side of things. Given any problem, it is critical to find as efficient an algorithm as possible for solving it. In working toward an objective, mathematically meaningful definition of efficiency, we see that the primary focus is the worst-case scenario over all inputs of a given size. This is because the best case is often trivial and illuminates nothing, while the "average case," although it can yield insight in many cases, is difficult to define in the abstract (which distribution of inputs counts as average?) and therefore over-complicates things.
Firstly, for a given algorithm to be useful, we would expect it to perform better than a brute-force search, i.e. the consideration of every possible solution (in terms of the G-S stable matching algorithm, this would entail checking every one of the n! perfect matchings). So we land at the following definition:
- An algorithm is efficient if it has a polynomial running time, meaning:
- there exist constants c > 0 and d > 0 such that on every input instance of size N, the running time is bounded by cN^d primitive computational steps.
Put simply, if the input size increases by a constant factor (e.g. doubles), then the running time should also increase by only a constant factor: if T(n) <= cn^d, then T(2n) <= c(2n)^d = 2^d * cn^d. Examples of polynomial run-time bounds are n, n log n, n^2, and n^3. Fortunately, in practice it turns out that if an algorithm exists with an impractically high-order polynomial bound, such as n^100, then an algorithm with a much lower-order bound tends to be found as well.
One of the primary benefits of the above definition is its objectivity: it applies in exactly the same way in the abstract as in implementation. Additionally, it is negatable: it lets us express the meaningful claim that no efficient algorithm exists for a given problem.
I would rank this section a 9, it was very easy to read and very much complemented the class discussion.
2.2: Asymptotic Order of Growth
To move further into the discussion of algorithm running times, a few clarifications and simplifications are in order. First, for the most part we will consider a step in our pseudo-code algorithms to be any primitive operation: assigning a value to a variable, looking up an entry in an array, following a pointer, or performing an arithmetic operation on a fixed-size integer. Additionally, because constants and lower-order terms become inconsequential at larger inputs, we have an impetus to abstract running-time bounds to their highest-order terms.
Asymptotic Upper Bounds: O
Given a function T(n), we say that T(n) is O(f(n)) (read: T(n) is order f(n)) if, for sufficiently large n, the function T(n) is bounded above by a constant multiple of f(n). Mathematically put: there exist constants c > 0 and n0 >= 0 such that for all n >= n0, we have T(n) <= c * f(n).
Note that a function has many asymptotic upper bounds, e.g. T(n) = n is both O(n^2) and O(n^3).
Asymptotic Lower Bounds: Omega
The definition of an asymptotic lower bound mirrors that of the upper bound, except that we say T(n) is Omega(f(n)) if, for sufficiently large n, the function T(n) is bounded below by a constant multiple of f(n); that is, there exist constants ε > 0 and n0 >= 0 such that T(n) >= ε * f(n) for all n >= n0.
Asymptotically Tight Bounds: Theta
If a running time T(n) is both O(f(n)) and Omega(f(n)), then we say that it is Theta(f(n)) indicating that T(n) grows exactly like f(n) to within a constant factor.
For example, consider T(n) = pn^2 + qn + r with constants p, q, r > 0.
- T(n) <= (p + q + r)n^2 for all n >= 1, so T(n) is O(n^2),
- T(n) >= pn^2, so T(n) is Omega(n^2),
- Therefore T(n) is Theta(n^2).
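As a quick numerical sanity check (my own illustration, not from the book, with arbitrarily chosen coefficients), the ratio T(n)/n^2 settles toward the leading coefficient p as n grows:

```python
# T(n) = p*n^2 + q*n + r grows like n^2 to within a constant factor.
p, q, r = 3, 50, 1000   # arbitrary positive coefficients for the demo

def T(n):
    return p * n**2 + q * n + r

# As n grows, T(n) / n^2 approaches p, the constant hidden by Theta(n^2).
for n in (10, 100, 1_000, 10_000, 100_000):
    print(n, T(n) / n**2)
```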
Fun stuff about bounds
- Let f and g be two functions such that lim_{n -> infinity} f(n)/g(n) exists and is equal to some c > 0. Then f(n) = Theta(g(n)).
- Transitivity holds for O, Omega, and Theta. E.g. if f = O(g) and g = O(h), then f = O(h).
- Sums of functions: if f = O(h) and g = O(h), then f + g = O(h), since f(n) <= c * h(n) and g(n) <= c' * h(n) imply f(n) + g(n) <= (c + c') * h(n). Moreover, this extends to any constant number of functions f_1, ..., f_k where each f_i = O(h).
- If g = O(f), then f + g = Theta(f). I.e. f is an asymptotically tight bound for the combined function f + g.
Asymptotic bounds for common functions
- Polynomials: growth determined by the highest-order term.
- Let f be a polynomial of degree d in which the coefficient a_d > 0. Then f = O(n^d). Moreover, f = Omega(n^d) as well, so it follows that f = Theta(n^d).
- Logarithms: slowly growing; bounded above by every polynomial.
- For every b > 1 and every x > 0, we have log_b n = O(n^x).
- We can translate between bases, but the base is often left out because log_a n = (1/log_b a) * log_b n, so log_a n = Theta(log_b n); the base is irrelevant when using asymptotic notation.
- Exponentials: of the form f(n) = r^n for some constant base r.
- For every r > 1 and every d > 1, we have n^d = O(r^n). That is, every exponential grows faster than every polynomial.
- For the most part, an exponential running time renders an algorithm useless for all but the smallest inputs.
This was a nice section, also a 9 out of 10. Very readable, and I particularly like how they are getting down into the mathematical foundations a bit more now.
2.3: Implementation of Stable Matching with Lists and Arrays
Our previous analysis of the G-S Stable Matching Algorithm was at a relatively high level of abstraction: we found the O(n^2) bound based simply on the maximum number of iterations of the While loop. To dive deeper into the actual number of computational steps of any algorithm, we must consider the actual implementation, primarily what data structures we will use and how they will interact. Here, we will discuss an implementation of stable matching using just lists and arrays. First, the pros and cons of the two are weighed, and their advantages in certain situations considered (omitting copying that down here; at this point in the CSCI minor it's all pretty permanently ingrained).
Implementing
Importantly, because we know the While loop can run n^2 times in the worst case, for the implementation to run in time proportional to that bound, we need each iteration to run in constant time. For simplicity, each man and each woman is associated with a number from 1 to n at the outset. We use two arrays to represent the preference lists of the men and the women. Note the space required for this is O(n^2), because each person must rank the n members of the opposite sex. Now, we proceed to find a way to do each of the following in constant time:
- Identify a free man (maintain the set of free men as a linked list; then access to the first element, deletions, and additions are all constant time)
- For man m, identify the highest-ranked woman to whom he hasn't yet proposed (a single array with one entry for each man; each value initialized to 1, then updated to always point at the position of the woman to whom he should propose next)
- For woman w, check whether she is engaged, and if so, who her current partner is (similar to above: a single array with an entry for each woman; values initialized to a null symbol, then updated to the ID of the man to whom she is engaged)
- For woman w, determine whether she prefers m or m' (at the start of the algorithm, create an n x n array containing a row for each woman, indexed by the men's IDs; the value at each index is the numerical ranking that the woman gave to the corresponding man; then, given two men, we can look up their respective rankings and compare them in constant time)
The last 2-d array is essentially the preference lists of the women, but indexed by the men's IDs rather than by ranking. Note that we can create it at the outset with a single pass through the women's preference lists, which is O(n^2) and so matches the space we already committed to. With the above structures, we implement the whole algorithm in O(n^2) time; a sketch follows below.
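To make the bookkeeping concrete, here is a minimal Python sketch of the implementation just described. It is my own rendering, not the book's pseudocode: people are numbered 0 to n-1 rather than 1 to n for array convenience, and the names (next_proposal, current_partner, rank) are invented for illustration.

```python
from collections import deque

def stable_matching(man_pref, woman_pref):
    """Gale-Shapley using the list/array structures described above.

    man_pref[m]   -- list of women in man m's preference order
    woman_pref[w] -- list of men in woman w's preference order
    People are numbered 0..n-1; returns a dict mapping woman -> man.
    """
    n = len(man_pref)

    # O(n^2) preprocessing: rank[w][m] = position of man m on w's list,
    # so "does w prefer m to m'?" becomes one constant-time comparison.
    rank = [[0] * n for _ in range(n)]
    for w in range(n):
        for position, m in enumerate(woman_pref[w]):
            rank[w][m] = position

    free_men = deque(range(n))       # plays the linked-list role: O(1) ops
    next_proposal = [0] * n          # next index into man_pref[m] for each m
    current_partner = [None] * n     # current_partner[w]: her fiance, or None

    while free_men:                  # at most n^2 iterations overall
        m = free_men[0]
        w = man_pref[m][next_proposal[m]]
        next_proposal[m] += 1
        if current_partner[w] is None:
            current_partner[w] = m               # w was free: engage
            free_men.popleft()
        elif rank[w][m] < rank[w][current_partner[w]]:
            free_men.popleft()                   # w trades up to m...
            free_men.append(current_partner[w])  # ...freeing her old partner
            current_partner[w] = m
        # else: w rejects m, who stays free and proposes again next round

    return {w: m for w, m in enumerate(current_partner)}
```

Each arm of the While loop touches only these arrays, so every iteration is O(1), keeping the whole run within the O(n^2) iteration bound.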
I found this section very readable, a 9/10. Their use of monospacing in referring to data structures and the values within them is incredibly helpful in understanding what they are trying to convey.
2.4: Survey of Common Running Times
Over the course of analyzing algorithms, a number of running times come up quite frequently, such as O(n), O(n^2), and O(n log n). This is not a coincidence, as these tend to originate from very common algorithm designs and structures. We will go over these in the coming section. Additionally, note that a good starting point in much of algorithm analysis is the natural search space: the set of all possible solutions. Thus, the point of algorithm design is to find algorithms which perform better than brute force searching our way through all of them.
Linear Time
Algorithms that run in O(n) time have running times that are at most a constant factor times the input size. Many of these end up being "one-pass" algorithms, in which we compute what we need while passing through the input once; for example, computing the maximum of a list of numbers.
However, as we learned in class, linear-time algorithms can be more nuanced, as in the case of merging two sorted lists. Here, we repeatedly compare the smallest remaining item of each list and remove the smaller of the two, appending it to the end of a third, merged list. The important part of proving this linear is noting that each comparison adds one of the 2n total items (supposing two lists of length n) to the new list, so there are only 2n iterations in total.
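A minimal Python sketch of that merge (my own rendering, not the book's pseudocode):

```python
def merge(a, b):
    """Merge two sorted lists in O(len(a) + len(b)) time.

    Each loop iteration appends exactly one of the 2n items to the
    result, so there are at most 2n iterations in total.
    """
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:          # compare the two front items...
            merged.append(a[i])   # ...and move the smaller one over
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])          # at most one of these is non-empty
    merged.extend(b[j:])
    return merged
```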
"Linearithmic" Time
Running time of O(n log n) is also very common. Specifically, it is the running time of Mergesort and other good sorting algorithms. The key property that gives rise to this time is splitting the input in two, solving the problem recursively on each half, and combining the results in linear time. We also saw in class how this is the running time for finding the largest time interval between random time stamps in an input: it costs O(n log n) to sort the times, then a single linear pass finds the maximum interval. So it ends up being O(n log n + n) = O(n log n).
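A sketch of that timestamp example in Python (my own code; assumes an unsorted list of at least two numeric timestamps):

```python
def largest_gap(timestamps):
    """Largest interval between consecutive timestamps; assumes at
    least two entries. Sorting is O(n log n), the scan is O(n)."""
    ts = sorted(timestamps)
    best = 0
    for earlier, later in zip(ts, ts[1:]):   # one linear pass
        best = max(best, later - earlier)
    return best
```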
Quadratic Time
Quadratic time arises naturally from the Closest-Pair Problem: finding the closest pair of points among many in a given plane. The brute-force approach simply traverses the full natural search space: search over all pairs of input items (Theta(n^2) pairs in total), spending constant time per pair. We will go over a more efficient algorithm for this problem in Ch. 5, though.
More generally, quadratic time arises from two nested loops that each iterate O(n) times.
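A sketch of the brute-force search (my own code, using Euclidean distance in the plane):

```python
from math import dist, inf

def closest_pair(points):
    """Brute force over all Theta(n^2) pairs, O(1) work per pair."""
    best, best_pair = inf, None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):   # the two nested O(n) loops
            d = dist(points[i], points[j])
            if d < best:
                best, best_pair = d, (points[i], points[j])
    return best_pair
```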
Cubic and Higher Order Polynomial Time
We get higher-order polynomial times from similar situations. For example, we arrive at cubic O(n^3) time by looking through n subsets of an n-element set to see whether any pair of them is disjoint. This is simply a triple of nested loops: for each of the O(n^2) pairs of sets, spend O(n) time searching for a common element. The same cubic time would be attained by searching a set of planar points for the closest triple.
Both examples generalize quickly to O(n^k) time, which occurs whenever we search over all subsets of size k (e.g. instead of looking for pairs or triplets, we look for "k-tuples" of the n planar points). A noteworthy example of this is the Independent Set problem: given a graph of n nodes, does there exist an independent set of k nodes? This simply requires searching all combinations of k nodes and checking whether each is independent (no two of its nodes joined by an edge).
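A sketch of that brute-force check (my own code; the function name and the edge-set representation are illustrative choices):

```python
from itertools import combinations

def has_independent_set(nodes, edges, k):
    """Search all C(n, k) = O(n^k) size-k subsets; checking one subset
    costs O(k^2) set lookups, a constant for fixed k."""
    edge_set = {frozenset(e) for e in edges}
    for subset in combinations(nodes, k):
        if all(frozenset(pair) not in edge_set
               for pair in combinations(subset, 2)):
            return True
    return False
```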
Beyond Polynomial Time
The Independent Set problem gets much harder if we instead try to find an independent set of maximum size. In this case we arrive at exponential O(2^n) time, which shows up naturally whenever all subsets of an n-element set must be considered.
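A sketch of the exhaustive version (again my own code, reusing the same graph representation):

```python
from itertools import combinations

def max_independent_set(nodes, edges):
    """Consider all 2^n subsets, largest sizes first, and return the
    first independent one found; the worst case still touches them all."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(nodes), -1, -1):
        for subset in combinations(nodes, size):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                return subset
    return ()
```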
Lastly, the fastest-growing running time we consider is O(n!). Factorial time arises from the natural search space of matching n items with n others (e.g. the Stable Matching search space), as well as from all possible ways of arranging n items in order. An example of the latter is the famous Traveling Salesman Problem.
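A brute-force Traveling Salesman sketch (my own code; fixing the starting point trims the n! orderings to (n - 1)!, which is still factorial):

```python
from itertools import permutations
from math import dist

def shortest_tour(points):
    """Try every ordering of the points after fixing the start;
    (n - 1)! tours, each scored in O(n) time."""
    start, rest = points[0], points[1:]

    def tour_length(order):
        tour = [start, *order, start]         # return to the start
        return sum(dist(a, b) for a, b in zip(tour, tour[1:]))

    best = min(permutations(rest), key=tour_length)
    return [start, *best, start]
```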
Sublinear Time
Back into comfortable run times, the most common sublinear time is the logarithmic O(log n) time, of which the most famous example is the binary search. Crucially, these typically require a certain pre-existing knowledge of the data set, just as the binary search requires that the data set be sorted ahead of time. So, algorithms like this can often require preprocessing.
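For reference, a minimal binary search in Python (my own sketch of the classic algorithm):

```python
def binary_search(sorted_items, target):
    """Classic O(log n) search; requires sorted input, which is the
    preprocessing mentioned above. Returns an index or -1."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2      # halve the search range each step
        if sorted_items[mid] == target:
            return mid
        elif sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return -1                     # not present
```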
This was a good section, 9/10. Definitely interesting and explains everything quite well.
2.5: A More Complex Data Structure: Priority Queues
As we know, a priority queue (PQ) is a structure that maintains a set of elements, each of which has an associated numeric value, or key; the smaller the key, the higher the priority. PQs support insertion, deletion, and selection of the element with the smallest key. As is shown in the book, a sequence of O(n) priority queue operations can be used to sort a list of n numbers. So, if we can get each PQ operation to run in O(log n) time, then we will have an implementation of sorting in O(n log n) time.
This PQ implementation takes the form of a heap. A heap is a balanced binary tree whose keys are in heap order, meaning that the values of each node's children are greater than or equal to the parent node's value. An easy implementation of this is an array, where we call the first index 1 (the root of the heap), and then define for each node i its left child at position 2i and its right child at 2i + 1.
To implement the heap operations, we define Heapify-up and Heapify-down. The former operates on a given node by checking whether its parent's value is greater than its own, and if so, swapping the two; this repeats recursively until heap order is restored or the value reaches the root. Heapify-down works similarly, checking whether the given node's value is greater than either of its children's values, and if so, swapping it with the smaller of the children, then recursing downward.

Both of these keep the heap ordered and balanced upon insertion and deletion in O(log n) time, thanks to a particular bit of cleverness: we only ever add to or remove from the very last spot in the array. To delete from an arbitrary position, say somewhere on the left side of the heap, we swap the value from the last spot into that position, shrink the array by one, and then let the moved value Heapify-up or Heapify-down to its rightful place.
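A compact Python rendering of the heap just described (a sketch, not the book's code, and 0-indexed, so the children of node i sit at 2i + 1 and 2i + 2 rather than 2i and 2i + 1):

```python
class MinHeap:
    """Array-backed binary min-heap, as described above."""

    def __init__(self):
        self.a = []

    def insert(self, key):
        # Add at the last spot, then let it heapify-up: O(log n).
        self.a.append(key)
        self._heapify_up(len(self.a) - 1)

    def find_min(self):
        return self.a[0]          # root of the heap: O(1)

    def extract_min(self):
        # Swap the last element into the root, shrink, heapify-down.
        root = self.a[0]
        last = self.a.pop()
        if self.a:
            self.a[0] = last
            self._heapify_down(0)
        return root

    def _heapify_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.a[i] >= self.a[parent]:
                break             # heap order restored
            self.a[i], self.a[parent] = self.a[parent], self.a[i]
            i = parent

    def _heapify_down(self, i):
        n = len(self.a)
        while True:
            smallest = i
            for child in (2 * i + 1, 2 * i + 2):
                if child < n and self.a[child] < self.a[smallest]:
                    smallest = child
            if smallest == i:
                break             # both children are >= this node
            self.a[i], self.a[smallest] = self.a[smallest], self.a[i]
            i = smallest
```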
Implementing PQs with Heaps
Initializing the PQ takes at most O(N) time, where N is the fixed size of the underlying array (and hence an upper bound on the number of elements). Due to the above Heapify operations, we can then insert or delete any element in O(log n) time. Finding the minimum value takes O(1) time, because it is simply the root of the heap, always the first element in our array. Extracting the minimum thus means reading it in O(1) and then deleting it, which takes O(log n) time.
So, as intended, we have found an implementation of the PQ in which each operation is O(log n), which we can then use to implement a sort in O(n log n) time.
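Using Python's built-in heapq as the heap (rather than the hand-rolled version above), the promised sort is a few lines: n O(log n) inserts followed by n O(log n) extract-mins.

```python
import heapq

def pq_sort(items):
    """Sort a list via n inserts and n extract-mins: O(n log n)."""
    heap = []
    for x in items:
        heapq.heappush(heap, x)                  # insert: O(log n)
    return [heapq.heappop(heap) for _ in items]  # extract-min: O(log n)
```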
This was a good section which complemented class very well. They seemed to get a little more caught up in notation, which led to some confusion at times, but still an 8/10 in my book.
