Summary: Section 2.5 is A More Complex Data Structure: Priority Queues. The authors describe the priority queue as “one of the most broadly useful sophisticated data structures” and “a data structure that, unlike lists and arrays, must perform some nontrivial processing each time it is invoked” (p. 57). A priority queue maintains a set S of elements, where each element has an associated key that gives the element's priority; the smaller the key, the higher the priority. The authors state that “O(log n) time per operation is the best we can hope for” with priority queues, leaving the specifics for later in the section. They give a short survey of ways to implement a priority queue, but conclude that a heap is the best option because each of the other options has at least one operation that takes O(n) time. A heap is a balanced binary tree in which every node's key is greater than or equal to its parent's key, so the root always has the smallest key and therefore the highest priority. According to the authors, there are two ways to implement the heap: with pointers, where each node stores the element, its key, and pointers to its parent and two children; or with an array, if we know ahead of time the maximum number of elements that'll be in the heap.
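To make the array implementation concrete for myself, here is a small sketch (my own, not the book's code; I'm using 0-indexed Python lists, whereas the book uses 1-indexed positions where parent(i) = i/2 and the children sit at 2i and 2i + 1):

```python
# Array-based heap positions, 0-indexed. The book's 1-indexed formulas
# (parent i/2, children 2i and 2i+1) shift by one in this convention.

def parent(i):
    return (i - 1) // 2

def left(i):
    return 2 * i + 1

def right(i):
    return 2 * i + 2

# Example array of keys: the children of position 0 (key 2) sit at
# positions 1 and 2 (keys 5 and 9), and every child's key is >= its
# parent's key, so the heap order holds.
H = [2, 5, 9, 11, 7]
assert all(H[parent(i)] <= H[i] for i in range(1, len(H)))
```

The nice part is that no pointers are stored at all; the parent/child relationships are pure index arithmetic.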
Problem Motivations: The motivation given for using a priority queue comes out of the need to maintain a “dynamically changing set S” and to be able to add elements, delete elements, and get to the element with the highest priority. The authors provide an example of such a situation: scheduling computer processes when they all have a priority but do not arrive in priority order.
About the Algorithms: There are two heap algorithms in this section, and they handle adding and deleting elements. “Heapify-up” helps with adding elements (and with deleting, as I'll explain momentarily). You add the new element in the last open position, but it can then violate the heap order if its key is smaller than its parent's key. If so, you swap the two and call Heapify-up recursively at the node's new position. The recursion ends when the key is no longer smaller than its parent's key, or when the element reaches the root. The other algorithm, “Heapify-down,” helps with deleting elements. When you delete the element at position i, you take the element in the last filled position and move it to position i. The heap order can then be violated in two ways: the moved element's key is smaller than its parent's key (in which case you Heapify-up), or it is larger than one of its children's keys. In the latter case you Heapify-down, which swaps the element with the child whose key is smaller and calls Heapify-down recursively at the element's new position. The recursion ends when the key is no longer larger than either child's key, or when the element is a leaf. Both adding elements to and deleting elements from a priority queue with a heap structure take O(log n) time, because the balanced binary structure of the heap keeps its depth at O(log n).
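The two procedures described above can be sketched like this (my own version, assuming a 0-indexed Python list of bare keys rather than the book's 1-indexed array of elements; it follows the same logic but is not the book's pseudocode):

```python
# Sketch of Heapify-up / Heapify-down on a 0-indexed list of keys.
# Smaller key = higher priority, so the minimum ends up at the root.

def heapify_up(H, i):
    # Swap upward while this key is smaller than its parent's key.
    while i > 0 and H[i] < H[(i - 1) // 2]:
        p = (i - 1) // 2
        H[i], H[p] = H[p], H[i]
        i = p

def heapify_down(H, i):
    # Swap downward with the smaller-keyed child until the heap order
    # is restored or we reach a leaf.
    n = len(H)
    while True:
        left, right = 2 * i + 1, 2 * i + 2
        smallest = i
        if left < n and H[left] < H[smallest]:
            smallest = left
        if right < n and H[right] < H[smallest]:
            smallest = right
        if smallest == i:   # heap order holds, or i is a leaf
            return
        H[i], H[smallest] = H[smallest], H[i]
        i = smallest

def insert(H, key):
    H.append(key)              # add at the last open position...
    heapify_up(H, len(H) - 1)  # ...then repair any violation upward

def delete_at(H, i):
    H[i] = H[-1]               # move the last element into position i
    H.pop()
    if i < len(H):
        heapify_up(H, i)       # moved key may be smaller than its parent
        heapify_down(H, i)     # ...or larger than a child's key
```

For example, inserting the keys 5, 3, 8, 1 in that order leaves 1 at the root, and deleting the root then brings 3 to the top. Since both loops only move along a root-to-leaf path, each operation does O(log n) swaps.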
My Questions: I'm still, after class and reading, confused about the difference between w = H[j], key(w), and j. I thought i and j represent positions in the heap. The book seems to use both w and key(w) to represent the elements, which are the values in the nodes' circles. Is it that elements are more complex than I'm thinking, and the figures with tree representations do not show their values, but only the keys in the circles of the nodes?
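One small sketch that helped me keep the three things apart (my own illustration, not from the book; the Element class and the "proc" names are made up): an element is an object that carries a key plus whatever value it represents, j is an array position, w = H[j] is the element stored at that position, and key(w) is just the priority field inside w.

```python
# My own illustration: positions vs. elements vs. keys.

class Element:
    def __init__(self, name, key):
        self.name = name  # the element's value (e.g., a process to run)
        self.key = key    # its priority; smaller key = higher priority

def key(w):
    return w.key

# A heap array of elements (heap order on the keys: 2 <= 5, 2 <= 9).
H = [Element("proc-a", 2), Element("proc-b", 5), Element("proc-c", 9)]

j = 1        # j is a position (an index into H)
w = H[j]     # w is the element stored at position j
assert w.name == "proc-b"
assert key(w) == 5
```

If the circles in the book's figures show only key(w) for each node, then the elements really are richer objects whose values simply aren't drawn.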
Second Time Around: Unfortunately, rereading this section didn't make much more of it click. I think the chart on slide 27 or 28 from Monday 1/22's class might clear things up, if the answer to my question above is “yes” (“…the figures with tree representations do not show their values…?”).
Note to Self: I didn't know before that you could do a proof by reverse induction (p64).
Readability: 5, because I'm still confused, and I shouldn't still be confused after class + reading.