====== Chapter 2: Algorithm Analysis ======

===== 2.1 Computational Tractability =====

"[Many of the problems we study] will involve an implicit search over a large set of combinatorial possibilities, and the goal will be to efficiently find a solution that satisfies certain clearly delineated conditions."

This section reads like a Socratic dialogue: it attempts to define something, refutes its own proposed definitions, and (unlike a Socratic dialogue) finally agrees on a definition. An algorithm is considered "efficient" if it has a polynomial running time.

===== 2.2 Asymptotic Order of Growth =====

As input size grows, the worst-case running time grows at a rate proportional to some function f(n). This function becomes a BOUND on the running time of the algorithm. There are three types of asymptotic bounds: upper bounds (big-O), lower bounds (Omega), and tight bounds (Theta). Asymptotic growth rates have the properties of additivity and transitivity.

Some propositions/proofs are given about common types of functions (polynomials, logarithms, and exponentials): a polynomial of degree d is O(n^d), every logarithm log_b n is O(n^x) for any x > 0, and every polynomial n^d is O(r^n) for any r > 1. In other words, logarithms grow more slowly than polynomials, which grow more slowly than exponentials.

===== 2.3 Implementing the Stable Matching Algorithm Using Lists and Arrays =====

The Gale-Shapley Stable Matching algorithm terminates in at most n^2 iterations (we learned this earlier). It would be nice to construct an implementation that runs in O(n^2) time; for this to work, every iteration has to be implemented in constant time. It is important for programmers to consider the data structures they will be using; they are chosen for efficiency and for ease of implementation. The G-S algorithm can be implemented using arrays and linked lists. A section is dedicated to an overview of these data structures (including their advantages and disadvantages), but we know about them by now.

===== 2.4 A Survey of Common Running Times =====

Introduces the notion that "a very small number of distinct reasons" are responsible for the recurrence of familiar runtime bounds -- like O(n) and O(n log n) -- in algorithm analysis. According to the authors, it is a long-term goal to be able to recognize "these common styles of analysis."

It may be helpful to contrast an algorithm's current runtime bound with the bound of some brute-force algorithm. You are constantly trying to improve your algorithm; referring to the efficiency of a brute-force approach might help you get your bearings and familiarize yourself with the "landscape" of the particular problem at hand.

The familiar runtime bounds described in this section are as follows:
  * Linear time
  * O(n log n) time
  * Quadratic time
  * Cubic time
  * O(n^k) time
  * Beyond polynomial time (e.g., 2^n and n!)
  * Sublinear time

I am not great at determining the better bounds when you get into factorials and n^k's (confusing!). I was wondering: say one algorithm (A) is super effective for n < 100 but awful for n >= 100, and another algorithm (B) is average: worse than A for n < 100 but better than A for n >= 100. Can you check the input size at the beginning, run algorithm A for n < 100, and algorithm B for n >= 100? Is this ever the case? (My guess would be that sometimes you can't know the input size, so assuming it will be large is appropriate.) See the sketch at the end of these notes.

===== 2.5 A More Complex Data Structure: Priority Queues =====

Priority queues store elements with priority values. Removing from a priority queue returns the element with the highest priority. Priority queues can be useful tools for scheduling problems, like scheduling the processes on a computer.
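Not from the book, but here's a minimal sketch of the scheduling idea using Python's built-in heapq module (a min-heap, so the smallest priority value counts as "highest priority"); the job names and priority numbers are made up for illustration.

<code python>
import heapq

# A toy scheduler: each entry is (priority, job name). heapq is a min-heap,
# so the entry with the smallest priority value comes out first.
pending = []
heapq.heappush(pending, (3, "reindex search"))
heapq.heappush(pending, (1, "handle interrupt"))
heapq.heappush(pending, (2, "flush logs"))

while pending:
    priority, job = heapq.heappop(pending)  # removes the highest-priority (smallest value) job
    print(priority, job)
# Prints: 1 handle interrupt / 2 flush logs / 3 reindex search
</code>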
Proposition 2.11 says a sequence of O(n) priority queue operations can be used to sort n numbers in O(n log n) time: it takes n operations to insert the numbers and n more to extract them (2n operations total), and each operation on a heap takes O(log n) time. The section then explains the processes "heapify-up" and "heapify-down," which we received handouts for. The underlying heap structure of priority queues allows insertion and deletion to occur in O(log n) time.
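Here's my own minimal sketch (not the book's code) of an array-backed min-heap with heapify-up and heapify-down, plus a small sort built from n inserts and n extract-mins, matching the O(n log n) bound in Proposition 2.11. The names MinHeap, insert, and extract_min are just my labels.

<code python>
class MinHeap:
    """Array-backed min-heap: the smallest key sits at index 0."""

    def __init__(self):
        self.h = []

    def insert(self, key):
        self.h.append(key)
        self._heapify_up(len(self.h) - 1)      # O(log n)

    def extract_min(self):
        self.h[0], self.h[-1] = self.h[-1], self.h[0]
        smallest = self.h.pop()
        if self.h:
            self._heapify_down(0)              # O(log n)
        return smallest

    def _heapify_up(self, i):
        # Swap the new element with its parent until heap order is restored.
        while i > 0:
            parent = (i - 1) // 2
            if self.h[i] < self.h[parent]:
                self.h[i], self.h[parent] = self.h[parent], self.h[i]
                i = parent
            else:
                break

    def _heapify_down(self, i):
        # Swap the element with its smaller child until heap order is restored.
        n = len(self.h)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < n and self.h[left] < self.h[smallest]:
                smallest = left
            if right < n and self.h[right] < self.h[smallest]:
                smallest = right
            if smallest == i:
                break
            self.h[i], self.h[smallest] = self.h[smallest], self.h[i]
            i = smallest


def heap_sort(nums):
    # n inserts + n extract-mins, each O(log n): O(n log n) overall (Proposition 2.11).
    pq = MinHeap()
    for x in nums:
        pq.insert(x)
    return [pq.extract_min() for _ in nums]


print(heap_sort([5, 2, 9, 1, 7]))  # [1, 2, 5, 7, 9]
</code>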
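And the sketch promised back in 2.4: if the input size is known when the call is made, you can dispatch on it with a threshold. The cutoff of 100 and the functions algo_a / algo_b below are placeholders, not anything from the book; library sort routines do something similar by falling back to a simpler algorithm for very small inputs, but the worst-case asymptotic bound is still determined by whichever algorithm handles large n.

<code python>
# Hypothetical sketch of the idea from my question in 2.4: choose the algorithm
# based on the input size, when that size is known up front.

THRESHOLD = 100  # made-up cutoff

def algo_a(data):
    """Stand-in for the algorithm that is fast only for small inputs."""
    return sorted(data)

def algo_b(data):
    """Stand-in for the algorithm that scales better for large inputs."""
    return sorted(data)

def solve(data):
    # Dispatch on len(data); the asymptotic bound of solve() is that of algo_b,
    # since big-O only describes behavior as n grows without bound.
    if len(data) < THRESHOLD:
        return algo_a(data)
    return algo_b(data)
</code>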