Table of Contents
Chapter 2: Basics
Section 1: Computational Tractability
- Summary:
This section is about the importance of finding efficient algorithms and what it means for an algorithm to be efficient. The book focuses on time efficiency, but notes that efficient use of space (memory) is also important. We want a definition of efficiency that is platform-independent, instance-independent, and able to predict how the running time changes as input sizes grow. When we analyze the running time, we'll look at worst-case running times, because best-case and average-case don't really tell us much. An initial comparison to decide how "good" an algorithm's running time is can be made against the running time of the "brute-force" search for solutions (which checks every possible solution to find a correct one). For example, with our stable matching problem there are N! possible perfect matchings (everyone paired up, but not necessarily stably) for N men and N women, but our algorithm should be able to run in a time proportional to N. According to definitions from the 1960s, a good algorithm's running time grows by at most a constant factor when the input size is increased multiplicatively (say, doubled). Algorithms with polynomial running times achieve this. This mathematical definition is the best we have, even though it has a few issues: a polynomial running time with a really, really big constant multiplier isn't so great after all, and some algorithms with exponential worst-case running times turn out to work pretty well in almost all cases.
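Here's a tiny sketch (mine, not the book's) of what that definition means in practice: if you double the input size, a polynomial running time grows by a constant factor, while an exponential one blows up.

```python
# Rough sketch (mine, not the book's): when the input size doubles, a
# polynomial running time grows by roughly a constant factor, while an
# exponential running time grows by a factor that itself depends on n.

def ratio_when_doubled(t, n):
    """Return t(2n) / t(n) for a hypothetical running-time function t."""
    return t(2 * n) / t(n)

poly = lambda n: n ** 2      # polynomial: the ratio stays around 4
expo = lambda n: 2.0 ** n    # exponential: the ratio is 2^n, so it explodes

for n in (10, 20, 40):
    print(n, ratio_when_doubled(poly, n), ratio_when_doubled(expo, n))
```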
- Insights and Questions:
Wait, I thought our algorithm for the stable matching problem took O(N^2) time? [I don't really have any other questions or insights - do I need to come up with more even when I don't have them?] I found the discussion of why average-case running times aren't actually useful to be particularly interesting. It reminded me of a lot of what we've discussed regarding our research … that random testing doesn't actually get at what we want to know so much as at how the random test cases were generated (kind of).
- Readability:
Again, this section was very readable. I thought they did a great job of explaining why we need the mathematical definition of efficiency and how we get to it. And I read this after our classes, and I think that was good for me, because I already knew what to expect.
- Sara Says:
"Wait, I thought our algorithm for the stable matching problem took O(N^2) time?" –> Yes, that's correct. They are saying the simplest (brute-force) algorithm is "go through all N! possible perfect matchings and find one that is stable; that's the solution." So that's O(N!). But Gale-Shapley is a much more efficient algorithm (it runs in O(N^2) time).
Section 2: Asymptotic Order of Growth
- Summary:
Our mathematical definition of efficiency gives us a way to count the number of "steps" an algorithm takes in its worst case. So we might end up with 1.62n^2 + 3.5n + 8 "steps" (the book's example). But this isn't really useful: the "steps" translate into different numbers of machine-level steps depending on exactly what's being done (even though we define a "step" to be something that takes constant time, i.e. a constant number of machine-level steps), and it's also too detailed/precise. What we really want is to be able to say that a function like this grows like n^2 and ignore all of the constants and lower-order terms. Big O, Big Omega, and Big Theta notation allow us to do this. The book formally defines them (as we did in class). Big O gives a function's asymptotic upper bounds (for all n above some cutoff, a constant multiple of the Big O function is at least as big as the original function), Big Omega gives its asymptotic lower bounds, and Big Theta is a tight bound (both at once). These are transitive: if f=O(g) and g=O(h), then f=O(h), and the same holds for Omega and Theta; but note that f=O(g) is just notation, not a real equality. Sums work too: if f=O(h) and g=O(h), then f+g=O(h). The book gives proofs for these, and we did brief proofs in class, so I won't go into them. A polynomial's Big Theta is n raised to its highest degree. You don't need to specify a base for logarithms, since switching bases only changes the value by a constant factor. Logarithms grow more slowly than polynomials, and polynomials grow more slowly than exponentials.
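A quick sketch (mine, not the book's) of why the constants and lower-order terms in the book's example don't matter: the ratio f(n)/n^2 settles down to a constant, which is exactly what saying f is Theta(n^2) captures.

```python
# Sketch (mine, not the book's): the book's example step count grows like n^2.
# The ratio f(n) / n^2 approaches the leading constant 1.62, so the lower-order
# terms and constants are exactly what the asymptotic notation throws away.

def f(n):
    return 1.62 * n**2 + 3.5 * n + 8

for n in (10, 100, 1000, 10000):
    print(n, f(n) / n**2)   # tends toward 1.62 as n grows
```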
- Insights and Questions:
None, really. These were explained well in class, this was a good review, and it's all pretty math-y, which makes it clear.
- Readability:
This section was pretty readable, but I just skimmed over the proofs (I might revisit them later), because I felt that I had a good understanding of them from what we've discussed in class. Obviously, that kind of math is harder to read in general, but I think they're presented in a really clear format.
Section 3: Implementing the Stable Matching Algorithm Using Lists and Arrays
- Summary:
You don't have to actually program to be able to analyze the running time of algorithms, but you do need to be able to think about what data structures you would/could use (and how they behave). Some data structures will be better than others for different algorithms. Data structure choice is up to the designer! Sometimes it's important to consider the preprocessing of input. Arrays vs. Lists → The book goes into the running times of various list/array operations. That information (in a more compact form) is also in our class notes. Important note: easy to convert between lists and arrays (O(n) time), so they can be used together! For the G-S Algorithm for the stable matching problem, we said earlier that it should complete in O(n^2) time, BUT in order for this to work, every step has to complete in constant time. The book goes into how to do this (i.e. what data structures to use to represent what and why). We did that pretty extensively in class so see notes or p. 45-47 for that info.
- Insights and Questions:
Linked lists to represent the men's preference lists / proposed-to lists in the stable matching problem work, too (we talked about that in class). Is it always possible to make every "step" happen in constant time? We had to use some not-so-intuitive data structures (or use them in not-intuitive ways) to get this to work for this problem. How do you design pseudocode algorithms with this in mind? How/when do you decide for sure that you're not going to be able to make something happen in constant time?
- Readability:
Awesome! It made a great review of what we did in class!
Section 4: A Survey of Common Running Times
- Summary:
Particular styles of approaches to a problem correspond/lead to certain common running times. First we get a sense of what the brute-force solution to the problem would take (which is based on the size of the "search space"). The rest of the book concentrates on improving upon the brute-force solution.
Linear (O(n)) Time: Comes from a "one-pass" style of processing the input, where each element receives a constant amount of time/attention. Ex. computing the maximum or merging two sorted lists (the book goes into these algorithms, but I think they're pretty easy, and they're in our class notes).
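Quick sketch of the merging example (mine; the book and our notes have the same idea in pseudocode):

```python
# Linear-time merge of two sorted lists: one pass, constant work per element.
def merge(a, b):
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    out.extend(a[i:])    # at most one of these leftovers is non-empty
    out.extend(b[j:])
    return out
```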
O(nlogn) Time: You get O(nlogn) time from algorithms that split the input into two equal-sized pieces, solve each piece recursively, and combine the two solutions in linear time. Ex. sorting (especially mergesort). Algorithms whose most expensive step is sorting the input also have this running time (ex. problems where you first need to sort the input and THEN do something that takes O(n) time, or constant time, or anything less than what it took to sort, still have an O(nlogn) running time).
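Mergesort itself is just that recipe, reusing the merge sketch above (again my code, not the book's pseudocode):

```python
# Mergesort: split in half, sort each half recursively, merge in linear time.
# The recurrence T(n) = 2T(n/2) + O(n) works out to O(n log n).
def mergesort(a):
    if len(a) <= 1:
        return a
    mid = len(a) // 2
    return merge(mergesort(a[:mid]), mergesort(a[mid:]))   # merge() from above
```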
Quadratic Time: Quadratic time comes from a pair of nested loops. The outer loop runs O(n) times, each time starting an inner loop that runs in O(n) time, so the total running time is O(n^2). Ex. finding the closest pair of points in the plane using a brute-force algorithm: you look at all of the pairs of points (O(n^2) of them) and compute the distance between each pair (constant time per pair). That's not the best way to do that problem, though; later the book will explain how to do it in O(nlogn) and even O(n) time.
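A straightforward sketch of that brute-force version (mine, not the book's):

```python
# Brute-force closest pair of points: all O(n^2) pairs, constant work per pair.
from math import hypot

def closest_pair_brute_force(points):
    best_dist, best_pair = None, None
    for i in range(len(points)):
        for j in range(i + 1, len(points)):        # the nested O(n) loops
            (x1, y1), (x2, y2) = points[i], points[j]
            d = hypot(x1 - x2, y1 - y2)
            if best_dist is None or d < best_dist:
                best_dist, best_pair = d, (points[i], points[j])
    return best_pair, best_dist
```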
Cubic Time: More elaborate sets of nested loops can lead to O(n^3) time. Ex. given a bunch of subsets of a set, figuring out if any pair of those subsets is disjoint takes O(n^3). The algorithm for that is on p. 52-53.
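My sketch of that cubic pattern (the book's version on p. 52-53 uses arrays for the membership test; using Python sets here is my assumption, but the loop structure is the same):

```python
# Cubic-time pattern: n sets, each of size at most n. For each of the O(n^2)
# pairs of sets, spend O(n) time checking whether any element of one set also
# appears in the other (the membership test is treated as constant time).
def find_disjoint_pair(subsets):
    subsets = [set(s) for s in subsets]
    for i in range(len(subsets)):
        for j in range(i + 1, len(subsets)):
            if all(x not in subsets[j] for x in subsets[i]):
                return (i, j)      # indices of a disjoint pair
    return None                    # no pair of the subsets is disjoint
```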
O(n^k) Time: It takes O(n^k) time to find k-element subsets of a set of n items (just like it takes O(n^2) time to find pairs). The book goes into an example of finding an independent set of size k in a graph. Going through all of the k-element subsets takes O(n^k) time, and processing each of those subsets (i.e. checking each pair of nodes within the subset to see if there's an edge between them) takes O(k^2) time. That gives us an O(k^2 * n^k) running time, but since k is constant, k^2 is also constant, so we end up with an O(n^k) running time. Again, this isn't necessarily the best way to go about solving this problem, but the more efficient solutions are for later.
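Sketch of that brute force (my code; representing the graph as a node list plus a collection of edges is my assumption, not how the book presents it):

```python
# O(n^k)-style brute force: try every k-element subset of the nodes and check
# all O(k^2) pairs inside it for an edge.
from itertools import combinations

def has_independent_set(nodes, edges, k):
    edge_set = {frozenset(e) for e in edges}           # undirected edges
    for subset in combinations(nodes, k):              # O(n^k) subsets
        if all(frozenset((u, v)) not in edge_set
               for u, v in combinations(subset, 2)):   # O(k^2) pair checks
            return subset
    return None
```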
Beyond Polynomial Time: Common running times that grow faster than any polynomial are 2^n and n!. Ex. finding an independent set of maximum size in a graph - you still have to check each subset (up to O(n^2) time per subset), but now you have to look at subsets of ALL possible sizes - there are 2^n of them - so that ends up taking O(n^2 * 2^n) time! Even worse are n! running times. n! search spaces arise in 2 ways: (1) it's the number of ways to match n items with n other items (ex. the possible perfect matchings in our stable matching problem), and (2) it's the number of ways to arrange n items in order (ex. the traveling salesman problem - what order should he visit the cities in? –> believed to have no efficient solution).
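The maximum independent set brute force is just the previous sketch tried at every size (again mine, not the book's):

```python
# 2^n-style brute force for MAXIMUM independent set: reuse the size-k check
# above, trying every size from n down to 1. Looking at subsets of all sizes
# is what pushes this past polynomial time.
def max_independent_set(nodes, edges):
    for k in range(len(nodes), 0, -1):
        subset = has_independent_set(nodes, edges, k)   # from the sketch above
        if subset is not None:
            return subset        # the first size that succeeds is the maximum
    return ()                    # no nodes at all
```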
Sublinear Time: Problems that take less than linear time. You don't have to look through all of the input, you just need to "query" parts of it, so the problem becomes minimizing the querying. Ex. the binary search algorithm doesn't look at every piece of the input and runs in O(logn) time. O(logn) time comes from algorithms that do a constant amount of work to throw away a constant fraction of the input, until the input has been shrunk down to a constant size and can be solved directly.
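Binary search sketch (mine), since it's the canonical O(logn) example:

```python
# Binary search on a sorted list: each probe throws away half of what's left,
# so only O(log n) probes are needed.
def binary_search(sorted_list, target):
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid           # index of the target
        elif sorted_list[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None                  # target not present
```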
- Insights and Questions:
I am really interested to see how some of these problems are solved in less time. Also, I am intrigued by the problems that can't be solved efficiently - in particular, what makes the difference between a problem that can and a problem that can't be solved efficiently? Crazy stuff! From our HW, though, how do you get some of those crazy running times that are in the first problem?!?
- Readability:
I thought this section was great. I liked the breakup into the different asymptotic running times and the connections between them (ex. that the "formula" for the time it takes to find k-element subsets is basically the same as the formula for finding pairs)!
Section 5: Priority Queues
- Summary:
Good algorithms are important for efficient solutions, but sometimes you need more complex data structures in order to implement the algorithms efficiently (or to make improvements to the algorithm). Some data structures are designed for specific problems, but some are more broadly applicable - like the priority queue, a data structure that "must perform some nontrivial processing each time it is invoked." Priority queues are useful when you need to maintain a changing set (like the free men in our stable matching problem) and be able to add, delete, and select certain elements. Elements in a priority queue have a priority value (key). Priority queues allow you to add to the set, delete from the set, and select the element with the smallest key. They're useful for scheduling real-time events (ex. processes on a computer). Each operation (add, delete, select min) should take O(logn) time. Since you could sort n elements with n insertions followed by n select-min operations, and sorting takes about nlogn time, that shows/"proves" that O(logn) per operation is the right thing to aim for. We use a heap to implement a priority queue, because using lists or arrays doesn't work (some operation ends up taking O(n) time instead of O(logn) time). A heap is like a balanced binary tree with every parent smaller than either of its children (but it's actually usually maintained in an array - the explanation of how that works is in our class notes or on p. 60). Operations: heapify-up and heapify-down work in O(logn) time (explanations of how they work are on p. 61-63 and in our notes). Priority queues are implemented using heaps: insert uses the heap's heapify-up, the minimum element (highest priority) is always at the root of the heap, and delete uses heapify-down. (Running times and a few more operations are on p. 64 and in our notes.)
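A bare-bones sketch of the array-based heap (my code; I'm using 0-indexed positions, whereas the book's description is 1-indexed):

```python
# Bare-bones array-based min-heap priority queue (0-indexed here; the book
# uses 1-indexed positions). The parent of index i is (i-1)//2 and its children
# are 2i+1 and 2i+2. Insert uses heapify-up and extract-min uses heapify-down;
# both are O(log n) because the tree is balanced.
class Heap:
    def __init__(self):
        self.a = []

    def insert(self, key):
        self.a.append(key)                   # add at the last position
        self._heapify_up(len(self.a) - 1)    # float it up to its place

    def extract_min(self):
        a = self.a
        a[0], a[-1] = a[-1], a[0]            # move the last element to the root
        smallest = a.pop()                   # the old root is the minimum
        if a:
            self._heapify_down(0)            # sink the new root into place
        return smallest

    def _heapify_up(self, i):
        a = self.a
        while i > 0 and a[i] < a[(i - 1) // 2]:
            a[i], a[(i - 1) // 2] = a[(i - 1) // 2], a[i]
            i = (i - 1) // 2

    def _heapify_down(self, i):
        a, n = self.a, len(self.a)
        while True:
            child = 2 * i + 1
            if child >= n:
                return
            if child + 1 < n and a[child + 1] < a[child]:
                child += 1                   # pick the smaller child
            if a[i] <= a[child]:
                return
            a[i], a[child] = a[child], a[i]
            i = child
```

Inserting n keys and then calling extract_min n times pops them out in sorted order, which is the "priority queues can sort in O(nlogn)" argument from the summary. (Python's built-in heapq module plays the same array-based tricks.)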
- Insights and Questions:
Interesting distinction between arrays/lists and “complex” data structures like priority queues/heaps/etc. I never really thought of these as being considerably more complex. I am really curious about other complex data structures. When is it ever useful to make up a data structure just for one type of problem (like the book suggests happens occasionally)? I feel like I need an example of how that could happen.
- Readability:
This section was a little harder to read than others. It flowed well, but the logic of jumping from arrays to priority queues to heaps wasn't quite as clear or as seamless as other sections or as clear as it was in class. I think that actually seeing someone do heapify-up/down is really important and you just can't get that from diagrams that don't move.