Chapter 2
These sections of Chapter 2 deal with the efficiency of algorithms and with how an algorithm's running time changes as its input size grows. Both are important concepts, since algorithms must work quickly no matter the scale of the problem.
Section 2.1 (Computational Tractability)
This section is all about efficiency, and it takes several attempts at defining it. The first attempt reads as follows:
“Proposed Definition of Efficiency (1): An algorithm is efficient if, when implemented, it runs quickly on real input instances.”
This definition does not work because it has no concept of scale. An inefficient process could meet this standard if it were run on a very small data sample, since even a slow method can process a tiny input quickly. The next attempt reads as follows:
“Proposed Definition of Efficiency (2): An algorithm is efficient if it achieves qualitatively better worst-case performance, at an analytical level, than brute-force search.”
This definition is simply too vague to work. It does not provide guidelines for what counts as “qualitatively better worst-case performance”, and is thus subjective. The final definition fixes this vagueness by providing a way of quantifying efficiency:
“Proposed Definition of Efficiency (3): An algorithm is efficient if it has a polynomial running time.”
This is the ideal definition for our purposes because of its specificity. It allows us to look at algorithms quantitatively so that we can determine whether a process is feasible. This definition also makes the notion of efficiency negatable: using it, we can express, and potentially prove, the claim that there is no efficient algorithm for a particular problem.
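Made precise, the standard formulation (written here in LaTeX; this is my own restatement rather than a quotation from the book) is that an algorithm runs in polynomial time if there are fixed constants c, d > 0 such that its running time T(n) satisfies

    T(n) \le c \cdot n^{d} \quad \text{for every input of size } n.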
Section 2.2 (Asymptotic Order of Growth)
This section looks at how an algorithm's worst-case running time grows as the input size grows. For our purposes as computer scientists, we do not need to stress out about finding the exact number of steps it takes to solve a problem. Instead, we make accurate generalizations in the form of O-notation so that algorithms' running times become easier to compare. This section begins by describing the process through which we can define an algorithm's upper and lower bounds, then describes how to find an algorithm's “tight bound”. An algorithm has a tight bound if its upper bound and lower bound are the same function. This bound can sometimes be found by calculating a limit as n goes to infinity.
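For instance, here is a small worked example of that limit calculation (my own, not taken from the book):

    f(n) = 2n^{2} + 3n, \qquad g(n) = n^{2}, \qquad
    \lim_{n \to \infty} \frac{f(n)}{g(n)} = \lim_{n \to \infty} \left( 2 + \frac{3}{n} \right) = 2.

Because the limit is a constant strictly between 0 and infinity, f(n) is both O(n²) and Ω(n²), and therefore Θ(n²): the bound is tight.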
The book then goes on to mention some of the properties of the asymptotic growth rates. It discusses transitivity, and sums of functions.
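Stated compactly (my own phrasing of the two properties in asymptotic notation):

    \text{Transitivity:} \quad f = O(g) \text{ and } g = O(h) \;\Longrightarrow\; f = O(h)
    \text{Sums:} \quad f = O(h) \text{ and } g = O(h) \;\Longrightarrow\; f + g = O(h)

Transitivity also holds with Ω and Θ in place of O, and a useful consequence of the sum rule is that if g = O(f), then f + g = Θ(f).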
Section 2.3 (Implementing the Stable Matching Algorithm)
This section practically applies all of the ideas we have discussed thus far. It proves that the Stable Matching problem can be solved by an algorithm that runs in O(n²) time.
But first, the book discusses the nuances of two basic data structures that will be central to this algorithm: arrays and linked lists. Arrays are good for finding the ith item in the list in constant time. However, they do not resize dynamically, so an array that must accommodate a changing collection ends up taking more space than it needs. The book decides that the men's and women's preferences are well contained in arrays, because those lists never change size, and being able to find the ith item in constant time is useful. Linked lists, on the other hand, need O(i) time to find the ith item, because they have to follow the links all the way to the node that contains it. However, linked lists are dynamically sized, so they are a good implementation for storing the list of single men. Since the man we start with is arbitrary anyway, we can simply take the first man on the list in O(1) time, so the time penalty is rendered inert.
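As a rough illustration (a minimal sketch of my own, not the book's code; all names are hypothetical), the contrast between the two structures looks something like this in Python:

    # Preferences in a plain array (Python list): the ith item is found in O(1).
    man_pref = ["w3", "w1", "w2"]        # hypothetical preference order for one man
    first_choice = man_pref[0]           # constant-time access by index

    # A minimal singly linked list for the set of free men.
    class Node:
        def __init__(self, value, nxt=None):
            self.value = value
            self.next = nxt

    def push(head, value):
        # Insert at the front in O(1) time.
        return Node(value, head)

    def pop(head):
        # Remove and return the front element in O(1) time.
        return head.value, head.next

    free_men = None                      # empty list of free men
    free_men = push(free_men, "m1")
    free_men = push(free_men, "m2")
    man, free_men = pop(free_men)        # which free man we get is arbitrary, and that is fine

Reaching the ith node of the linked list, by contrast, would require following i links.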
This section goes on to give a basic implementation of the algorithm. In order to keep the running time within O(n²), we must be able to perform four operations in constant time:
(1) Identify a free man.
(2) For any given man, identify the highest-ranked woman to whom he has not yet proposed.
(3) For any given woman, identify whether or not she has a partner, and if she does, identify her partner.
(4) Given two men, decide which of the two a given woman prefers.
If each of these four operations can be performed in constant time, the algorithm runs correctly in O(n²) time, since it performs at most n² iterations (each man proposes to each woman at most once).
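Here is a rough sketch of how those structures fit together (my own illustrative Python, not the book's pseudocode; collections.deque stands in for the linked list of free men, and all names are hypothetical):

    from collections import deque

    def stable_matching(man_pref, woman_pref):
        # man_pref[m] and woman_pref[w] are preference lists of indices, each of length n.
        n = len(man_pref)

        # (2) next_proposal[m]: position in man_pref[m] of the next woman m will propose to.
        next_proposal = [0] * n
        # (3) current[w]: the man w is engaged to, or None if she is free.
        current = [None] * n
        # (4) ranking[w][m]: w's rank of man m (lower = preferred). Built once, in O(n^2).
        ranking = [[0] * n for _ in range(n)]
        for w in range(n):
            for rank, m in enumerate(woman_pref[w]):
                ranking[w][m] = rank
        # (1) The collection of free men; taking any one of them is O(1).
        free_men = deque(range(n))

        while free_men:
            m = free_men[0]                       # (1) identify a free man
            w = man_pref[m][next_proposal[m]]     # (2) his best woman not yet proposed to
            next_proposal[m] += 1
            if current[w] is None:                # (3) w is free, so they become engaged
                current[w] = m
                free_men.popleft()
            elif ranking[w][m] < ranking[w][current[w]]:   # (4) w prefers m to her partner
                free_men.popleft()
                free_men.append(current[w])       # her old partner becomes free again
                current[w] = m
            # otherwise w rejects m, and he stays at the front of the free list

        return current    # current[w] is the man matched with woman w

Every operation inside the while loop is constant time, and the loop runs at most n² times, matching the O(n²) bound the section proves.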
Overall, this section was more interesting than others because it practically applied the theories and proofs we have discussed.
Section 2.4 (Common Running Times)
This section covers the most common running times and how they arise. Throughout the section, it primarily compares each running time to the “brute force” method.
It starts off with the most intuitive running time, linear (O(n)). This running time is relatively easy to achieve: just limit the algorithm to a single pass over the data. A single for-loop results in linear running time, as long as there is not another loop nested within it.
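For example (a trivial sketch of my own, not the book's code), finding the maximum of a list in one pass is linear:

    def maximum(numbers):
        # One pass over the data: O(n) time.
        best = numbers[0]
        for x in numbers[1:]:
            if x > best:
                best = x
        return best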
The next running time discussed is O(n log n), which is only marginally worse than linear. Algorithms that achieve it usually sort the data first, which itself takes O(n log n) time, and then process the sorted data in a single O(n) pass; the sorting step dominates, so the result is O(n log n).
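As an illustrative sketch (my own example), checking whether a list contains two equal numbers fits this pattern: sort, then scan adjacent entries.

    def has_duplicate(numbers):
        # Sorting costs O(n log n); the scan over adjacent pairs costs O(n).
        ordered = sorted(numbers)
        for i in range(len(ordered) - 1):
            if ordered[i] == ordered[i + 1]:
                return True
        return False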
The section then discusses quadratic running time, O(n²). This running time naturally arises out of nested for-loops, i.e. an O(n) operation repeated an O(n) number of times. The brute-force approach to the Closest-Pair problem discussed here runs in O(n²) time. Cubic time, similarly, arises from triply nested for-loops. It is also a polynomial running time, but significantly more costly than quadratic time.
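The nested-loop pattern looks like this (a rough sketch of the brute-force approach, in my own code):

    import math

    def closest_pair_distance(points):
        # Examine every pair of points: an O(n) inner loop inside an O(n) outer loop.
        best = math.inf
        n = len(points)
        for i in range(n):
            for j in range(i + 1, n):
                (x1, y1), (x2, y2) = points[i], points[j]
                d = math.hypot(x1 - x2, y1 - y2)
                if d < best:
                    best = d
        return best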
Next, the section moves on to running times beyond polynomial. These running times are too inefficient to handle any significant amount of data. An example of this is exponential running time, which happens when we have to iterate over every subset of a data set; since an n-element set has 2ⁿ subsets, this requires O(2ⁿ) iterations.
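A sketch of that pattern (a hypothetical brute-force subset-sum check, my own code): every one of the 2^n subsets is enumerated via a bitmask.

    def some_subset_sums_to(numbers, target):
        # 2^n subsets, with O(n) work per subset.
        n = len(numbers)
        for mask in range(1 << n):            # each bitmask encodes one subset
            total = sum(numbers[i] for i in range(n) if mask & (1 << i))
            if total == target:
                return True
        return False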
Section 2.4 ends with a brief discussion of sub-linear running times, most notably logarithmic time (O(log n)). This running time happens when you are able to cut the amount of data under consideration in half each time you iterate. Binary search runs in logarithmic time.
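A minimal binary search sketch (my own code, assuming the input list is already sorted):

    def binary_search(ordered, target):
        # Halve the search range at each step: O(log n) time on a sorted list.
        lo, hi = 0, len(ordered) - 1
        while lo <= hi:
            mid = (lo + hi) // 2
            if ordered[mid] == target:
                return mid
            elif ordered[mid] < target:
                lo = mid + 1
            else:
                hi = mid - 1
        return -1    # not found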
Section 2.5 (Priority Queues)
This section mainly discusses the priority queue and its implementation.
The optimal data structure for implementing a priority queue, according to this section, is a heap. A heap is a balanced binary tree structure in which each node has at most two children, and each child's value is greater than or equal to its parent's. Because of this form, we can access the lowest value in constant time: the root node, by definition, holds the smallest value.
Throughout the process of working with the heap, its structure will inevitably be violated, either by adding a new item as a leaf whose value is smaller than its parent's, or by removing an item from higher in the tree and filling the hole with a value that no longer respects the heap order. We fix these issues with the heapify-up and heapify-down methods. Heapify-up works by repeatedly swapping the problem node with its parent, moving it up toward the root until it is no longer smaller than its parent; this is primarily used to reorganize a heap after a small value has been added at the bottom as a new leaf. Heapify-down does the opposite: we swap the node with the smaller of its two children, and repeat the process until either the node is smaller than (or equal to) both of its children or it becomes a leaf node.
Both of these methods run in O(log n) time. This is possible because of the heap's structure: since each layer of the heap holds twice as many nodes as the one above it, the depth of a heap with n nodes is only about log n. So when we traverse up or down the entire depth of the tree, we perform at most O(log n) swaps.
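A minimal sketch of both procedures (my own code, not the book's pseudocode), on a min-heap stored in a Python list with 0-based indexing:

    def heapify_up(heap, i):
        # Swap node i with its parent until it is no longer smaller than its parent.
        while i > 0:
            parent = (i - 1) // 2
            if heap[i] < heap[parent]:
                heap[i], heap[parent] = heap[parent], heap[i]
                i = parent
            else:
                break

    def heapify_down(heap, i):
        # Swap node i with its smaller child until it is no larger than either child.
        n = len(heap)
        while True:
            left, right = 2 * i + 1, 2 * i + 2
            smallest = i
            if left < n and heap[left] < heap[smallest]:
                smallest = left
            if right < n and heap[right] < heap[smallest]:
                smallest = right
            if smallest == i:
                break
            heap[i], heap[smallest] = heap[smallest], heap[i]
            i = smallest

To insert a value, append it to the end of the list and call heapify_up on the last index; to extract the minimum, swap the root with the last element, remove the last element, and call heapify_down on the root.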
Other than the initialization of the heap itself, every heap operation runs in sub-linear time, at most O(log n).
