Chapter 2
Section 2.1 : Computational Tractability
This section is about defining efficiency and deciding which algorithms actually are efficient. An algorithm is not necessarily efficient just because it runs quickly on one machine, nor inefficient because it runs slowly on another: efficiency is judged by an algorithm's worst-case performance, independent of platform and of any particular instance. We start by comparing the worst-case performance with that of brute-force search, in which the computer runs through all possibilities before returning a solution. If the worst-case performance is qualitatively better than the brute-force performance, the algorithm can be labeled efficient. This definition is still vague, so we move on to a more concrete criterion for efficiency: polynomial running time.
When the input size increases by a constant factor, an efficient algorithm should only slow down by some constant factor C. More precisely, suppose an algorithm has the property that there are constants c > 0 and d > 0 such that for every input instance of size N, its running time is bounded by cNᵈ primitive computational steps (roughly, single assembly-language instructions). If this bound holds, we say that the algorithm has a polynomial running time, and is therefore efficient. While this definition of efficiency might seem somewhat arbitrary, it holds up well in practice: polynomial-time algorithms almost always have running times proportional to moderately growing polynomials such as n, n log n, n², or n³. Conversely, problems with no known polynomial-time algorithm tend to be very difficult in practice, though exceptions exist in both directions. These definitions are not perfect and do not work every time, but by and large polynomial running time and worst-case analysis are good measures of efficiency.
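As a toy illustration of this scaling property (my own example, not the book's), consider a hypothetical step-count function of the form cNᵈ:

# A hypothetical cost model: an algorithm taking c * n**d primitive steps.
def steps(n, c=4, d=2):
    return c * n**d

# Doubling the input size multiplies the step count by the constant 2**d.
for n in [100, 200, 400]:
    print(n, steps(n), steps(2 * n) // steps(n))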
Section 2.2 : Asymptotic Order of Growth
In the previous section, we defined efficiency in terms of an algorithm's worst-case running time on inputs of size n, which grows at a rate at most proportional to some function f(n). Thus, f(n) is a bound on the running time of the algorithm. In this section, we seek to express the growth rate of running times and other functions in a way that is insensitive to constant factors and low-order terms.
Asymptotic Upper Bounds
Let T(n) be a function; for our purposes, think of it as the worst-case running time of an algorithm on inputs of size n. Given another function f(n), we say that T(n) is O(f(n)) if, for sufficiently large n, the function T(n) is bounded above by a constant multiple of f(n); we write this as T(n) = O(f(n)). Specifically, T(n) is O(f(n)) if there exist constants c > 0 and n0 ≥ 0 so that for all n ≥ n0, we have T(n) ≤ c⋅f(n). In this case, we say that T is asymptotically upper-bounded by f.
For example:
Consider an algorithm whose running time has the form T(n) = pn² + qn + r for positive constants p, q, and r. We’d like to claim that any such function is O(n²). To see why, notice that for all n ≥ 1, we have qn ≤ qn² and r ≤ rn². So we can write T(n) = pn² + qn + r ≤ pn² + qn² + rn² = (p + q + r)n² for all n ≥ 1. This inequality is exactly what the definition of O(⋅) requires: T(n) ≤ cn², where c = p + q + r.
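A quick numeric spot-check of this inequality, with arbitrarily chosen positive constants (the argument above guarantees it for every n ≥ 1):

# Spot-check of the bound T(n) <= (p + q + r) * n**2 for sample n.
p, q, r = 3, 5, 7
c = p + q + r
for n in [1, 10, 1000]:
    T = p * n**2 + q * n + r
    assert T <= c * n**2
    print(n, T, c * n**2)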
Asymptotic Lower Bounds
We want to show that this upper bound is the best one possible. To do this, we want to express the notion that for arbitrarily large input sizes n, the function T(n) is at least a constant multiple of some specific function f(n). Thus, we say that T(n) is Ω(f(n)) if there exist constants ε > 0 and n0 ≥ 0 so that for all n ≥ n0, we have T(n) ≥ ε⋅f(n).
Asymptotically Tight Bounds
If a function T(n) is both O(f(n)) and Ω(f(n)), we say that T(n) is Θ(f(n)). In this case, we say that f(n) is an asymptotically tight bound for T(n). For example, the analysis above shows that T(n) = pn² + qn + r is Θ(n²).
Properties of Asymptotic Growth Rates
Transitivity
If a function f is asymptotically upper-bounded by a function g, and if g in turn is asymptotically upper-bounded by a function h, then f is asymptotically upper-bounded by h.
(a) If f = O(g) and g = O(h), then f = O(h).
(b) If f = Ω(g) and g = Ω(h), then f = Ω(h).
Sums of Functions
If we have an asymptotic upper bound that applies to each of two functions f and g, then it applies to their sum.
Suppose that f and g are two functions such that for some other function h, we have f = O(h) and g = O(h). Then f + g = O(h).
It frequently happens that we’re analyzing an algorithm with two high-level parts, and it is easy to show that one of the two parts is slower than the other. We’d like to be able to say that the running time of the whole algorithm is asymptotically comparable to the running time of the slow part.
Suppose that f and g are two functions (taking non-negative values) such that g = O(f). Then f + g = Θ(f). In other words, f is an asymptotically tight bound for the combined function f + g.
Asymptotic Bounds for Some Common Functions
Polynomials
Let f be a polynomial of degree d, in which the coefficient a_d of the highest-order term is positive. Then f = O(nᵈ).
Logarithms
For every b > 1 and every x > 0, we have log_b n = O(nˣ).
Exponentials
For every r > 1 and every d > 0, we have nᵈ = O(rⁿ).
Taken together, then, logarithms, polynomials, and exponentials serve as useful landmarks in the range of possible functions that you encounter when analyzing running times. Logarithms grow more slowly than polynomials, and polynomials grow more slowly than exponentials.
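A small sketch (mine) that makes these landmarks concrete by evaluating them at a few input sizes; 2ⁿ is reported only by its digit count, floor(n⋅log10(2)) + 1, since the value itself is astronomically large:

import math

# log n grows far more slowly than n^2, which grows far more slowly than 2^n.
for n in [10, 100, 1000, 10**6]:
    digits = int(n * math.log10(2)) + 1
    print(f"n={n:>7}  log2(n)={math.log2(n):6.1f}  n^2={n**2:.1e}  2^n: {digits} digits")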
These sections didn't make a ton of sense to me, but I think reading them and writing summaries helped. I think I will understand what's going on better after this week's assignment, because I learn best by doing as opposed to reading. This section was not as readable as the last chapter; I would give it a 5/10 for readability. I don't really have any specific questions right now, but I'm not sure how these specific things pertain to algorithms because I haven't done the assignment yet or seen them in context. I'll probably have some questions as I do the assignment.
Section 2.3 : Implementing the Stable Matching Algorithm Using Lists and Arrays
In order to asymptotically analyze the running time of an algorithm, we don't necessarily have to program, compile, and execute it. Instead, we must think about how the data will be represented and manipulated in an implementation of the algorithm, so as to bound the number of computational steps it takes. In this representative example, we analyze the data structures needed to implement the stable matching algorithm.
Arrays vs. Lists
Arrays
The simplest way to keep a list of n elements is to use an array A of length n, and have A[i] be the ith element of the list. This array has the following properties:
- The question “What is the ith element in the list?” can be answered in O(1) time by direct access to the value A[i].
- In an unordered list, if we want to determine if a specific element e is in the list, we need to check all the elements one by one in O(n) time.
- If the array elements are sorted, we can determine whether a specific element e is in the list in O(log n) time using binary search.
An array is less well suited to dynamically maintaining a list of elements that changes over time, since adding and deleting elements in an array is comparatively difficult.
Lists
A preferable way to maintain a dynamic set of elements (one that changes size) is via a linked list, in which the elements are sequenced by having each element point to the next in the list. When the set of all possible elements is not fixed in advance, we can implement the linked list by allocating a record e for each element that we want to include. If we want the list to be traversable in both directions, we create a doubly linked list, which maintains both a Prev and a Next pointer for each element. A doubly linked list can be modified as follows:
- Deletion: To delete the element e from a doubly linked list, we can just “splice it out” by having e.Prev and e.Next point to each other.
- Insertion: To insert element e between elements d and f in a list, we “splice it in” by updating d.Next and f.Prev to point to e, and the Next and Prev pointers of e to point to d and f, respectively.
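A minimal Python sketch of these two splice operations, assuming an interior element with both neighbors present (the class and function names are mine):

class Node:
    def __init__(self, value):
        self.value = value
        self.prev = None   # e.Prev in the text
        self.next = None   # e.Next in the text

def insert_between(e, d, f):
    # "Splice in" e between d and f: four pointer updates, O(1) time.
    d.next = e
    f.prev = e
    e.prev = d
    e.next = f

def delete(e):
    # "Splice out" an interior element e by making its neighbors
    # point to each other, O(1) time.
    e.prev.next = e.next
    e.next.prev = e.prev

d, e, f = Node("d"), Node("e"), Node("f")
d.next, f.prev = f, d        # start with the two-element list d <-> f
insert_between(e, d, f)      # now d <-> e <-> f
delete(e)                    # back to d <-> f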
Lists have disadvantages as well. Unlike arrays, we cannot find the ith element of the list in O(1) time: to find the ith element, we have to follow the Next pointers starting from the beginning of the list, which takes a total of O(i) time.
Given the advantages and disadvantages of arrays and lists, we may want to convert one format to the other depending on the specific input to a problem. In this case, it is easy to convert between the array and list representations in O(n) time.
Implementing the Stable Matching Algorithm
To ensure that the running time of this algorithm is at most O(n²), we need to be able to do each of the following four things in constant time:
1. Identify a free man.
2. For a man m, identify the highest-ranked woman to whom he has not yet proposed.
3. Decide if a woman w is currently engaged, and if she is, identify her current partner.
4. For a woman w and two men m and m', decide which of m or m' is preferred by w.
We can do all four of these things in constant time using linked lists, arrays, and an n × n array.
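Here is a hedged Python sketch of how those structures might fit together in the proposal loop; the variable names (free_men, next_proposal, current, ranking) and the example preference lists are mine, not the book's:

from collections import deque

n = 3
man_pref = [[0, 1, 2], [1, 0, 2], [0, 2, 1]]    # man m's ranking of women
woman_pref = [[2, 1, 0], [0, 1, 2], [2, 0, 1]]  # woman w's ranking of men

free_men = deque(range(n))       # (1) the free men: O(1) to grab one
next_proposal = [0] * n          # (2) index into m's preference list
current = [None] * n             # (3) w's current partner, if any
ranking = [[0] * n for _ in range(n)]
for w in range(n):               # (4) ranking[w][m] = rank of m in w's
    for rank, m in enumerate(woman_pref[w]):    # list; O(n^2) setup
        ranking[w][m] = rank

while free_men:
    m = free_men.popleft()                 # (1) identify a free man
    w = man_pref[m][next_proposal[m]]      # (2) his best untried woman
    next_proposal[m] += 1
    if current[w] is None:                 # (3) w is free: engage
        current[w] = m
    elif ranking[w][m] < ranking[w][current[w]]:   # (4) w prefers m
        free_men.append(current[w])
        current[w] = m
    else:
        free_men.append(m)                 # w rejects m; he stays free

print(current)       # current[w] = man engaged to woman w

Each iteration of the while loop does only constant-time work against these structures, and each man proposes at most n times, so the whole loop is O(n²) after the O(n²) preprocessing of the ranking array.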
This section was a nice review of some common data structures I have learned about as a CS major, and I think reading the text will continue to give me a better idea of how and when to implement different data structures. I have always had a hard time remembering the run time of common algorithms, and I can already tell that I am finally getting a better feel for this. I liked this section because it helped me gain a better understanding of how to “calculate” the run time of an algorithm by looking at the internal processes. I'm glad I read the section after class today, because I don't think my understanding would have been as good if I had read it before. I would give this section an 8/10 for readability.
Section 2.4 : A Survey of Common Running Times
This section provides a survey of common running times, such as O(n), O(n log n), and O(n²), and some of the typical approaches that lead to them.
Linear Time
An algorithm that runs in O(n), or linear, time has the property that its running time is at most a constant factor times the size of the input. Examples of linear-time algorithms are computing the maximum of n numbers and merging two sorted lists, which a naive approach would do in O(n²) time.
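A sketch of the merge step in Python (the book presents this in pseudocode; the details here are mine): each comparison permanently consumes one element, so the loop runs at most a linear number of times.

def merge(a, b):
    # Merge two sorted lists in O(n) time: each iteration appends one
    # element, so the loop runs at most len(a) + len(b) times.
    out, i, j = [], 0, 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            out.append(a[i]); i += 1
        else:
            out.append(b[j]); j += 1
    return out + a[i:] + b[j:]

print(merge([1, 4, 9], [2, 3, 10]))   # [1, 2, 3, 4, 9, 10]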
O(n log n) Time
O(n log n) is the running time of any algorithm that splits its input into two equal-sized pieces, solves each piece recursively, and then combines the two solutions in linear time. A well-known example of this type of algorithm is sorting, in particular Mergesort. There are many algorithms whose most expensive step is to sort the input, so O(n log n) is a relatively common running time.
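Building on the merge sketch above, a minimal Mergesort:

def mergesort(xs):
    # Split in half, solve each half recursively, combine in linear
    # time; the recurrence T(n) = 2T(n/2) + O(n) gives O(n log n).
    if len(xs) <= 1:
        return xs
    mid = len(xs) // 2
    return merge(mergesort(xs[:mid]), mergesort(xs[mid:]))

print(mergesort([5, 2, 8, 1, 9, 3]))   # [1, 2, 3, 5, 8, 9]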
Quadratic Time
Performing a search over all pairs of input items and spending constant time per pair is a common way in which a running time of O(n²) arises. Quadratic time also arises naturally from a pair of nested loops: an algorithm consists of a loop with O(n) iterations, and each iteration of the loop launches an inner loop that takes O(n) time; multiplying these two factors of n together gives the running time. The brute-force algorithm for finding the closest pair of points is a common example of this kind of algorithm.
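A sketch of the brute-force closest-pair search in Python (my formulation): examine all n(n-1)/2 pairs and keep the closest.

import math
from itertools import combinations

def closest_pair(points):
    # All n(n-1)/2 pairs, constant time per pair: O(n^2) total.
    return min(combinations(points, 2),
               key=lambda pair: math.dist(pair[0], pair[1]))

print(closest_pair([(0, 0), (3, 4), (1, 1), (5, 5)]))   # ((0, 0), (1, 1))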
Cubic Time
More elaborate sets of nested loops often lead to algorithms that run in O(n³) time. For example, the brute-force algorithm for the problem “for each pair of sets Sᵢ and Sⱼ, determine whether Sᵢ and Sⱼ have an element in common” uses three nested loops that each run for O(n) iterations: two to enumerate the pair of sets and one to scan the elements of Sᵢ. There are algorithms for this problem that improve on O(n³) running time, but they are complicated, and it is not clear whether the improved algorithms are practical on inputs of reasonable size.
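A sketch of the element-by-element version in Python (names are mine); Python sets stand in for the book's bit vectors so that a single membership test is O(1):

def find_intersecting_pair(sets):
    # O(n^2) pairs of sets, times an O(n) scan of one set per pair,
    # with O(1) membership tests: O(n^3) overall.
    for i in range(len(sets)):
        for j in range(i + 1, len(sets)):
            for x in sets[i]:
                if x in sets[j]:
                    return i, j, x     # report a common element
    return None

print(find_intersecting_pair([{1, 2}, {3, 4}, {4, 5}]))   # (1, 2, 4)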
O(nk) Time
We obtain a running time of O(nᵏ) for any constant k when we search over all subsets of size k. The brute-force algorithm for finding independent sets in a graph runs in O(nᵏ) time: for some fixed constant k, we want to know whether a given n-node input graph G has an independent set of size k, so we try every subset of k nodes. There are at most nᵏ/k! such subsets, and since we are treating k as a constant, this quantity is O(nᵏ); checking whether any one subset is independent takes only constant time for constant k.
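A sketch of this brute-force search in Python (the function name and example graph are mine), using itertools.combinations to enumerate the size-k subsets:

from itertools import combinations

def has_independent_set(graph, k):
    # graph[u] is the set of neighbors of node u. Try all C(n, k) =
    # O(n^k) subsets; each check costs O(k^2) pairs, constant for fixed k.
    nodes = range(len(graph))
    for subset in combinations(nodes, k):
        if all(v not in graph[u] for u, v in combinations(subset, 2)):
            return subset
    return None

# A 4-cycle 0-1-2-3-0 has the independent set {0, 2}.
cycle = [{1, 3}, {0, 2}, {1, 3}, {0, 2}]
print(has_independent_set(cycle, 2))   # (0, 2)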
Beyond Polynomial Time
Two kinds of bounds that come up often are 2ⁿ and n!. Suppose, for example, we are given a graph and want to find an independent set of maximum size. The outer loop of the brute-force algorithm runs for 2ⁿ iterations as it tries every subset of the n nodes. Inside the loop, we check all pairs from a set S that can be as large as n nodes, so each iteration of the loop takes at most O(n²) time. Multiplying these two together, we get a running time of O(n²⋅2ⁿ). Thus we see that 2ⁿ arises naturally as a running time for a search algorithm that must consider all subsets. Search spaces of size n! tend to arise for one of two reasons. First, n! is the number of ways to match up n items with n other items. Second, n! arises in problems where the search space consists of all ways to arrange n items in order. A common example of the latter is the Traveling Salesman Problem: given a set of n cities, with distances between all pairs, what is the shortest tour that visits all cities?
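For the independent-set part, a sketch (mine) in which the 2ⁿ subsets are enumerated explicitly as bitmasks:

def max_independent_set(graph):
    # The outer loop runs 2**n times, once per subset of the nodes;
    # each iteration checks O(n^2) pairs, giving O(n^2 * 2^n) overall.
    n = len(graph)
    best = set()
    for mask in range(2**n):
        subset = {u for u in range(n) if mask >> u & 1}
        independent = all(v not in graph[u]
                          for u in subset for v in subset if u < v)
        if independent and len(subset) > len(best):
            best = subset
    return best

# The 4-cycle from the previous sketch: a maximum independent set has size 2.
cycle = [{1, 3}, {0, 2}, {1, 3}, {0, 2}]
print(max_independent_set(cycle))   # {0, 2} or {1, 3}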
Sublinear Time
There are cases where one encounters running times that are asymptotically smaller than linear. These situations tend to arise in a model of computation where the input can be “queried” indirectly rather than read completely, and the goal is to minimize the amount of querying that must be done. The best-known example of this is the binary search algorithm. The running time of binary search is O(log n), because of this successive shrinking of the search region. In general, O(log n) arises as a time bound whenever we’re dealing with an algorithm that does a constant amount of work in order to throw away a constant fraction of the input.
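A standard binary search sketch in Python, probing the sorted array A rather than reading it completely:

def binary_search(A, target):
    # Each probe discards half of the remaining region, so at most
    # O(log n) probes (queries of A) are needed.
    lo, hi = 0, len(A) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if A[mid] == target:
            return mid
        elif A[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None

print(binary_search([2, 3, 5, 7, 11, 13], 7))   # 3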
I liked this section because it gave a good overview of some run times that I see frequently, along with examples of algorithms that have those run times. I had a hard time remembering which run times went with which common algorithms in CSCI 112 and thought this class would be equally difficult for me, but the amount of time and detail spent covering the basics is definitely what I needed. I think this wiki will be particularly helpful for reference when I need to jog my memory. I would give this section a 7/10 for readability: the subtitles were helpful, but additional organization or subtitles for each example would have also been helpful, as would a table with each run time and some common examples.
Section 2.5 : A More Complex Data Structure: Priority Queues
The priority queue is one of the most broadly useful sophisticated data structures. Here, it is a useful illustration of the analysis of a data structure that, unlike lists and arrays, must perform some nontrivial processing each time it is invoked. A priority queue is designed for applications in which elements have a priority value, or key, and each time we need to select an element from a set S, we want to take the one with highest priority. A priority queue is a data structure that maintains a set of elements S, where each element v ∈ S has an associated value key(v) that denotes the priority of element v; smaller keys represent higher priorities. Priority queues support the addition and deletion of elements from the set, and also the selection of the element with smallest key. With a priority queue that can perform insertion and the extraction of minima in O(log n) per operation, we can sort n numbers in O(n log n) time.
Theorem 2.11 A sequence of O(n) priority queue operations can be used to sort a set of n numbers.
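Python's standard heapq module provides exactly the insertion and extract-min operations this argument needs, so the theorem can be demonstrated directly (the function name pq_sort is mine):

import heapq

def pq_sort(numbers):
    # Theorem 2.11 in action: n Insert operations followed by n
    # ExtractMin operations, each O(log n) on a heap: O(n log n) total.
    heap = []
    for x in numbers:
        heapq.heappush(heap, x)                     # Insert
    return [heapq.heappop(heap) for _ in numbers]   # ExtractMin

print(pq_sort([5, 1, 4, 2, 3]))   # [1, 2, 3, 4, 5]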
A Data Structure for Implementing a Priority Queue
The Definition of a Heap
The heap data structure combines the benefits of a sorted array and a list for purposes of this application. Conceptually, a heap is a balanced binary tree. The keys in such a tree are said to be in heap order if the key of any element is at least as large as the key of the element at its parent node. We can avoid using pointers to represent a heap if a bound N is known in advance on the total number of elements that will ever be in the heap at any one time. Such a heap can be maintained in an array H indexed by i = 1, …, N. We think of the heap nodes as corresponding to positions in this array: H[1] is the root, and for any node at position i, the children are the nodes at positions leftChild(i) = 2i and rightChild(i) = 2i + 1.
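In Python, assuming the same 1-indexed convention (with H[0] unused), the tree structure is just arithmetic on indices:

# The 1-indexed array embedding of the heap tree (H[0] goes unused).
def parent(i):
    return i // 2          # floor(i/2)

def left_child(i):
    return 2 * i

def right_child(i):
    return 2 * i + 1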
Implementing the Heap Operations
To add elements to the heap, we start by adding the new element v to the final position i = n + 1, by setting H[i] = v. Unfortunately, this does not maintain the heap property, as the key of element v may be smaller than the key of its parent. We must use the following algorithm, called Heapify-up:
Heapify-up(H, i):
  If i > 1 then
    let j = parent(i) = ⌊i/2⌋
    If key[H[i]] < key[H[j]] then
      swap the array entries H[i] and H[j]
      Heapify-up(H, j)
    Endif
  Endif
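A direct Python translation of this pseudocode (assuming the 1-indexed array convention above, with entries compared directly rather than through a separate key field):

def heapify_up(H, i):
    # H is 1-indexed (H[0] unused) to match the text.
    if i > 1:
        j = i // 2                    # parent(i) = floor(i/2)
        if H[i] < H[j]:
            H[i], H[j] = H[j], H[i]   # swap with the parent
            heapify_up(H, j)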
Theorem 2.12 The procedure Heapify-up(H, i) fixes the heap property in O(log i) time, assuming that the array H is almost a heap with the key of H[i] too small. Using Heapify-up we can insert a new element into a heap of n elements in O(log n) time.
Deleting an element from the heap is a bit more complicated; many applications of priority queues don’t require the deletion of arbitrary elements, but only the extraction of the minimum. In a heap, this corresponds to identifying the key at the root and then deleting it; we will refer to this operation as ExtractMin(H). Here we will implement a more general operation Delete(H, i), which deletes the element in position i. After deleting the element H[i] from a heap with n elements, the heap has only n - 1 elements; not only is the heap-order property violated, there is actually a “hole” at position i, since H[i] is now empty. To patch this hole, we move the element w in position n to position i. After doing this, H at least has the property that its n - 1 elements are in the first n - 1 positions, as required, but we may well still not have the heap-order property: the key of element w may be either too small or too big for position i. If the key is too small, we can use Heapify-up; if it is too large, we use the following algorithm, called Heapify-down:
Heapify-down(H, i):
  Let n = length(H)
  If 2i > n then
    Terminate with H unchanged
  Else if 2i < n then
    Let left = 2i and right = 2i + 1
    Let j be the index, left or right, that minimizes key[H[j]]
  Else if 2i = n then
    Let j = 2i
  Endif
  If key[H[j]] < key[H[i]] then
    swap the array entries H[i] and H[j]
    Heapify-down(H, j)
  Endif
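A direct Python translation, with the heap size n passed explicitly since the 1-indexed array may be longer than the heap (again an assumption of my sketch):

def heapify_down(H, i, n):
    # n is the number of heap elements; H is 1-indexed as above.
    if 2 * i > n:
        return                        # i is a leaf: nothing to do
    elif 2 * i < n:                   # two children: pick the smaller key
        left, right = 2 * i, 2 * i + 1
        j = left if H[left] < H[right] else right
    else:                             # 2i == n: only a left child
        j = 2 * i
    if H[j] < H[i]:
        H[i], H[j] = H[j], H[i]
        heapify_down(H, j, n)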
Theorem 2.13 The procedure Heapify-down(H, i) fixes the heap property in O(log n) time, assuming that H is almost a heap with the key value of H[i] too big. Using Heapify-up or Heapify-down we can delete an element from a heap of n elements in O(log n) time.
Implementing Priority Queues With Heaps
The heap data structure with the Heapify-down and Heapify-up operations can efficiently implement a priority queue that is constrained to hold at most N elements at any point in time. Here we summarize the operations we will use and their run times.
- StartHeap(N) returns an empty heap H that is set up to store at most N elements. This operation takes O(N) time, as it involves initializing the array that will hold the heap.
- Insert(H, v) inserts the item v into heap H. If the heap currently has n elements, this takes O(log n) time.
- FindMin(H) identifies the minimum element in the heap H but does not remove it. This takes O(1) time.
- Delete(H, i) deletes the element in heap position i. This is implemented in O(log n) time for heaps that have n elements.
- ExtractMin(H) identifies and deletes an element with minimum key value from a heap. This is a combination of the preceding two operations, and so it takes O(log n) time.
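Putting it together, a toy run of these operations using the heapify_up and heapify_down sketches above (1-indexed array, H[0] unused):

H = [None, 2, 5, 3, 9, 6]     # a valid heap on n = 5 elements
n = 5

# Insert(H, 1): place the new key at position n + 1, then Heapify-up.
H.append(1)
n += 1
heapify_up(H, n)
print(H[1])                   # FindMin(H): the root, now 1

# ExtractMin(H): move the last element into the root, then Heapify-down.
H[1] = H[n]
H.pop()
n -= 1
heapify_down(H, 1, n)
print(H[1])                   # the new minimum, 2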
