

Chapter 2

2.1: Computational Tractability

Summary of the Section
  • The two main components of computational efficiency are running time and the amount of space (memory) an algorithm uses. A good way to gauge the efficiency of an algorithm is to analyze its worst-case runtime. Additionally, an algorithm should not just beat brute-force search: it should improve on it qualitatively, reflecting real structure in the problem. The text then settles on a definition of efficiency: an algorithm is efficient if it has a polynomial running time. Finally, we find out that problems that have polynomial-time algorithms almost always admit algorithms whose running times are proportional to moderately growing polynomials.
Notes on the Section
  • Worst-case analysis of an algorithm does a reasonable job of capturing the algorithm’s efficiency in practice
    • it is important to use worst-case runtime for analysis because you don't want to run into a situation where you don't have enough memory to perform the algorithm, and you don't want people to be angry at you because the algorithm takes longer than promised
  • The natural “brute-force” algorithm for the Stable Matching Problem plows through all perfect matchings by enumeration, checking each one to see if it is stable
    • General extrapolation: a “brute-force” algorithm plows through all possibilities by enumeration and checks whether each candidate meets the requirements of the problem
    • brute force search algorithms typically have exponential runtime
  • The concept of a “reasonable running time”: what defines it?
  • Good algorithm: if the input size increases by a constant factor, the algorithm should only slow down by some constant factor C
  • Polynomial running time / polynomial-time algorithm: on an input of size N, the running time is bounded by cN^d steps, for constants c and d
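    • Worked example (my own illustration, not the book's): if T(n) = cn^d, then T(2n) = c(2n)^d = 2^d * cn^d = 2^d * T(n), so doubling the input size slows the algorithm down by the constant factor C = 2^d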
  • problems for which polynomial-time algorithms exist almost always have algorithms with running times proportional to moderately growing polynomials (n, nlogn, n^2, n^3)
  • worst-case, polynomial-time bounds are only an abstraction of practical situations
  • An algorithm is efficient if it has a polynomial running time
    • This definition is more of an absolute notion than alternatives
    • Closely connected to the idea each problem has an intrinsic level of computational tractability: some admit efficient solutions, others do not
Questions
  • There was no true definition of computational tractability given in the section. As I understand it, a computationally tractable problem would be one that can be solved with an algorithm with polynomial runtime. Is this correct?
  • Are there any situations in which a brute force algorithm would be better? For instance, if you knew your data size was really small.
  • Is there a way to tell if you have an algorithm with the best possible polynomial runtime to solve a problem?
Discuss anything that makes more sense after reading it again, after it was presented in class (or vice versa)
  • At first, I was confused about what the text meant by a “brute-force” search algorithm. I had an idea of what it was, but it clicked after hearing why it is significant to compare your algorithm to a brute-force search algorithm and to make sure yours performs better. Through that explanation I better understood what a brute-force search algorithm is: an algorithm that checks every possible solution to the problem until it finds the right one, which typically has exponential runtime.
How Readable the Section was
  • On a scale of 1 to 10, this section was a solid 7 because I thought that the book did a good job of explaining why using worst case runtime is important, and why polynomial-time algorithms are ideal. However, I found the language to be unclear at times.

2.2: Asymptotic Order of Growth

Summary of the section
  • We typically express the worst-case runtime of an algorithm on inputs of size n as a function T(n) and describe how fast it grows. T is asymptotically upper bounded by f when there exist constants c > 0 and n0 >= 0 so that for all n >= n0, we have T(n) <= c*f(n). T is asymptotically lower bounded by f when there exist constants ε > 0 and n0 >= 0 so that for all n >= n0, we have T(n) >= ε*f(n). If a runtime is bounded by both O(f(n)) and Ω(f(n)), then we can say f(n) is an asymptotically tight bound for T(n). Then, the text explained the properties of asymptotic growth rates, which include transitivity and sums of functions. Finally, the section ends with an explanation of the asymptotic growth rates of polynomials (T(n) is O(n^d) where d is the degree), logarithms (log_b(n) = O(n^x) for every b > 1 and x > 0), and exponentials (n^d = O(r^n) for every r > 1 and d > 0).
Notes on the Section
  • Asymptotic Upper Bounds
    • T(n) is O(f(n)) if there exist constants c > 0 and n0 >= 0 so that for all n >= n0, we have T(n) <= c*f(n) → T is asymptotically upper bounded by f
    • T(n) = pn^2 + qn + r <= pn^2 + qn^2 + rn^2 = (p + q + r)n^2
    • This is exactly the definition of O(•): T(n) <= cn^2, where c = p + q + r
  • Asymptotic Lower Bounds
    • T(n) is at least a constant multiple of some specific function f(n)
    • T(n) is Ω(f(n)) if there exist constants ε > 0 and n0 >= 0 so that for all n >= n0, we have T(n) >= ε*f(n) → T is asymptotically lower bounded by f
    • Whereas establishing the upper bound involved inflating the terms in T(n) until it looked like a constant times n^2, now we need to do the opposite: reduce the size of T(n) until it looks like a constant times n^2
    • T(n) = pn^2 + qn + r >= pn^2
      • Meets requirements by the definition of Ω(•)
  • Asymptotically Tight Bounds
    • If we can show that a running time T(n) is both O(f(n)) and Ω(f(n)), then we’ve found the “right” bound: T(n) grows exactly like f(n) to within a constant factor
    • If a function T(n) is both O(f(n)) and Ω(f(n)), we say T(n) is Θ(f(n)) → f(n) is an asymptotically tight bound for T(n)
    • Asymptotically tight bounds characterize the worst case performance of an algorithm precisely up to constant factors
    • If the ratio of functions f(n) and g(n) converges to a positive constant as n goes to infinity, then f(n) = Θ(g(n))
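      • Worked example (my own illustration, not the book's): for T(n) = pn^2 + qn + r with p > 0, the ratio T(n)/n^2 = p + q/n + r/n^2 converges to p as n goes to infinity, so T(n) = Θ(n^2)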
  • Properties of Asymptotic Growth Rates
    • Transitivity
      • If f = O(g) and g = O(h), then f = O(h)
      • If f = Ω(g) and g = Ω(h), then f = Ω(h)
    • Sums of functions
      • if we have an asymptotic upper bound that applies to each of the two functions f and g, then it applies to their sum
      • Suppose that f and g are two functions such that for some function h, we have f=O(h) and g=O(h). Then f + g = O(h).
      • Let k be a fixed constant, and let f1, f2, …, fk and h be functions such that fi = O(h) for all i. Then f1 + f2 +…+ fk = O(h).
      • Suppose that f and g are two functions (taking nonnegative values) such that g = O(f). Then f + g = Θ(f). In other words, f is an asymptotically tight bound for the combined function f + g.
  • Polynomials
    • Their asymptotic rate of growth is determined by their “high-order term” – the one that determines the degree
    • Let f be a polynomial of degree d, in which the coefficient a_d is positive. Then f = O(n^d). Also f = Ω(n^d), so it follows that f = Θ(n^d)
    • Polynomial-time algorithm is one whose running time T(n) is O(n^d) for some constant d, where d is independent of the input size
    • An algorithm can be polynomial time even if its running time is not written as n raised to some integer power
      • Ex: O(n^1.59), O(n^1/2), O(nlogn)
  • Logarithms
    • For every base b, the function log_b(n) is asymptotically upper bounded by every function of the form n^x, even for (non-integer) values of x arbitrarily close to 0
      • For every b > 1 and every x > 0, we have log_b(n) = O(n^x).
    • log_a(n) = Θ(log_b(n)) → the base of the logarithm is not important when writing bounds using asymptotic notation
  • Exponentials
    • Much faster rates of growth
    • For every r > 1 and every d > 0, we have n^d = O(r^n).
    • People often say “the running time of this algorithm is exponential” without specifying which exponential function they have in mind
      • the running time grows at least as fast as some exponential function, and all exponentials grow so fast that we can effectively dismiss this algorithm without working out further details of the exact running time
  • logarithms grow more slowly than polynomials, and polynomials grow more slowly than exponentials
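A quick numerical illustration of this ordering (my own example, not from the book):

  from math import log2

  # log2(n) grows far more slowly than n^2, which grows far more slowly than 2^n
  for n in (10, 20, 40, 80):
      print(f"n={n:3d}  log2(n)={log2(n):5.2f}  n^2={n*n:6d}  2^n={2**n}")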
Questions
  • The book mentioned O(•) and Ω(•). What is the significance of the “•”? Is it because this type of asymptotic upper bound/lower bound is used so often?
  • Does the book not mention factorial runtimes because they are so slow that you would never want to keep an algorithm that runs that way? If an algorithm has factorial runtime, at that point, would a brute-force search be better, since its runtime is usually 2^n?
Discuss anything that makes more sense after reading it again, after it was presented in class (or vice versa)
  • In class it definitely became clearer/easier to understand how to translate all of the symbols in the section into words, which made reading back over the notes I took on the section sink in better because those symbols actually meant something to me. Moreover, I was unclear on what exactly it meant to be an asymptotic upper bound and an asymptotic lower bound. I understood that for upper bounds f(n) would be greater than T(n) and for lower bounds f(n) would be less than T(n). However, I was not sure whether, in order to be an asymptotic bound, the entire function f(n) had to be greater than or less than T(n), so that they could never cross. After seeing the slide in class with the graph whose f(n) and T(n) intersect multiple times, it became clear that was not the case. There just needs to be a point from which f(n) is always greater than, or always less than, T(n).
How readable the section was
  • On a scale of 1 to 10, this section was a 6 on the first read because all of the symbols made it difficult to completely comprehend all of the information. However, after reading it again, the section became more of an 8 because once I got a hang of the notation and after going to class to hear the terms used out loud, my reading had more of a natural flow to it. Thus, the section was easier to read a second time.

2.3 Implementing the Stable Matching Algorithm

Summary

In order to implement the Gale-Shapley algorithm in O(n^2) runtime, we need to consider which data structures to use and how they will be manipulated, and, in turn, limit the number of computational steps. We only need two of the simplest data structures to implement the Gale-Shapley algorithm in O(n^2): arrays and lists. Linked lists are good for maintaining a dynamically changing set of data. However, arrays are not good for sets of data that are constantly changing, because resizing results in O(n) runtime. Now, for the implementation, the set of unmatched men will be kept as a linked list because the set is constantly changing as the algorithm is performed. Engagements will be tracked in two arrays of length n. Whom each man will propose to next will be tracked in an array of integers, indicating what position on m's list he will propose to next. Finally, the preference lists will be implemented as two 2D arrays. However, the structure of how the data is accessed is different in the women's array than in the men's array, in order to keep each operation within the while loop constant time.

Outline of the Gale-Shapley Algorithm Implementation
  • The motivation is that we need to figure out what data structures we should use to implement the Gale-Shapley algorithm in O(n^2) runtime.
  • To start off, the preference lists for each man and each woman will be represented as two arrays, one for women’s preference lists and one for men’s preference lists
  • Each of the following four things (which would be performed inside a while loop that executes O(n^2) times) needs to be performed in constant time in order to maintain an overall O(n^2) runtime for the algorithm
    • Identify a free man m.
    • We need to be able to identify, for m, the highest-ranked woman to whom he has not yet proposed.
    • We need to decide if w is currently engaged; if she is, identify her current partner.
    • For w, m, and m’, we need to decide which of m or m’ is preferred by w.
  • selecting a free man
    • We need to maintain the set of free men as a linked list. We should take the first man m in the list. Lastly, we need to delete m from the list if he becomes engaged and possibly insert a new man m’ if m’ becomes free as a result of the engagement of m.
    • O(1)
  • identify highest ranked woman to whom m has not yet proposed
    • We need to maintain another array that indicates, for each man m, the position of the next woman he will propose to on his preference list. We should initialize each cell to start with the value 1. Once m proposes to w, increment the value in the appropriate cell.
    • O(1)
  • after m proposes to w, identify if w is engaged to an m’
    • We will maintain another array of length n where each position w holds w’s current partner m’. Then, use a null symbol to indicate that a woman is not currently engaged. To initialize the data structure, we need to assign each woman the null symbol, because everyone is free at the start of the algorithm.
    • O(1)
  • Determine which of m’ and m is preferred by w
    • If we structure the women's preference array like the men's, deciding whether w prefers m or m’ takes O(n) time per proposal, which would give O(n^3) runtime overall; that is not good, especially because we know we can do better: we are aiming for O(n^2). Instead, we should create an n x n array where each position contains the rank of man m in the sorted order of w’s preferences. The initial time investment to build this array is O(n) per woman, O(n^2) in total. To decide which of m’ and m is preferred by w, compare the values at the positions of m and m’ in w’s row.
    • O(1)
  • Implementing the algorithm with these data structures allows the Gale-Shapley algorithm to run in O(n^2) time overall (see the sketch below).
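A minimal Python sketch of this implementation (the function and variable names are my own, not the book's), combining the free-men list, the next-proposal array, the current-partner array, and the inverted ranking array described above:

  from collections import deque

  def stable_matching(men_prefs, women_prefs):
      # men_prefs[m]: m's list of women, most preferred first
      # women_prefs[w]: w's list of men, most preferred first
      n = len(men_prefs)
      free_men = deque(range(n))          # the list of free men
      next_proposal = [0] * n             # position of the next woman on m's list
      current_partner = [None] * n        # current_partner[w] = m, or None if w is free
      rank = [[0] * n for _ in range(n)]  # rank[w][m] = position of m in w's list
      for w in range(n):
          for pos, m in enumerate(women_prefs[w]):
              rank[w][m] = pos            # invert w's preference list once, O(n^2) total
      while free_men:                     # the loop executes at most n^2 times
          m = free_men[0]                 # identify a free man
          w = men_prefs[m][next_proposal[m]]
          next_proposal[m] += 1           # highest-ranked woman not yet proposed to
          m2 = current_partner[w]
          if m2 is None:                  # w is free, so (m, w) become engaged
              current_partner[w] = m
              free_men.popleft()
          elif rank[w][m] < rank[w][m2]:  # w prefers m; m2 becomes free again
              current_partner[w] = m
              free_men.popleft()
              free_men.append(m2)
          # otherwise w rejects m, and m stays at the front of the free list
      return current_partner              # current_partner[w] is w's final partner

Every operation inside the while loop is constant time, so the whole run is O(n^2).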
Arrays vs. Lists
  • Arrays
    • Access at the ith element: O(1)
    • Search for element e in an array: O(n)
    • Search for element e in a sorted array: O(logn) via binary search
    • An array is not as good for dynamically maintaining a list of elements that changes over time (resizing results in O(n) for adding elements if the array is full)
  • Linked Lists
    • Doubly linked lists make the list traversable in both directions
    • Lists are good for maintaining a dynamically changing set
    • Access to the ith element: O(i)
    • Insertion and Deletion of element e: O(1)
    • Search for element e in a linked list: O(n)
  • It is easy to convert between the array and list representations in O(n) time
Questions
  • If we consider the problem that we had on Problem Set #1, how would allowing ties change our data structures? Would the women's preference array merely have to allow two men to have equal rankings? Would the men's array have to be a 3D array (an array of arrays of arrays) to accommodate ties?
  • Is there a way to implement the Gale-Shapley algorithm in O(n^2) runtime if we use objects to represent the men and women, with the data of preference lists, etc. within the object?
Additional Information

After discussing the implementation of the algorithm in class with the various data structures, I had a hard time comprehending the women's preference list implementation because it was backwards from the implementation of the men's preference list. However, after reading the text, it became clear that you merely need to compare the position of man m to the position of man m' in w's row of the women's preference array, which makes the step of determining whether w prefers m or m' constant time, O(1). The women's preference array is merely the inverse of the men's preference array.

The section overall, I would give a solid 9 out of 10 on the readability scale. The progression from refreshing the reader on lists and arrays to using them to implement the Gale-Shapley algorithm was extremely helpful and definitely made the implementation section of the reading easy to comprehend. Additionally, the implementation section of the reading was broken up into four parts, which were the main actions within the while loop, organized in the order that the steps would be visited. The only reason the section is not a 10 is because I had to re-read the women's preference list description twice to be able to fully comprehend its implementation.

2.4 Common Runtimes

Summary
Notes

Most problems have a natural “search space”. A unifying theme in algorithm design is the search for algorithms whose performance is more efficient than a brute-force enumeration of the search space. When we get a new problem, we have to think about two different bounds: the runtime we hope to achieve and the size of the problem’s natural search space (the runtime of a brute-force search algorithm).

Linear Time

  • Process the input in a single pass, spending a constant amount of time on each item of input encountered.
  • Constant work per element yields a runtime of O(n).
  • Example: computing the maximum in a set of numbers, constant work per element
  • Example: merging two sorted lists. If you look at the top card on each stack, you know that the smaller of these two should go first in the output pile; so you remove that card, place it on the output, and iterate on what’s left.
  • People typically describe linear-time algorithms as doing constant work per element, but for merging that is not quite accurate: it is more that each iteration involves a constant amount of work, and there are at most a linear number of iterations, so the total runtime is O(n) (a sketch of the merge follows).
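A minimal Python sketch of the merge step, assuming the two input lists are already sorted (the name merge is mine):

  def merge(a, b):
      # Repeatedly move the smaller front element to the output list: O(n) total.
      result = []
      i = j = 0
      while i < len(a) and j < len(b):
          if a[i] <= b[j]:
              result.append(a[i]); i += 1
          else:
              result.append(b[j]); j += 1
      result.extend(a[i:])    # one of the two lists may have elements left over
      result.extend(b[j:])
      return result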

O(nlogn) Time

  • O(nlogn) is the runtime of any algorithm that splits its input into two equal-sized pieces, solves each piece recursively, and then combines the two solutions in linear time.
  • Sorting: The merge sort algorithm divides the set of input numbers into two equal-sized pieces, sorts each half recursively, and then merges the two sorted halves into a single sorted output list (sketched below).
  • O(nlogn) algorithms are so common because, typically, the most expensive step is to sort the input.
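A sketch of merge sort in Python, reusing the merge function from the linear-time example above:

  def merge_sort(arr):
      # Split in half, sort each half recursively, merge in linear time: O(nlogn).
      if len(arr) <= 1:
          return arr
      mid = len(arr) // 2
      return merge(merge_sort(arr[:mid]), merge_sort(arr[mid:]))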

Quadratic Time

  • Quadratic time arises naturally from a pair of nested loops: an algorithm consists of a loop with O(n) iterations, and each iteration of the loop launches an inner loop that takes O(n) time. Multiplying these together gives O(n^2) runtime.
  • Example: given n points in the plane, find the pair that is closest together (a brute-force sketch follows)
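A sketch of the brute-force quadratic version in Python (closest_pair is my name for it; the point here is the nested-loop structure, and faster algorithms for this problem exist):

  from math import dist   # Euclidean distance between two points (Python 3.8+)

  def closest_pair(points):
      # Two nested loops over all pairs of points: O(n^2) comparisons.
      best = None
      for i in range(len(points)):
          for j in range(i + 1, len(points)):
              d = dist(points[i], points[j])
              if best is None or d < best[0]:
                  best = (d, points[i], points[j])
      return best   # (distance, point1, point2), or None if fewer than 2 points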

Cubic Time

  • More elaborate nested loops lead to algorithms that run in O(n^3) time.
  • Example: we want to know whether some pair of the sets S1, S2, …, Sn, each of size at most n, is disjoint
    • Each of the sets has a maximum size of n, so the innermost loop takes time O(n). Looping over the sets Sj involves O(n) iterations around this innermost loop, and looping over the sets Si involves O(n) iterations around that. Multiply these three factors of n together, and you get O(n^3) (a sketch follows).
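A sketch in Python, assuming each set is given as a Python set so a membership test is O(1) on average (with plain lists, the innermost test would itself cost O(n)):

  def find_disjoint_pair(sets):
      # Two outer loops over pairs of sets, plus an O(n) inner scan: O(n^3).
      for i in range(len(sets)):
          for j in range(i + 1, len(sets)):
              if not any(x in sets[j] for x in sets[i]):
                  return (i, j)   # indices of a disjoint pair
      return None                 # every pair of sets intersects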

O(n^k) Time

  • We get a runtime of O(n^k) for any constant k when we search over all subsets of size k.
  • Example: finding independent sets in a graph. The natural brute-force algorithm enumerates all subsets of k nodes, and for each subset S it checks whether there is an edge joining any two members of S. The total runtime is O(k^2 n^k), but because k is a constant we can drop that factor, so the total runtime is O(n^k) (a sketch follows).
  • The total number of k-element subsets of an n-element set is C(n, k): n “choose” k
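A sketch of the brute-force O(n^k) search in Python (has_independent_set is my name for it):

  from itertools import combinations

  def has_independent_set(nodes, edges, k):
      # Enumerate all C(n, k) subsets of k nodes; for each one, check all
      # O(k^2) pairs inside it. Total O(k^2 * n^k), which is O(n^k) for fixed k.
      edge_set = {frozenset(e) for e in edges}
      for subset in combinations(nodes, k):
          if all(frozenset(pair) not in edge_set
                 for pair in combinations(subset, 2)):
              return subset   # no edge joins any two members: independent
      return None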

Beyond Polynomial Time

  • Two types of bounds come up frequently: 2^n and n!
  • 2^n arises naturally as the runtime of search algorithms that must consider all subsets of an n-element set
  • 2^n is the natural size of the search space for many problems, and for many of them we can find efficient polynomial-time algorithms; sometimes you can bypass the brute-force search and sometimes you can’t
  • Search spaces of size n! come about for two reasons: n! is the number of ways to match up n items with n other items (despite this, we were able to solve the Stable Matching Problem in O(n^2)), and it is also the number of ways to arrange n items in order (as in the Traveling Salesman Problem)

Sublinear Time

  • Smaller than linear
  • These situations happen when the input can be “queried” indirectly rather than read completely. The goal is to minimize the amount of querying that must be done.
  • Example: binary search
    • We shrink the size of the active region by a factor of two with every probe, so after k probes it has size at most (1/2)^k * n.
    • When k = log2(n), the size of the active region has been reduced to a constant; the algorithm bottoms out and we can search the remainder of the array directly in constant time.
    • Binary search is O(logn) because of this successive shrinking of the search region (a sketch follows).
  • O(logn) happens when an algorithm does a constant amount of work to throw away a constant fraction of the input. O(logn) such iterations suffice to shrink the input down to constant size, at which point the problem can be solved directly.
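A minimal Python sketch of binary search on a sorted array:

  def binary_search(arr, target):
      # Each probe halves the active region, so O(logn) probes suffice.
      lo, hi = 0, len(arr) - 1
      while lo <= hi:
          mid = (lo + hi) // 2
          if arr[mid] == target:
              return mid
          elif arr[mid] < target:
              lo = mid + 1    # discard the lower half of the active region
          else:
              hi = mid - 1    # discard the upper half of the active region
      return -1               # target is not in the array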
Questions
Additional Information

In class, we didn't really talk about O(2^n) or O(n!) in too much depth. It was really interesting and helpful to see that in the reading. Until doing the reading, I wasn't quite sure why O(2^n) was such a common or important runtime, but now it is obvious why: it is the natural runtime of a brute-force search algorithm, which we try to improve upon in all of the algorithms that we create in this class. Moreover, the way a factorial runtime comes about was more explicitly explained in the reading than in class. Although we would hope never to have an algorithm that runs this slowly, it is still important to know how it comes about. Factorial runtime arises as the number of ways to match up n items with n other items, or when the search space consists of all ways to arrange n items in order.

This section was a 9 out of 10 in readability because the runtimes were explained very clearly. The only thing that got a little confusing was some of the examples. However, overall, the language was very precise and the common runtimes were explained in great detail.

2.5 Priority Queues

Summary
Notes

The main goal in the book is to seek algorithms that improve qualitatively on brute-force search, and we use polynomial-time solvability as the concrete formulation of this. Once we have an efficient algorithm to solve a problem, it can be possible to find further improvements by being mindful of the implementation, and by possibly using more complex data structures. The priority queue is one of the most broadly useful sophisticated data structures.

The Problem

  • The priority queue is designed for applications in which elements have a priority value (key), and each time we select an element from the set, we take the element with the highest priority.
  • Priority Queue
    • Maintains a set of elements S, where each element v has an associated value key(v) that denotes the priority of the element v
    • Smaller keys represent higher priorities
    • Support addition and deletion of elements, and the selection of an element
  • Each of the processes has a priority, but processes do not arrive in order of their priorities. We have a current set of active processes, and we want to extract the one with the highest priority to run.
  • We want to be able to add, delete, and select the element with the smallest key in O(logn)
  • A sequence of O(n) priority queue operations can be used to sort a set of n numbers.
    • Create a priority queue and insert each number with its value as the key. Extract the smallest number one by one until all numbers have been extracted.
    • With a priority queue that can perform insertion and extraction of the element with the minimum key in O(logn), we can sort n numbers in O(nlogn) time (a sketch using Python's built-in heap follows).
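A quick sketch of this sorting idea using Python's built-in heapq module (a binary heap, like the data structure described in the rest of this section):

  import heapq

  def pq_sort(numbers):
      # n insertions and n extract-mins, each O(logn): O(nlogn) in total.
      heap = []
      for x in numbers:
          heapq.heappush(heap, x)   # insert, using the number itself as its key
      return [heapq.heappop(heap) for _ in range(len(heap))]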

A Data Structure for Implementing a Priority Queue

  • We will use a heap to implement the priority queue.
  • Using an unsorted list with a pointer to the minimum to implement the priority queue results in O(n) extraction of the minimum.
  • Using a sorted list with a pointer to the minimum to implement the priority queue results in O(n) addition of an element.
  • The heap combines the benefits of a sorted array and list. A heap is a balanced binary tree. The tree will have a root, and each node can have up to two children, a left and a right child.
  • Elements in a heap are said to be in heap order if the key of any element is at least as large as the key of the element at its parent node in the tree
  • Heap order: For every element v at a node i, the element w at i’s parent satisfies key(w) <= key(v)
  • We can use pointers: each node of the heap can keep the element it stores, its key, and three pointers pointing to the two children and the parent of the heap node
  • Or we can completely avoid pointers by using an array H indexed i = 1, …, N, if a bound N is known in advance. H[1] is the root, and for any node at position i, its children are at positions leftChild = 2i and rightChild = 2i+1

Implementing Heap Operations

  • The heap element with the smallest key is at the root, so it takes O(1) to identify the minimal element.
  • To add a new element we will use Heapify-up.

Heapify-Up

 Heapify-up(H, i):
    If i>1 then
       Let j = parent(i) = ⌊i/2⌋
       If key[H[i]] < key[H[j]] then
          Swap the array entries H[i] and H[j]
          Heapify-up(H, j)
       Endif
    Endif
  • Heapify-up fixes the heap property in O(logn) time, assuming the array H is almost a heap with the key of H[i] too small. Using Heapify-up we can insert a new element into a heap of n elements in O(logn) time
  • By applying Heapify-up(H, j) recursively, we will produce a heap as required. This process follows the tree-path from position i to the root, so it takes O(logn) time
  • ExtractMin(H) requires the definition of another, more general process, Delete(H, i), which deletes the element at position i
  • After deleting the element at position i there will only be n-1 elements, and a hole at H[i], so we will need to employ the process Heapify-down to fill the hole and maintain the heap order

Heapify-Down

 Heapify-down(H, i):
    Let n = length(H)
    If 2i > n then
       Terminate with H unchanged
    Else if 2i < n then
       Let left = 2i, and right = 2i+1
       Let j be the index that minimizes key[H[left]] and key[H[right]]
    Else if 2i = n then
       Let j = 2i
    Endif
    If key[H[j]] < key[H[i]] then
       Swap the array entries H[i] and H[j]
       Heapify-down(H, j)
    Endif
  • The process Heapify-down(H, i) fixes the heap property in O(logn) time, assuming that H is almost a heap with the key value of H[i] too big. Using Heapify-up or Heapify-down, we can delete an element from a heap of n elements in O(logn) time
  • Swapping the elements w and v takes O(1) time
  • The algorithm repeatedly swaps the element originally at position i down a tree-path, so after O(logn) iterations the process results in a heap

Implementing Priority Queues with Heaps

  • StartHeap(N): O(N)
  • Insert(H, v): O(logn)
  • FindMin(H): O(1)
  • Delete(H, i): O(logn)
  • ExtractMin(H): a combination of FindMin and Delete; O(logn)
  • Delete(H, Position[v]): delete the element v; O(logn)
  • ChangeKey(H, v, a): changes the key value of element v to key(v) = a; O(logn)
    • Would need to apply Heapify-up or Heapify-down as a result of the change in key (a sketch combining these operations follows)
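A minimal Python sketch of a heap-based priority queue tying these operations together (the class and method names are mine; it stores bare keys and omits the Position[v] bookkeeping and ChangeKey for brevity):

  class HeapPQ:
      def __init__(self):
          self.h = [None]              # index 0 unused: root at h[1], children at 2i and 2i+1

      def insert(self, key):           # O(logn)
          self.h.append(key)
          self._heapify_up(len(self.h) - 1)

      def find_min(self):              # O(1): the smallest key sits at the root
          return self.h[1]             # assumes the heap is non-empty

      def extract_min(self):           # FindMin + Delete: O(logn)
          root = self.h[1]
          self.h[1] = self.h[-1]       # move the last element into the hole
          self.h.pop()
          if len(self.h) > 1:
              self._heapify_down(1)
          return root

      def _heapify_up(self, i):
          # The key at position i may be too small: swap it up toward the root.
          while i > 1 and self.h[i] < self.h[i // 2]:
              self.h[i], self.h[i // 2] = self.h[i // 2], self.h[i]
              i //= 2

      def _heapify_down(self, i):
          # The key at position i may be too big: swap it down a tree-path.
          n = len(self.h) - 1
          while 2 * i <= n:
              j = 2 * i                # left child
              if j + 1 <= n and self.h[j + 1] < self.h[j]:
                  j += 1               # right child has the smaller key
              if self.h[i] <= self.h[j]:
                  break                # heap order restored
              self.h[i], self.h[j] = self.h[j], self.h[i]
              i = j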
Questions
Additional Information

After going to class and having Professor Sprinkle really break down the components of Heapify-down, what exactly was happening in that algorithm became much clearer. She really emphasized that the three different if/else-if statements were for the cases of how many children a node has. “If 2i > n” meant that an element had no children. “Else if 2i < n” meant that an element had two children. Finally, “Else if 2i = n” meant that an element had one child. The reason for having Heapify-down also became more clear after going to class. It is now apparent that Heapify-down is there to fill the hole that deleting an element creates, whilst maintaining the heap order.

This section's readability was an 8 out of 10. The progression from the definitions of priority queues and heaps to the ultimate explanation was very precise and natural. The only reason that it wasn't a 10 is because it took me a while to fully grasp the concepts of Heapify-up and Heapify-down. Ultimately, the section used very clear language and precise explanations of priority queues and heaps, and it included a lot of definitions and explanations of properties of different data structures, which were super helpful in going back over my notes.
