We have to think about how much time and how much space an algorithm will use as the size of its input(s) increases.
Many problems involve searching through a large number of possibilities, and the algorithm is supposed to find a solution more quickly than checking every theoretical possibility. A basic definition of efficiency is as follows: “An algorithm is efficient if, when implemented, it runs quickly on real input instances.” This obviously leads to questions like how fast is “quickly” and how do we define “real input instances”, especially if the data set could be scaled up significantly. Worst-case running time is the longest possible running time of an algorithm and generally provides a good reference for its efficiency, even though some specific algorithms almost never reach their worst case in practice. An algorithm runs in polynomial time if doubling the input size increases the running time by at most a constant factor.
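To make that constant-factor claim concrete, here is a small sketch (my own toy example, not from the book) that counts the comparisons a quadratic-time, brute-force pass over all pairs would make as the input size doubles:

```python
def count_pair_comparisons(n):
    """Count the comparisons made by a brute-force check over all pairs of n items."""
    steps = 0
    for i in range(n):
        for j in range(i + 1, n):
            steps += 1  # one comparison per pair: n*(n-1)/2 in total
    return steps

for n in (1000, 2000, 4000):
    print(n, count_pair_comparisons(n))
# Each doubling of n multiplies the step count by roughly 4 -- a constant
# factor, which is exactly the behavior that characterizes polynomial time.
```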
The worst-case performance of an algorithm can be expressed as a function of the size of its input. The algorithms in the book are written in pseudo-code, so when we count the number of steps we are counting abstract operations rather than actual CPU instructions, which I think would be architecture-dependent. An upper bound on the running time of an algorithm is a constant multiple of its highest-order term (or of an even higher-order term). A lower bound on the running time is a constant multiple of its highest-order term or of a lower-order term. A tight bound occurs when the same function works as both an upper and a lower bound, which happens when we choose the highest-order term itself, rather than a higher-order term (as for an upper bound) or a lower-order term (as for a lower bound).
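Written out formally, these are the standard O, Ω, and Θ definitions (stated here for reference, with a small worked example):

```latex
\begin{align*}
T(n) \text{ is } O(f(n))      &\iff \exists\, c > 0,\ n_0 \ge 0 \text{ such that } T(n) \le c\, f(n) \text{ for all } n \ge n_0,\\
T(n) \text{ is } \Omega(f(n)) &\iff \exists\, c > 0,\ n_0 \ge 0 \text{ such that } T(n) \ge c\, f(n) \text{ for all } n \ge n_0,\\
T(n) \text{ is } \Theta(f(n)) &\iff T(n) \text{ is both } O(f(n)) \text{ and } \Omega(f(n)).
\end{align*}
% Example: T(n) = 3n^2 + 10n is O(n^2) (take c = 13, n_0 = 1), \Omega(n^2) (take c = 3),
% and therefore \Theta(n^2).
```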
The implementation of an algorithm depends on the choice of data structures. Arrays have fast query times: it is O(1) to find the ith element of an array. However, searching for a particular element in an array is O(N) if the array is not sorted. Adding or deleting an item in the middle of a linked list is O(1) if we already hold a reference to the node we want to insert next to or remove. Looking up the ith element of a linked list is O(N) because we have to traverse the list.
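A minimal sketch (my own toy classes, not from the book) of the trade-off just described:

```python
class Node:
    """A node in a doubly linked list."""
    def __init__(self, value):
        self.value = value
        self.prev = None
        self.next = None

def insert_after(node, value):
    """O(1): splice a new node in right after `node` -- no traversal needed."""
    new = Node(value)
    new.prev, new.next = node, node.next
    if node.next is not None:
        node.next.prev = new
    node.next = new
    return new

def ith_node(head, i):
    """O(N): reaching the ith element means walking the list from the head."""
    node = head
    for _ in range(i):
        node = node.next
    return node

head = Node(1)
insert_after(head, 3)
insert_after(head, 2)              # list is now 1 -> 2 -> 3
print(ith_node(head, 2).value)     # 3, found by traversal

# Arrays (Python lists) are the mirror image:
arr = [10, 20, 30, 40]
x = arr[2]                         # O(1) indexed lookup
found = 35 in arr                  # O(N) search when the array is unsorted
```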
The book gives a good implementation for the Stable Matching Problem. We can have two arrays, one for all the men and one for all the women. Each person's preference list is stored in an array where the ith element is that person's ith preference. All free men are stored in a linked list, so getting a free man, adding a free man, and deleting a man are all O(1). There is also a “next” array, indexed by each man's position in the array of all men, that records the next woman he would propose to; a man's entry is incremented each time he proposes to that numbered woman. Another array holds the proposal status of each woman, i.e., the man she is currently engaged to, if any. To make the comparison of rankings easy for the women, we can use a 2-D array so that deciding which of two men a woman prefers takes only two lookups. A sketch of this implementation appears below.
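Here is a sketch in Python of that implementation. It follows the data structures described above, but the variable names, and the use of a deque to stand in for the linked list of free men, are my own choices:

```python
from collections import deque

def stable_matching(man_pref, woman_pref):
    """man_pref[m] and woman_pref[w] are lists of indices, most preferred first."""
    n = len(man_pref)
    free_men = deque(range(n))      # O(1) to take, add, or remove a free man
    next_proposal = [0] * n         # next_proposal[m]: index into man_pref[m]
    current = [None] * n            # current[w]: man woman w is engaged to, or None

    # ranking[w][m] = position of man m on woman w's list, so comparing two
    # suitors is just two O(1) lookups.
    ranking = [[0] * n for _ in range(n)]
    for w in range(n):
        for pos, m in enumerate(woman_pref[w]):
            ranking[w][m] = pos

    while free_men:
        m = free_men.popleft()
        w = man_pref[m][next_proposal[m]]    # highest-ranked woman not yet proposed to
        next_proposal[m] += 1
        if current[w] is None:
            current[w] = m                   # w was free: she accepts
        elif ranking[w][m] < ranking[w][current[w]]:
            free_men.append(current[w])      # w trades up; her old partner is free again
            current[w] = m
        else:
            free_men.append(m)               # w rejects m; he stays free
    return current                           # current[w] is the man matched to woman w

# Example: both men prefer woman 0, and both women prefer man 1.
# stable_matching([[0, 1], [0, 1]], [[1, 0], [1, 0]]) returns [1, 0].
```

Each man proposes at most N times and every proposal is handled with O(1) work, which is what keeps the whole algorithm at O(N^2).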
I give this section a readability of 8/10.
The search space contains all possible solutions to a problem, and a brute-force search simply looks through all of these solutions to find one that matches the problem's requirements. We try to use algorithms that are more efficient than a brute-force search. Merging two sorted lists into one final sorted list is O(N). Sorting problems are often O(N logN). Quadratic-time algorithms are often the result of nested loops, and cubic-time algorithms are similar but have another loop; with k nested loops we get algorithms that are O(N^k). For example, looking at all subsets of size k of an N-element set is O(N^k). If instead of a set of fixed size k we are looking for the largest independent set, the brute-force algorithm becomes O(N^2 * 2^N). Algorithms that take less than O(N) time but more than constant time can arise when the input does not all need to be read, such as binary search on a sorted array, which is O(logN).
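As a small illustration of the linear-time merge step mentioned above (my own minimal version, not the book's pseudo-code):

```python
def merge(a, b):
    """Merge two already-sorted lists into one sorted list in O(N) time."""
    merged = []
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] <= b[j]:
            merged.append(a[i])
            i += 1
        else:
            merged.append(b[j])
            j += 1
    merged.extend(a[i:])   # at most one of these two has anything left over
    merged.extend(b[j:])
    return merged

print(merge([1, 4, 9], [2, 3, 10]))  # [1, 2, 3, 4, 9, 10]
```

Each element of the two input lists is examined once and appended once, which is why the merge is linear in the total number of elements.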
I give this section a readability of 5/10.
An efficient algorithm can be improved by using a different kind of data structure. In a priority queue, every element has an associated key or value, and the lower the value, the higher the priority. Operating systems use priority queues to determine which process gets to use the CPU. In a priority queue, an element can be added, deleted, or extracted as the highest-priority item in O(logN) time. Priority queues can be implemented using a heap, which is visually like a “balanced” binary tree but is stored in memory as an array. The key of every element must be at least as large as its parent's key, so the parent always has priority at least as high as its children. With 1-based indexing, a node at array position i has its left child at position 2i and its right child at position 2i+1. Adding a new element to the heap is O(logN) because we append it to the end of the array and then Heapify-up. This implies we keep track of the number of elements currently in our heap, or we would need to iterate through the array every time. The highest-priority item is simply the first element in the array, but to remove it, we fill its gap with the last element in the array and then Heapify-down that element to restore the heap property, which takes O(logN) time.
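Below is a sketch of that array-based heap (my own minimal version; note it uses 0-based Python indexing, so the children of position i sit at 2i+1 and 2i+2 instead of 2i and 2i+1):

```python
class MinHeap:
    """Array-backed min-heap: smaller key = higher priority."""

    def __init__(self):
        self.items = []               # the array; its length tracks the heap size

    def add(self, key):
        """O(logN): append at the end, then Heapify-up."""
        self.items.append(key)
        self._heapify_up(len(self.items) - 1)

    def extract_min(self):
        """O(logN): take the root, fill the gap with the last element, Heapify-down."""
        top = self.items[0]
        last = self.items.pop()
        if self.items:
            self.items[0] = last
            self._heapify_down(0)
        return top

    def _heapify_up(self, i):
        while i > 0:
            parent = (i - 1) // 2
            if self.items[i] >= self.items[parent]:
                break                 # heap property already holds
            self.items[i], self.items[parent] = self.items[parent], self.items[i]
            i = parent

    def _heapify_down(self, i):
        n = len(self.items)
        while True:
            left, right, smallest = 2 * i + 1, 2 * i + 2, i
            if left < n and self.items[left] < self.items[smallest]:
                smallest = left
            if right < n and self.items[right] < self.items[smallest]:
                smallest = right
            if smallest == i:
                break                 # both children are larger; heap restored
            self.items[i], self.items[smallest] = self.items[smallest], self.items[i]
            i = smallest

# h = MinHeap(); h.add(5); h.add(1); h.add(4)
# h.extract_min() returns 1, and the heap property is restored afterwards.
```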
I give this section a readability of 7/10.