====== Chapter 4 ======

My notes on Chapter 4 readings

===== Intro & 4.1: Interval Scheduling =====

  * A greedy algorithm builds a solution out of a sequence of locally optimal choices, hoping to arrive at a globally optimal one.
  * We can show optimality by showing that greedy "stays ahead" of every other solution, or by an exchange argument: transform an optimal solution into the greedy solution without ever making it worse.
  * We can find a greedy algorithm to solve interval scheduling problems:
    * Take all requests and, while there still exist requests to be processed, choose a request W with the smallest finishing time, add it to our schedule, then delete from the requests all requests incompatible with W.
    * We can complete this in O(n log n) time (a Python sketch is at the end of these notes).
  * We can extend this to interval partitioning, in which the number of resources needed is at least the //depth// of the set of intervals - that is, the maximum number of intervals overlapping at any single point.
    * We can solve this by sorting intervals by start time, then for each interval, either placing it in the first resource available for it, or allotting a new resource if none is available. This uses exactly //depth// resources, so it's optimal.
  * This section wasn't very interesting to read. 6/10.

===== 4.2: Minimizing Lateness =====

  * Say we want to schedule jobs, each with a length and a deadline, on a single resource so that the maximum lateness of any job is minimized.
  * We can't do what we did for interval scheduling, because there we just fit in as much as we could, and here we //have to do everything//.
  * We can't do jobs in increasing order of length, or of slack time. Instead, we order by deadline (Earliest Deadline First)!
  * We prove this by exchange argument: any optimal schedule that isn't the greedy schedule must have idle time or inversions (jobs with later deadlines scheduled before jobs with earlier ones). We eliminate the idle time and un-invert the inversions, neither of which increases the maximum lateness, and we're left with the greedy schedule.
  * Not that interesting a section, kind of overly wordy. 5/10.

===== 4.4: Shortest Paths =====

  * We want to find the shortest path in a directed graph with nonnegative edge lengths, either from some node //s// to some other node //t// or from //s// to every other node.
  * We can use Dijkstra's algorithm to do this (sketched in code right after this section):
    * Select the next unexplored node //v// as the one with the lowest-cost distance from //s// by way of the already-explored nodes.
    * Add //v// to the explored set with that distance.
  * Using a priority queue, Dijkstra's algorithm runs in O(m log n) time for m edges and n nodes.
  * 6/10.
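To make 4.4 concrete, here is a minimal Python sketch of the Dijkstra loop above, using ''heapq'' as the priority queue. The adjacency-list format, the function name, and the example graph are my own assumptions, not the book's; this is the lazy-deletion variant, which still fits the O(m log n) bound.

<code python>
import heapq

def dijkstra(graph, s):
    """Shortest distances from s. graph maps node -> list of (neighbor, length)."""
    dist = {s: 0}
    pq = [(0, s)]  # priority queue of (distance estimate, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: u was already finalized with a shorter distance
        for v, length in graph.get(u, []):
            nd = d + length  # edge lengths must be nonnegative
            if nd < dist.get(v, float("inf")):
                dist[v] = nd  # found a shorter path to v through u
                heapq.heappush(pq, (nd, v))
    return dist

# Example: distances from 'a' in a small directed graph.
graph = {"a": [("b", 4), ("c", 1)], "c": [("b", 2)], "b": []}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1}
</code>

The ''d > dist[u]'' check is what stands in for a DecreaseKey operation: outdated heap entries are simply ignored when they surface.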
===== 4.5: Minimum Spanning Trees =====

  * A minimum spanning tree of a graph G is a subset T of G's edges such that (V, T) is connected as cheaply as possible.
  * Such a subset will necessarily be a tree: if it contained a cycle, we could delete any edge of the cycle and stay connected at lower cost.
  * Three algorithms to build a minimum spanning tree consist of doing the following:
    * Adding edges to the MST in increasing order of cost, provided the next cheapest edge does not form a cycle,
    * Beginning at a vertex //s// and repeatedly adding the cheapest edge with exactly one endpoint in the tree grown so far,
    * Beginning with the original G and, in decreasing order of edge cost, deleting edges provided the current edge to be deleted will not disconnect the graph.
  * These three algorithms are Kruskal's, Prim's, and the Reverse-Delete algorithms, respectively.
  * We can implement Prim's algorithm in O(m log n) time using a priority queue.
  * This section wasn't interesting. 4/10.

===== 4.6: Kruskal's Algorithm =====

  * We can implement Kruskal's algorithm using a pointer-based Union-Find structure that supports creation, Find, and Union operations on disjoint sets (an array-based sketch is at the end of these notes).
  * With this structure we can run Kruskal's algorithm in O(m log n) time.
  * Not that interesting, 5/10.

===== 4.7: Clustering =====

  * We can use clustering when we want to group objects based on their similarity.
  * To find a k-clustering with maximum spacing (objects in different groups are as far apart as possible), we run Kruskal's algorithm but stop before adding the last k-1 edges, where k is the desired number of clusters; the connected components at that point are the clusters.
  * I had a phone interview with a clustering question and I had no idea what to do. This section is a 10/10 because Epic pays really well and asks about clustering in interviews.

===== 4.8: Huffman Codes & Data Compression =====

  * The theme of this section is taking an encoding problem and solving it recursively.
  * We apply data compression to encoding symbols in bits, specifically by finding prefix codes: 1:1 mappings of letters to bit strings in which no codeword is a prefix of any other.
  * We can represent prefix codes by using binary trees:
    * A binary tree whose number of leaves is the size of our alphabet, with the leaves labeled by the distinct letters of the alphabet.
    * To encode a letter s, we follow the path from the root to the leaf labeled s, writing a 0 each time we go to a left child and a 1 each time we go to a right child.
  * Huffman's algorithm takes this a step further: if the alphabet S has more than two letters, we form an alphabet S' excluding the two lowest-frequency letters, replacing them with a new letter whose frequency is their sum. We recursively construct a prefix code //c// for S', then define a prefix code for S by splitting the new letter's leaf into two children labeled with the two excluded letters.
  * This algorithm uses the minimum possible number of bits per letter over all prefix codes, and runs in O(k^2) time for an alphabet of size k. If we use priority queues, we get O(k log k) time (sketched below).
  * I know this should have been interesting, but since I missed class I'm just basing my opinion on the book, which was not interesting. 4/10.
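Since I mostly skimmed 4.8, here is a sketch of the priority-queue version of Huffman's algorithm to convince myself the merging step works. The function name, the dict input, and the (left, right) tuple representation of internal tree nodes are my own choices, and the example frequencies are arbitrary.

<code python>
import heapq
from itertools import count

def huffman(freqs):
    """Build a prefix code from {letter: frequency}; returns {letter: bit string}."""
    order = count()  # tiebreaker so the heap never compares two tree nodes directly
    heap = [(f, next(order), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)  # the two lowest-frequency "letters"
        f2, _, b = heapq.heappop(heap)
        # Merge them into a meta-letter whose frequency is the sum of the two.
        heapq.heappush(heap, (f1 + f2, next(order), (a, b)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left subtree, right subtree)
            walk(node[0], prefix + "0")  # going left appends a 0
            walk(node[1], prefix + "1")  # going right appends a 1
        else:
            codes[node] = prefix or "0"  # a one-letter alphabet still needs a bit
    walk(heap[0][2], "")
    return codes

print(huffman({"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}))
# {'c': '00', 'e': '010', 'd': '011', 'b': '10', 'a': '11'}
</code>

Each merge is two heappops and a heappush on a heap of at most k entries, which is where the O(k log k) bound comes from.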
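Going back to 4.1, the earliest-finish-time greedy also fits in a few lines. A sketch under my own assumptions: requests are (start, finish) tuples, and a request starting exactly when the previous one finishes counts as compatible.

<code python>
def schedule(requests):
    """Greedy interval scheduling: a maximum set of mutually compatible requests."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):  # earliest finish first
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen

print(schedule([(0, 6), (1, 4), (3, 5), (5, 7), (8, 9)]))  # [(1, 4), (5, 7), (8, 9)]
</code>

The sort dominates the running time, which is where the O(n log n) bound in the 4.1 notes comes from.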
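And for 4.6: the book's Union-Find is pointer-based, but an array-based version with union by rank and path compression is easier to sketch. The node labels 0..n-1, the (cost, u, v) edge format, and the function names here are my assumptions.

<code python>
def kruskal(n, edges):
    """MST of a connected graph on nodes 0..n-1; edges is a list of (cost, u, v)."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):  # follow parent pointers to the root, halving paths as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):  # merge two components; False if they were already one
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # hang the shallower tree under the deeper one
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    mst = []
    for cost, u, v in sorted(edges):  # increasing order of cost
        if union(u, v):  # the edge connects two components, so no cycle forms
            mst.append((cost, u, v))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
# [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
</code>

Stopping the loop once only k components remain (i.e., skipping the last k-1 merges) gives exactly the maximum-spacing k-clustering from 4.7.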