Chapter 4
My notes on Chapter 4 readings
Intro & 4.1: Interval Scheduling
- A greedy algorithm builds a solution through a sequence of locally optimal choices, in the hope that they add up to a globally optimal solution.
- We can show optimality by showing that greedy "stays ahead" of any other solution, or by an exchange argument: gradually transform an optimal solution into the greedy one without ever making it worse.
- We can find a greedy algorithm to solve the interval scheduling problem.
- We solve this by taking all requests and, while there still exist requests to be processed, choosing the request W with the smallest finishing time, adding it to our schedule, then deleting all requests incompatible with W. (First sketch below.)
- We can complete this in O(n log n) time.
- We can extend this to fulfill interval partitioning, in which the number of resources needed is at least the depth of the set of intervals - that is, the maximum number of intervals overlapping at any single point in time.
- We can solve this (and match the depth exactly) by sorting intervals by start time, then for each interval either placing it on the first resource available for it, or allotting a new resource if none is free. (Second sketch below.)
- This section wasn't very interesting to read. 6/10.
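
Here's a minimal Python sketch of the earliest-finish-time rule above; representing requests as (start, finish) tuples is my own convention, not the book's:

```python
def interval_schedule(requests):
    """Return a maximum-size set of mutually compatible requests."""
    schedule = []
    last_finish = float("-inf")
    # Sorting by finish time dominates the cost: O(n log n).
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:      # compatible with everything chosen so far
            schedule.append((start, finish))
            last_finish = finish      # implicitly deletes conflicting requests
    return schedule

# Picks (1, 4), (5, 7), (8, 11): three compatible requests.
print(interval_schedule([(1, 4), (3, 5), (0, 6), (5, 7), (3, 8), (6, 10), (8, 11)]))
```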
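And a sketch of the interval-partitioning rule. Keeping a min-heap of finish times for the busy resources is my implementation choice (it makes "first resource available" an O(log d) check), not the book's pseudocode:

```python
import heapq

def partition_intervals(intervals):
    """Return the number of resources (the depth) needed to schedule all intervals."""
    finish_heap = []  # finish times of the currently busy resources
    for start, finish in sorted(intervals):   # process in order of start time
        if finish_heap and finish_heap[0] <= start:
            heapq.heapreplace(finish_heap, finish)  # reuse the first free resource
        else:
            heapq.heappush(finish_heap, finish)     # allot a new resource
    return len(finish_heap)

# Three intervals overlap at time 2.5, so the depth (and the answer) is 3.
print(partition_intervals([(0, 3), (1, 4), (2, 5), (3, 6), (4, 7)]))
```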
4.2: Minimizing Lateness
- Say we want to schedule jobs, each with a length and a deadline, on a single resource so that the maximum lateness of any job is minimized.
- We can't reuse the interval scheduling approach, because in that problem we just did as many requests as we could, and here we have to do every job.
- We can't do jobs in increasing order of length, or of slack time - both fail on small counterexamples. Instead, we order by earliest deadline! (Sketch below.)
- We prove this by exchange argument: any optimal schedule with no idle time that isn't the greedy schedule must have an inversion - a job with a later deadline scheduled before one with an earlier deadline. Swapping the two jobs in an inversion never increases the maximum lateness, so after removing idle time and un-inverting we arrive at the greedy schedule.
- Not that interesting a section, kind of overly wordy. 5/10.
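
A minimal sketch of earliest-deadline-first. Representing jobs as (length, deadline) pairs and starting the schedule at time 0 are my conventions:

```python
def min_max_lateness(jobs):
    """Schedule jobs by earliest deadline; return the maximum lateness."""
    time, max_late = 0, 0
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        time += length                             # run the job to completion
        max_late = max(max_late, time - deadline)  # lateness is 0 if we finish early
    return max_late

# Deadlines 2, 4, 6: EDF finishes the jobs at times 1, 3, 6 -> max lateness 0.
print(min_max_lateness([(3, 6), (2, 4), (1, 2)]))
```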
4.4: Shortest Paths
- We want to find the shortest path in a graph, either from some node s to some other node t or from s to every other node.
- We can use Dijkstra's algorithm to do this for directed graphs with nonnegative edge lengths:
- Explore nodes by selecting the next (unvisited) node v with the lowest-cost distance from s by way of an already-explored node.
- Add v to the explored set with that distance, and repeat.
- Using a priority queue, Dijkstra's algorithm runs in O(m log n) time for m edges and n nodes. (Sketch below.)
- 6/10.
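
A sketch of Dijkstra with a heap; the adjacency-dict graph format is my own convention, and it assumes nonnegative edge lengths:

```python
import heapq

def dijkstra(graph, s):
    """graph: {node: [(neighbor, length), ...]}. Return shortest distances from s."""
    dist = {}
    heap = [(0, s)]
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:            # already explored via a shorter path; skip
            continue
        dist[u] = d              # d is final: all other heap entries are >= d
        for v, length in graph.get(u, []):
            if v not in dist:
                heapq.heappush(heap, (d + length, v))
    return dist

g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("t", 6)], "b": [("t", 3)], "t": []}
print(dijkstra(g, "s"))  # {'s': 0, 'a': 1, 'b': 3, 't': 6}
```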
4.5: Minimum Spanning Trees
- A minimum spanning tree of a connected graph G is a subset T of G's edges such that (V, T) is connected and the total edge cost is as small as possible.
- Such a subset will necessarily be a tree: if it contained a cycle, we could delete a cycle edge, keeping the graph connected while lowering the cost.
- Three algorithms to build a minimum spanning tree consist of doing the following:
- Adding edges to the MST in increasing order of cost, provided the next cheapest edge does not form a cycle,
- Beginning at a node s, repeatedly adding the cheapest edge with exactly one endpoint in the tree built so far,
- Beginning with the original G and, in decreasing order of edge cost, deleting edges provided deleting the current edge does not disconnect the graph.
- These three algorithms are Kruskal's, Prim's, and the Reverse-Delete algorithms, respectively.
- We can implement Prim's algorithm in O(m log n) time. (Sketch below.)
- This section wasn't interesting. 4/10.
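
A heap-based sketch of Prim's algorithm, giving the O(m log n) bound above; the adjacency-dict format for an undirected graph (each edge listed in both directions) is my own convention:

```python
import heapq

def prim(graph, s):
    """graph: {node: [(neighbor, cost), ...]} (undirected). Return MST edges."""
    in_tree, mst = {s}, []
    heap = [(cost, s, v) for v, cost in graph[s]]
    heapq.heapify(heap)
    while heap:
        cost, u, v = heapq.heappop(heap)
        if v in in_tree:                 # this edge would close a cycle
            continue
        in_tree.add(v)
        mst.append((u, v, cost))         # cheapest edge leaving the tree
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(heap, (c, v, w))
    return mst

g = {"a": [("b", 1), ("c", 4)], "b": [("a", 1), ("c", 2)], "c": [("a", 4), ("b", 2)]}
print(prim(g, "a"))  # [('a', 'b', 1), ('b', 'c', 2)]
```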
4.6: Kruskal's Algorithm
- We can implement Kruskal using a pointer-based Union-Find structure that supports creation (MakeUnionFind), Find, and Union operations on disjoint sets.
- With this structure we can run Kruskal in O(m log n) time. (Sketch below.)
- Not that interesting, 5/10
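
A sketch of Kruskal with Union-Find. The book's structure is pointer-based; this dict-based version with union by size and path compression is a common stand-in with the same O(m log n) total, and the (cost, u, v) edge format is my convention:

```python
class UnionFind:
    def __init__(self, nodes):
        self.parent = {v: v for v in nodes}
        self.size = {v: 1 for v in nodes}

    def find(self, v):
        while self.parent[v] != v:
            self.parent[v] = self.parent[self.parent[v]]  # path compression
            v = self.parent[v]
        return v

    def union(self, u, v):
        ru, rv = self.find(u), self.find(v)
        if self.size[ru] < self.size[rv]:
            ru, rv = rv, ru
        self.parent[rv] = ru            # hang the smaller tree under the larger
        self.size[ru] += self.size[rv]

def kruskal(nodes, edges):
    """edges: [(cost, u, v), ...]. Return MST edges in order of addition."""
    uf, mst = UnionFind(nodes), []
    for cost, u, v in sorted(edges):    # increasing order of cost
        if uf.find(u) != uf.find(v):    # adding (u, v) does not form a cycle
            uf.union(u, v)
            mst.append((u, v, cost))
    return mst

print(kruskal(["a", "b", "c"], [(1, "a", "b"), (2, "b", "c"), (4, "a", "c")]))
# [('a', 'b', 1), ('b', 'c', 2)]
```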
4.7: Clustering
- We can use clustering when we want to group objects based on their similarity.
- In order to find a k-clustering with maximum spacing (objects in different clusters are as far apart as possible), we run Kruskal but stop just before it adds the last k-1 edges, where k is the desired number of clusters; equivalently, build the full MST and delete its k-1 most expensive edges. The resulting connected components are the clusters. (Sketch below.)
- I had a phone interview with a clustering question and I had no idea what to do. This section is a 10/10 because Epic pays really well and asks about clustering in interviews.
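
A sketch of max-spacing k-clustering (single-linkage clustering is just Kruskal stopped at k components). It assumes the UnionFind class from the 4.6 sketch above is in scope, and uses the same (distance, u, v) edge convention:

```python
def k_clusters(nodes, edges, k):
    """Merge the closest pairs (Kruskal order) until k clusters remain;
    return the spacing, i.e. the next distance Kruskal would have used."""
    uf = UnionFind(list(nodes))
    components = len(nodes)
    for dist, u, v in sorted(edges):
        if uf.find(u) != uf.find(v):    # edge crosses two current clusters
            if components == k:
                return dist             # spacing of the k-clustering
            uf.union(u, v)
            components -= 1

points = ["p1", "p2", "p3", "p4"]
dists = [(1, "p1", "p2"), (2, "p3", "p4"), (9, "p2", "p3"), (10, "p1", "p4")]
print(k_clusters(points, dists, 2))  # 9: clusters {p1, p2} and {p3, p4}
```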
4.8: Huffman Codes & Data Compression
- Huffman's encoding problem is a nice example of taking a problem and making it recursive: we reduce it to a smaller instance of itself.
- We can apply data compression to encoding symbols in bits; a prefix code maps each letter 1:1 to a bit string such that no codeword is a prefix of any other.
- We can represent prefix codes by using binary trees:
- Take a binary tree whose number of leaves equals the size of our alphabet, and label each leaf with a distinct letter of the alphabet.
- To encode a letter s, we follow the path from the root to the leaf labeled s, and each time we go to a left child we write a 0, and for each right child we write a 1.
- Huffman's algorithm takes this a step further: if the alphabet S has more than two letters, we form an alphabet S' by removing the two lowest-frequency letters and replacing them with a single new letter whose frequency is their sum. We recursively construct a prefix code c for S', then define a prefix code for S by expanding the new letter's leaf in the tree for c into an internal node whose children are the two removed letters.
- This algorithm uses the minimum possible number of bits per letter for a prefix code, and can run in O(k^2) time. If we use priority queues, we can get O(k log k) time. (Sketch below.)
- I know this should have been interesting, but since I missed class I'm just basing my opinion on the book which was not interesting. 4/10.
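
A priority-queue sketch of Huffman (the O(k log k) version). Taking frequencies as a dict, carrying partial code tables instead of explicit trees, and the itertools counter for tie-breaking are all my conventions:

```python
import heapq
from itertools import count

def huffman(freqs):
    """freqs: {letter: frequency}. Return {letter: codeword}."""
    tiebreak = count()  # avoids comparing dicts when frequencies are equal
    heap = [(f, next(tiebreak), {letter: ""}) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        # Merge the two lowest-frequency "letters" into one new letter,
        # prepending 0 to codes in the left subtree and 1 in the right.
        f1, _, left = heapq.heappop(heap)
        f2, _, right = heapq.heappop(heap)
        merged = {x: "0" + code for x, code in left.items()}
        merged.update({x: "1" + code for x, code in right.items()})
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

# Most frequent letter gets the shortest codeword:
# {'a': '0', 'b': '10', 'c': '110', 'd': '111'} (up to 0/1 relabeling).
print(huffman({"a": 0.45, "b": 0.25, "c": 0.15, "d": 0.15}))
```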