====== Chapter 4 ======

My notes on Chapter 4 readings

===== Intro & 4.1: Interval Scheduling =====

  * A greedy algorithm builds a solution out of a sequence of locally optimal choices, hoping to arrive at a globally optimal one.
  * We can show optimality by showing that greedy "stays ahead" of every other solution, or by an exchange argument: transform an optimal solution into the greedy solution without ever making it worse.
  * We can find a greedy algorithm to solve interval scheduling problems:
    * Take all requests and, while there still exist requests to be processed, choose a request W with the smallest finishing time, add it to our schedule, then delete from the requests all requests incompatible with W.
    * We can complete this in O(n log n) time (a Python sketch is at the end of these notes).
  * We can extend this to interval partitioning, in which the number of resources needed is at least the //depth// of the set of intervals - that is, the maximum number of intervals overlapping at any single point.
    * We can solve this by sorting intervals by start time, then for each interval, either placing it in the first resource available for it, or allotting a new resource if none is available. This uses exactly //depth// resources, so it's optimal.
  * This section wasn't very interesting to read. 6/10.

===== 4.2: Minimizing Lateness =====

  * Say we want to schedule jobs, each with a length and a deadline, on a single resource so that the maximum lateness of any job is minimized.
  * We can't do what we did for interval scheduling, because there we just fit in as much as we could, and here we //have to do everything//.
  * We can't do jobs in increasing order of length, or of slack time. Instead, we order by deadline (Earliest Deadline First)!
  * We prove this by exchange argument: any optimal schedule that isn't the greedy schedule must have idle time or inversions (jobs with later deadlines scheduled before jobs with earlier ones). We eliminate the idle time and un-invert the inversions, neither of which increases the maximum lateness, and we're left with the greedy schedule.
  * Not that interesting a section, kind of overly wordy. 5/10.

===== 4.4: Shortest Paths =====

  * We want to find the shortest path in a directed graph with nonnegative edge lengths, either from some node //s// to some other node //t// or from //s// to every other node.
  * We can use Dijkstra's algorithm to do this (sketched in code right after this section):
    * Select the next unexplored node //v// as the one with the lowest-cost distance from //s// by way of the already-explored nodes.
    * Add //v// to the explored set with that distance.
  * Using a priority queue, Dijkstra's algorithm runs in O(m log n) time for m edges and n nodes.
  * 6/10.
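To make 4.4 concrete, here is a minimal Python sketch of the Dijkstra loop above, using ''heapq'' as the priority queue. The adjacency-list format, the function name, and the example graph are my own assumptions, not the book's; this is the lazy-deletion variant, which still fits the O(m log n) bound.

<code python>
import heapq

def dijkstra(graph, s):
    """Shortest distances from s. graph maps node -> list of (neighbor, length)."""
    dist = {s: 0}
    pq = [(0, s)]  # priority queue of (distance estimate, node)
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist[u]:
            continue  # stale entry: u was already finalized with a shorter distance
        for v, length in graph.get(u, []):
            nd = d + length  # edge lengths must be nonnegative
            if nd < dist.get(v, float("inf")):
                dist[v] = nd  # found a shorter path to v through u
                heapq.heappush(pq, (nd, v))
    return dist

# Example: distances from 'a' in a small directed graph.
graph = {"a": [("b", 4), ("c", 1)], "c": [("b", 2)], "b": []}
print(dijkstra(graph, "a"))  # {'a': 0, 'b': 3, 'c': 1}
</code>

The ''d > dist[u]'' check is what stands in for a DecreaseKey operation: outdated heap entries are simply ignored when they surface.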
===== 4.5: Minimum Spanning Trees =====

  * A minimum spanning tree of a graph G is a subset T of G's edges such that (V, T) is connected as cheaply as possible.
  * Such a subset will necessarily be a tree: if it contained a cycle, we could delete any edge of the cycle and stay connected at lower cost.
  * Three algorithms to build a minimum spanning tree consist of doing the following:
    * Adding edges to the MST in increasing order of cost, provided the next cheapest edge does not form a cycle,
    * Beginning at a vertex //s// and repeatedly adding the cheapest edge with exactly one endpoint in the tree grown so far,
    * Beginning with the original G and, in decreasing order of edge cost, deleting edges provided the current edge to be deleted will not disconnect the graph.
  * These three algorithms are Kruskal's, Prim's, and the Reverse-Delete algorithms, respectively.
  * We can implement Prim's algorithm in O(m log n) time using a priority queue.
  * This section wasn't interesting. 4/10.

===== 4.6: Kruskal's Algorithm =====

  * We can implement Kruskal's algorithm using a pointer-based Union-Find structure that supports creation, Find, and Union operations on disjoint sets (an array-based sketch is at the end of these notes).
  * With this structure we can run Kruskal's algorithm in O(m log n) time.
  * Not that interesting, 5/10.

===== 4.7: Clustering =====

  * We can use clustering when we want to group objects based on their similarity.
  * To find a k-clustering with maximum spacing (objects in different groups are as far apart as possible), we run Kruskal's algorithm but stop before adding the last k-1 edges, where k is the desired number of clusters; the connected components at that point are the clusters.
  * I had a phone interview with a clustering question and I had no idea what to do. This section is a 10/10 because Epic pays really well and asks about clustering in interviews.

===== 4.8: Huffman Codes & Data Compression =====

  * The theme of this section is taking an encoding problem and solving it recursively.
  * We apply data compression to encoding symbols in bits, specifically by finding prefix codes: 1:1 mappings of letters to bit strings in which no codeword is a prefix of any other.
  * We can represent prefix codes by using binary trees:
    * A binary tree whose number of leaves is the size of our alphabet, with the leaves labeled by the distinct letters of the alphabet.
    * To encode a letter s, we follow the path from the root to the leaf labeled s, writing a 0 each time we go to a left child and a 1 each time we go to a right child.
  * Huffman's algorithm takes this a step further: if the alphabet S has more than two letters, we form an alphabet S' excluding the two lowest-frequency letters, replacing them with a new letter whose frequency is their sum. We recursively construct a prefix code //c// for S', then define a prefix code for S by splitting the new letter's leaf into two children labeled with the two excluded letters.
  * This algorithm uses the minimum possible number of bits per letter over all prefix codes, and runs in O(k^2) time for an alphabet of size k. If we use priority queues, we get O(k log k) time (sketched below).
  * I know this should have been interesting, but since I missed class I'm just basing my opinion on the book, which was not interesting. 4/10.
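Since I mostly skimmed 4.8, here is a sketch of the priority-queue version of Huffman's algorithm to convince myself the merging step works. The function name, the dict input, and the (left, right) tuple representation of internal tree nodes are my own choices, and the example frequencies are arbitrary.

<code python>
import heapq
from itertools import count

def huffman(freqs):
    """Build a prefix code from {letter: frequency}; returns {letter: bit string}."""
    order = count()  # tiebreaker so the heap never compares two tree nodes directly
    heap = [(f, next(order), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, a = heapq.heappop(heap)  # the two lowest-frequency "letters"
        f2, _, b = heapq.heappop(heap)
        # Merge them into a meta-letter whose frequency is the sum of the two.
        heapq.heappush(heap, (f1 + f2, next(order), (a, b)))
    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: (left subtree, right subtree)
            walk(node[0], prefix + "0")  # going left appends a 0
            walk(node[1], prefix + "1")  # going right appends a 1
        else:
            codes[node] = prefix or "0"  # a one-letter alphabet still needs a bit
    walk(heap[0][2], "")
    return codes

print(huffman({"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}))
# {'c': '00', 'e': '010', 'd': '011', 'b': '10', 'a': '11'}
</code>

Each merge is two heappops and a heappush on a heap of at most k entries, which is where the O(k log k) bound comes from.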
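Going back to 4.1, the earliest-finish-time greedy also fits in a few lines. A sketch under my own assumptions: requests are (start, finish) tuples, and a request starting exactly when the previous one finishes counts as compatible.

<code python>
def schedule(requests):
    """Greedy interval scheduling: a maximum set of mutually compatible requests."""
    chosen = []
    last_finish = float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):  # earliest finish first
        if start >= last_finish:  # compatible with everything chosen so far
            chosen.append((start, finish))
            last_finish = finish
    return chosen

print(schedule([(0, 6), (1, 4), (3, 5), (5, 7), (8, 9)]))  # [(1, 4), (5, 7), (8, 9)]
</code>

The sort dominates the running time, which is where the O(n log n) bound in the 4.1 notes comes from.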
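And for 4.6: the book's Union-Find is pointer-based, but an array-based version with union by rank and path compression is easier to sketch. The node labels 0..n-1, the (cost, u, v) edge format, and the function names here are my assumptions.

<code python>
def kruskal(n, edges):
    """MST of a connected graph on nodes 0..n-1; edges is a list of (cost, u, v)."""
    parent = list(range(n))
    rank = [0] * n

    def find(x):  # follow parent pointers to the root, halving paths as we go
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(a, b):  # merge two components; False if they were already one
        ra, rb = find(a), find(b)
        if ra == rb:
            return False
        if rank[ra] < rank[rb]:
            ra, rb = rb, ra
        parent[rb] = ra  # hang the shallower tree under the deeper one
        if rank[ra] == rank[rb]:
            rank[ra] += 1
        return True

    mst = []
    for cost, u, v in sorted(edges):  # increasing order of cost
        if union(u, v):  # the edge connects two components, so no cycle forms
            mst.append((cost, u, v))
    return mst

print(kruskal(4, [(1, 0, 1), (2, 1, 2), (3, 0, 2), (4, 2, 3)]))
# [(1, 0, 1), (2, 1, 2), (4, 2, 3)]
</code>

Stopping the loop once only k components remain (i.e., skipping the last k-1 merges) gives exactly the maximum-spacing k-clustering from 4.7.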