| Next revision | Previous revision | ||
| courses:cs211:winter2018:journals:hornsbym:chapter_4 [2018/02/27 02:46] – created hornsbym | courses:cs211:winter2018:journals:hornsbym:chapter_4 [2018/03/12 02:59] (current) – [Section 4.8 (Huffman Codes)] hornsbym | ||
|---|---|---|---|
| Line 1: | Line 1: | ||
| ====== Chapter 4 ====== | ====== Chapter 4 ====== | ||
| - | ===== Section 4.1 (Interval Scheduling) ==== | + | ===== Section 4.1 (Interval Scheduling) ===== |
| - | + | This section deals with the Interval Scheduling problem, and implements a greedy algorithm to solve it.\\ | |
| - | ===== Section 4.2 ===== | + | \\ |
| - | + | The interval scheduling problem deals with scheduling job requests. We want to schedule as many jobs as possible, but to do so we need to pick only jobs that do not overlap. Each given job has a beginning and end time.\\ | |
| - | ===== Section 4.4 ===== | + | \\ |
| - | + | The solution to this problem is deceptively simple: we start by accepting the request with the earliest end time. Then, we delete all requests that conflict with the accepted request. Next, we accept the remaining request with the earliest end time. We repeat this process until there are no more requests to either delete or accept; the accepted jobs form an optimal set. It is important to note that this algorithm works because all jobs have the same value, so our only concern is accepting the largest number of jobs.\\ | |
| + | \\ | ||
| + | This section then goes on to prove that this algorithm is optimal. First, it proves by induction that no two accepted jobs overlap (// | ||
| + | \\ | ||
| + | This section also describes the Interval Partitioning problem, which is similar to the Interval Scheduling problem. This algorithm mostly operates under the same processes described above, except when two jobs overlap, the conflicting jobs are put onto different " | ||
| + | ===== Section 4.2 (Minimize Lateness) ===== | ||
| + | For this problem, we have a single resource at our disposal (like in the Interval Scheduling problem). Each job has a deadline instead of fixed start and end times, and requires a certain amount of uninterrupted time to complete. Therefore, the jobs must be scheduled in non-overlapping intervals. We try to minimize the maximum lateness, that is, the largest amount of time by which any job finishes past its deadline.\\ | ||
| + | \\ | ||
| + | To solve this problem with a greedy algorithm, we simply sort the jobs in increasing order of their deadlines and schedule them in that order with no gaps. The intuition is to make sure jobs with earlier deadlines get completed sooner than jobs with later deadlines. \\ | ||
| + | \\ | ||
| + | This section proves that the greedy algorithm is optimal in four steps. First, we prove that there is an optimal schedule with no idle time. Next, we prove that all schedules with no inversions (an inversion is a job with an earlier deadline being scheduled AFTER a job with a later one) and no idle time have the same maximum lateness. We prove this directly: such schedules can differ only in the ordering of jobs with the same deadline, and that ordering does not affect the maximum lateness. Then, we prove that there is an OPTIMAL schedule with no inversions and no idle time. We do this with an exchange argument: swapping an adjacent pair of inverted jobs never increases the maximum lateness. To conclude the section, we prove that the schedule produced by our greedy algorithm has the optimal maximum lateness. This proof builds off the proofs we completed above: the greedy schedule has no inversions and no idle time, there is an optimal schedule with the same two properties, and all such schedules have the same maximum lateness, so the schedule our algorithm returns MUST be optimal. | ||
| + | ===== Section 4.4 (Shortest Path) ===== | ||
| + | This section uses greedy algorithms to solve the Shortest Path problem. We are given a starting node //s//, and a graph of nodes and edges. Each edge of this graph carries a value designating the cost of traversing that edge. Our goal is to travel from //s// to another given node //t// at the lowest total cost.\\ | ||
| + | \\ | ||
| + | Edsger Dijkstra proposed a greedy solution to the Shortest Path problem in 1959. Basically, this algorithm finds the shortest path from //s// to each other node in the graph, not just //t//. Then, it is easy to determine the overall least-costly route from //s// to //t//. \\ | ||
| + | \\ | ||
| + | The specific algorithm is as follows: first, we find the costs of all edges connected to //s//. Then, we construct a priority queue of the nodes those edges reach, ordered from least costly path to most costly. Next, we repeatedly remove the highest-priority node from the queue and explore all edges connected to it, all the while keeping track of the cost to get to each individual node. As we find less costly paths, we update the cost to get to that node. After all shortest paths have been found, we can easily return the shortest path from //s// to any end node //t//.\\ | ||
| + | \\ | ||
| + | By using a heap-based priority queue, this algorithm can run in O(//m// log //n//) time. | ||
| + | ===== Section 4.5 (Minimum Spanning Tree) ===== | ||
| + | This section deals with minimum spanning trees and the various algorithms that can produce one.\\ | ||
| + | \\ | ||
| + | The first algorithm is called // | ||
| + | \\ | ||
| + | The second algorithm is called // | ||
| + | \\ | ||
| + | The final algorithm resembles // | ||
| + | \\ | ||
| + | The section then proves correctness and optimality of the algorithms, which can be found on pages 144-149.\\ | ||
| + | \\ | ||
| + | The section concludes with the implementation and runtime of //Kruskal// and // | ||
| + | ===== Section 4.6 (Union-Find Structure) ===== | ||
| + | This section describes the Union-Find structure and its uses. It is well suited to implementing algorithms similar to // | ||
| + | \\ | ||
| + | The Union-Find structure does exactly what its name implies: union and find operations. It does this by keeping track of disjoint sets (or graphs, in the case of // | ||
| + | \\ | ||
| + | An array-based implementation can be used to run the find operation in O(1) runtime and union in O(n) runtime. This is efficient enough to use on // | ||
| + | \\ | ||
| + | ===== Section 4.7 (Clustering) ===== | ||
| + | ===== Section 4.8 (Huffman Codes) ===== | ||
| + | Huffman codes compress data. Computers store and transmit information as bits, that is, sequences of 0's and 1's. Each piece of information (for example, a letter) is assigned its own bit sequence, so the problem is to pick the assignment in which the most frequently used information gets the shortest sequences. Huffman codes do exactly that.\\ | ||
| + | \\ | ||
| + | Huffman codes use trees to organize and locate data. All data are stored in the leaf nodes of the tree, with the internal nodes being empty. As the computer reads each bit, it moves down the tree until it lands on a leaf node. Then, it returns that datum and starts again from the root with the next bit. Each 0 or 1 corresponds to moving to either the left or right child node; because data sit only at the leaves, no encoding is a prefix of another, so no two data share the same encoding.\\ | ||
| + | \\ | ||
| + | What makes this encoding efficient is that the leaves of the tree are not all at the same depth. The most common data are placed near the top of the tree, so that they are reached after reading fewer bits. The book uses commonly used letters as an example. It would not make sense for the letter ' | ||
| + | \\ | ||
| + | If we use a heap priority queue to implement Huffman' | ||
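
The tree-building idea from Section 4.8 can be sketched with a heap-based priority queue. This is only an illustrative sketch, not the book's pseudocode: the function name ''huffman_codes'', the use of single characters as symbols, and the tie-breaking counter are all assumptions made for the example.

```python
import heapq
from collections import Counter
from itertools import count


def huffman_codes(text):
    """Return a dict mapping each symbol in `text` to a bit string.

    Hypothetical helper: symbols are assumed to be single characters,
    so any tuple found in the tree is an internal node.
    """
    freq = Counter(text)
    if not freq:
        return {}

    # The counter breaks ties so the heap never has to compare trees.
    tiebreak = count()
    # Heap entries are (frequency, tiebreak, tree). A tree is either a
    # symbol (leaf) or a (left, right) pair (internal node).
    heap = [(f, next(tiebreak), sym) for sym, f in freq.items()]
    heapq.heapify(heap)

    # A single distinct symbol still needs one bit.
    if len(heap) == 1:
        return {heap[0][2]: "0"}

    # Repeatedly merge the two least frequent trees into one.
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, next(tiebreak), (t1, t2)))

    codes = {}

    def walk(tree, prefix):
        # Left edges append "0", right edges append "1".
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")
            walk(tree[1], prefix + "1")
        else:
            codes[tree] = prefix

    walk(heap[0][2], "")
    return codes
```

For example, calling ''huffman_codes("aaaabbc")'' gives the most frequent letter ''a'' a one-bit code and the rarer letters ''b'' and ''c'' two-bit codes. Which sibling gets 0 and which gets 1 is an arbitrary choice in this sketch.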
