====== Chapter 4: Greedy Algorithms ======
===== 4.1 Interval Scheduling: The Greedy Algorithm Stays Ahead =====
In the interval scheduling problem, we consider a set of requests {1, 2, ..., n} where the i-th request corresponds to an interval of time with a start time s(i) and a finish time f(i). A subset of requests is //compatible// if no two of its intervals overlap, and our goal is to accept as large a compatible subset as possible. The greedy rule that works is to always choose the available request that finishes first:
  Initially let R be the set of all requests, and let A be empty
  While R is not yet empty
    Choose a request i ∈ R that has the smallest finishing time
    Add request i to A
    Delete all requests from R that are not compatible with request i
  Endwhile
  Return the set A as the set of accepted requests
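The earliest-finish-time rule above can be sketched in a few lines of Python. This is a minimal version assuming each request is a (start, finish) tuple; sorting by finish time stands in for the repeated deletions from R:

```python
def interval_schedule(requests):
    """Greedy interval scheduling: accept the request with the earliest
    finish time, then skip everything that overlaps it."""
    accepted = []
    # Scanning in order of finish time is equivalent to repeatedly
    # choosing the smallest finishing time from the remaining set R.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        # A request is compatible iff it starts no earlier than the
        # finish time of the last accepted request.
        if not accepted or start >= accepted[-1][1]:
            accepted.append((start, finish))
    return accepted
```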
| + | |||
| + | We want to prove that this solution is optimal, and we know that //A// is a compatible set of requests. To prove that greedy "stays ahead" we will set //O// to be the optimal set of intervals, and we want to show that //A// contains the same number of intervals as //O//. If we let // | ||
| + | |||
There are many variations and extensions of the Interval Scheduling Problem, for example the //Interval Partitioning Problem//: given many identical resources, we must schedule all the requests using as few resources as possible. Defining the //depth// //d// of a set of intervals as the maximum number of intervals passing over any single point in time, the greedy algorithm below uses exactly //d// labels (resources), which is optimal since at least //d// are always needed:
  Sort the intervals by their start times, breaking ties arbitrarily
  Let I1, I2, ..., In denote the intervals in this order
  For j = 1, 2, 3, ..., n
    For each interval Ii that precedes Ij in sorted order and overlaps it
      Exclude the label of Ii from consideration for Ij
    Endfor
    If there is any label in {1, 2, ..., d} that has not been excluded then
      Assign a nonexcluded label to Ij
    Else
      Leave Ij unlabeled
    Endif
  Endfor
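An equivalent way to implement the labeling, assuming intervals are distinct (start, finish) tuples, is to reuse whichever resource frees up earliest. This is my own heap-based variant rather than the book's label-exclusion loop, but it assigns the same number of labels (the depth d):

```python
import heapq

def interval_partition(intervals):
    """Assign each interval a resource label; the number of labels used
    equals the depth of the interval set. Returns (labels, label_count)."""
    label_of = {}
    busy = []            # min-heap of (finish_time, label) for busy resources
    next_label = 0
    for start, finish in sorted(intervals):      # sort by start time
        if busy and busy[0][0] <= start:
            # The earliest-finishing resource is already free: reuse it.
            _, label = heapq.heappop(busy)
        else:
            label = next_label                   # open a new resource
            next_label += 1
        label_of[(start, finish)] = label
        heapq.heappush(busy, (finish, label))
    return label_of, next_label
```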
| + | |||
| + | This section was very clear and understandable since we have covered it in class already. We followed the logic that the book followed so it was clear what was going on while I was reading. I would rate this section a 9 for readability. | ||
| + | |||
| + | |||
| + | ===== 4.2 Scheduling to Minimize Lateness: An Exchange Argument ===== | ||
Given a single resource and a set of //n// requests to use the resource, each for a length of time t<sub>i</sub> and each with a deadline d<sub>i</sub>, we want to schedule the jobs so as to minimize the //maximum lateness//, where a job finishing at time f(i) has lateness max(0, f(i) − d<sub>i</sub>). This problem differs from those in 4.1 since we must choose a start and finish time for every request rather than accept or reject some of them. The optimal greedy rule is //Earliest Deadline First//: process the jobs in increasing order of their deadlines. Here is the algorithm:
  Order the jobs by their deadlines
  Assume d1 ≤ d2 ≤ ... ≤ dn
  Initially, f = s
  Consider the jobs i = 1, 2, ..., n in this order
    Assign job i to the time interval from s(i) = f to f(i) = f + ti
    Let f = f + ti
  End
  Return the set of scheduled intervals [s(i), f(i)] for i = 1, 2, ..., n
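A short Python sketch of Earliest Deadline First, assuming each job is a (duration, deadline) tuple and the resource becomes available at time s:

```python
def edf_schedule(jobs, s=0):
    """jobs: list of (t_i, d_i). Schedule back to back in order of
    increasing deadline; return the schedule and the maximum lateness."""
    schedule = []                    # (start, finish, deadline) per job
    f = s
    for t, d in sorted(jobs, key=lambda job: job[1]):   # Earliest Deadline First
        schedule.append((f, f + t, d))
        f += t                       # no idle time: next job starts immediately
    max_lateness = max(max(0, finish - d) for _, finish, d in schedule)
    return schedule, max_lateness
```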
| + | |||
| + | We would like to prove that this algorithm produces an optimal schedule. Rather than use a proof that greedy stays ahead, we will modify //O//, the optimal solution, while preserving its optimality, to make it identical to our solution //A//. This is called the //exchange argument//. To do this, we must declare that //A'// has an // | ||
| + | - If O has an inversion, then there is a pair of jobs i and j such that j is scheduled immediately after i and has dj < di | ||
| + | - After swapping i and j we get a schedule with one less inversion | ||
| + | - The new swapped schedule has a maximum lateness no larger than that of O | ||
| + | Therefore, since A and O are schedules with no idle time and no inversions, the schedule we got from the greedy algorithm is optimal. | ||
| + | |||
| + | This section was confusing for me. I understand what we are trying to argue (that A is optimal) and how we are trying to argue it (by adjusting O so that it is the same as A) but I struggle to understand how we can claim that if a schedule has no idle time or inversions then it is optimal. I do not think that the book explained this section very well, though it is clearer to me now than it was during class. I would rate it a 6.5. | ||
| + | |||
| + | ===== 4.4 Shortest Paths in a Graph ===== | ||
We can apply greedy techniques to graphs in order to find shortest paths. Given a directed graph with a nonnegative length ℓ<sub>e</sub> on each edge e, it can be helpful to know the shortest path between two given nodes, or from a given node s to all other nodes. Dijkstra's Algorithm solves this single-source problem greedily, maintaining for each explored node u the shortest-path distance d(u):
  Dijkstra's Algorithm (G, ℓ)
    Let S be the set of explored nodes
    For each u ∈ S, store a distance d(u)
    Initially S = {s} and d(s) = 0
    While S ≠ V
      Select a node v ∉ S with at least one edge from S for which
        d'(v) = min of d(u) + ℓ(u, v) over edges (u, v) with u ∈ S is as small as possible
      Add v to S and define d(v) = d'(v)
    Endwhile
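A compact priority-queue implementation in Python, assuming the graph is an adjacency dict `{u: [(v, length), ...]}` with nonnegative lengths (stale heap entries are skipped rather than using an explicit ChangeKey):

```python
import heapq

def dijkstra(graph, s):
    """Return d(u), the shortest-path distance from s, for every
    node u reachable from s."""
    d = {}                           # final distances; keys form the set S
    pq = [(0, s)]                    # heap of tentative d'(v) values
    while pq:
        dv, v = heapq.heappop(pq)
        if v in d:                   # stale entry: v already added to S
            continue
        d[v] = dv                    # smallest d'(v), so it is final
        for w, length in graph.get(v, []):
            if w not in d:
                heapq.heappush(pq, (dv + length, w))
    return d
```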
We can use the "greedy stays ahead" method to prove the correctness of this algorithm: each time a node v is added to S, d(v) is already the true shortest-path distance. If implemented using a priority queue, Dijkstra's Algorithm runs in O(m) time, plus the time for n ExtractMin and m ChangeKey operations, which gives O(m log n) overall with a heap-based priority queue.

| + | ===== 4.5 The Minimum Spanning Tree Problem ===== | ||
Say we aim to build a connected communication network over a set of locations //V// = {//v//<sub>1</sub>, ..., //v//<sub>n</sub>}, where building the link between //v//<sub>i</sub> and //v//<sub>j</sub> costs c<sub>ij</sub> > 0. We want a subset of links that keeps the network connected at minimum total cost; any minimum-cost solution must be a tree, called a //minimum spanning tree//.

There are three greedy algorithms which can be used to solve this problem: Kruskal's Algorithm (insert edges in order of increasing cost, skipping any edge that would create a cycle), Prim's Algorithm (grow a tree outward from a root s, repeatedly adding the cheapest edge with exactly one end in the current tree), and the Reverse-Delete Algorithm (delete edges in order of decreasing cost, skipping any edge whose removal would disconnect the graph). The key fact behind the first two is the //Cut Property//:
  Assume that all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v, w) be the minimum-cost edge with one end in S and the other in V − S. Then every minimum spanning tree contains the edge e.

We can use the Cut Property to prove that both Kruskal's and Prim's Algorithms produce minimum spanning trees. A complementary fact, the //Cycle Property//, identifies edges that can safely be excluded:
  Assume that all edge costs are distinct. Let C be any cycle in G, and let edge e = (v, w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.

The Cycle Property is used to prove the optimality of the Reverse-Delete Algorithm. Using a priority queue, Prim's Algorithm can be implemented to run in O(m) time, plus the time for n ExtractMin and m ChangeKey operations. I understood this section alright. I would have liked a simpler explanation of how the Cut and Cycle Properties are used to show these things, because the way the book tried to explain this was unclear. I would rate it a 6.
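Here is a sketch of Prim's Algorithm in Python, assuming an undirected adjacency dict `{u: [(v, cost), ...]}` with every edge listed from both endpoints; like the Dijkstra sketch it skips stale heap entries instead of using ChangeKey:

```python
import heapq

def prim_mst(graph, root):
    """Grow a tree from root, always taking the cheapest edge leaving
    the current tree (safe by the Cut Property).
    Returns (total cost, list of MST edges)."""
    in_tree = {root}
    mst_edges = []
    total = 0
    pq = [(c, root, v) for v, c in graph[root]]   # candidate crossing edges
    heapq.heapify(pq)
    while pq and len(in_tree) < len(graph):
        c, u, v = heapq.heappop(pq)
        if v in in_tree:             # edge no longer crosses the cut: skip
            continue
        in_tree.add(v)
        mst_edges.append((u, v, c))
        total += c
        for w, cw in graph[v]:
            if w not in in_tree:
                heapq.heappush(pq, (cw, v, w))
    return total, mst_edges
```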
| + | |||
| + | ===== 4.6 Implementing Kruskal' | ||
| + | We would like to consider a graph evolving through the addition of edges. To maintain a set of connected components as we add to the graph, without having to recompute this information each time, we can implement the **Union-Find** data structure. This structure stores a representation of the components in such a way to support efficient searching and updating. The operation Find(// | ||
| + | |||
| + | The simplest way to implement a Union-Find data structure is using an array which will contain the name of the set currently containing each element. Using this implementation, | ||
| + | |||
| + | To improve upon this implementation, | ||
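The pointer-based structure described above can be sketched as a small Python class (union by size plus path compression; the method names mirror the book's Find and Union operations):

```python
class UnionFind:
    """Pointer-based Union-Find with union by size and path compression."""

    def __init__(self, elements):
        # MakeUnionFind(S): every element starts in its own set.
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, u):
        # Follow pointers to the set's name, halving the path as we go.
        while self.parent[u] != u:
            self.parent[u] = self.parent[self.parent[u]]
            u = self.parent[u]
        return u

    def union(self, a, b):
        # Merge the sets containing a and b; the larger set keeps its name.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True
```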
| + | |||
| + | ===== 4.7 Clustering ===== | ||
| + | Minimum spanning trees play a role in // | ||
| + | The components C1, C2, ..., Ck formed by deleting the k -1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing. | ||
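Equivalently, we can run Kruskal's Algorithm and simply stop once k components remain; the next edge that would cross two clusters gives the spacing. A sketch, assuming a list of points and a distance function (the inline Union-Find is deliberately minimal):

```python
def max_spacing(points, dist, k):
    """Single-link clustering via Kruskal's Algorithm: merge closest
    pairs until k clusters remain, then return the spacing (the smallest
    distance between points in different clusters)."""
    parent = {p: p for p in points}

    def find(u):                     # tiny Union-Find, no compression
        while parent[u] != u:
            u = parent[u]
        return u

    pairs = sorted((dist(a, b), a, b)
                   for i, a in enumerate(points) for b in points[i + 1:])
    clusters = len(points)
    for d, a, b in pairs:
        ra, rb = find(a), find(b)
        if ra == rb:
            continue                 # same cluster: does not affect spacing
        if clusters == k:
            return d                 # first inter-cluster distance = spacing
        parent[rb] = ra              # merge: one fewer cluster
        clusters -= 1
    return None                      # fewer than k points, or k <= 1
```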
| + | |||
| + | This section was also a bit challenging to understand. We started to discuss this in class. I would have liked more visuals in the book to guide me through the maximum clustering problem. I would rate it a 6. | ||
| + | |||
| + | ===== 4.8 Huffman Codes and Data Compression ===== | ||
| + | For this problem (data compression), | ||
| + | |||
| + | Another thing to consider when creating an optimal encoding of symbols is how frequently certain letters appear in everyday language. It is preferable to assign these letters a smaller number of bits to minimize our use of bits. To do this, we assign a frequency to every letter, with the total frequency of all letters summing to 1. | ||
| + | |||
| + | To construct a greedy algorithm for this problem, we will use binary trees. This ensures that our encoding constructed from the binary tree T will be a prefix code. In addition, the binary tree which corresponds to the //optimal// prefix code is full. Here is the algorithm, // | ||
  To construct a prefix code for an alphabet S, with given frequencies:
    If S has two letters then
      Encode one letter using 0 and the other using 1
    Else
      Let y* and z* be the two lowest-frequency letters
      Form a new alphabet S' by deleting y* and z* and replacing them
        with a new letter ω of frequency fy* + fz*
      Recursively construct a prefix code γ' for S', with tree T'
      Define a prefix code for S as follows:
        Start with T'
        Take the leaf labeled ω and add two children below it labeled y* and z*
    Endif

When implemented using a priority queue, Huffman's Algorithm runs in O(k log k) time, where k is the number of letters in the alphabet.
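The recursive merging above is usually implemented bottom-up with a heap: repeatedly pop the two lowest-frequency subtrees and push their merge. A Python sketch (the counter is just a tie-breaker so the heap never compares tree nodes):

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """freqs: {letter: frequency}. Build the Huffman tree by repeatedly
    merging the two lowest-frequency subtrees; return {letter: codeword}."""
    tiebreak = count()
    heap = [(f, next(tiebreak), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    if len(heap) == 1:                   # degenerate one-letter alphabet
        return {heap[0][2]: "0"}
    while len(heap) > 1:
        fy, _, y = heapq.heappop(heap)   # the two lowest-frequency "letters"
        fz, _, z = heapq.heappop(heap)
        # Merged subtree plays the role of the new letter ω of frequency fy + fz.
        heapq.heappush(heap, (fy + fz, next(tiebreak), (y, z)))

    code = {}
    def assign(node, prefix):            # walk the tree: left edge 0, right edge 1
        if isinstance(node, tuple):
            assign(node[0], prefix + "0")
            assign(node[1], prefix + "1")
        else:
            code[node] = prefix
    assign(heap[0][2], "")
    return code
```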
