courses:cs211:winter2011:journals:chen:chapter_4 (created 2011/02/16 05:07 by zhongc; current revision 2011/03/02 03:13 by zhongc)
===== 4.1 Interval Scheduling: The Greedy Algorithm Stays Ahead =====
Section Summary:

Problem Motivation:
We have a set of requests {1, 2, ..., n}; the i-th request corresponds to an interval of time starting at s(i) and finishing at f(i).
We say that a subset of the requests is **compatible** if no two of them overlap in time, and our **goal** is to accept as large a compatible subset as possible. Compatible sets of maximum size are called **optimal**.
  * Most natural attempt: accept the request that starts earliest
    * Result: not optimal
    * Counter-example: one long request that starts earliest forces us to reject a lot of requests for shorter time intervals
  * Attempt 2: accept the smallest interval of time first
    * Result: not optimal either

  * Optimal solution: always choose the request that finishes the earliest (among those compatible with the requests accepted so far)
  * Proof of optimality: the greedy solution stays ahead at each step.
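The earliest-finish-time rule can be sketched as follows (a minimal sketch; the function name and the (start, finish) tuple representation are my own, not the text's):

```python
def interval_scheduling(requests):
    """Greedy earliest-finish-time: return a maximum compatible subset.

    requests: list of (start, finish) tuples.
    """
    # Sort by finish time, then repeatedly accept the request that
    # finishes earliest among those compatible with what we already have.
    accepted = []
    last_finish = float("-inf")
    for s, f in sorted(requests, key=lambda r: r[1]):
        if s >= last_finish:          # compatible with everything accepted so far
            accepted.append((s, f))
            last_finish = f
    return accepted

# The long interval (0, 10) would be picked by "earliest start",
# but the greedy keeps the three short, mutually compatible ones.
print(interval_scheduling([(0, 10), (0, 3), (4, 6), (7, 9)]))
# -> [(0, 3), (4, 6), (7, 9)]
```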
+ | |||
+ | **A Related Problem: Scheduling All Intervals** | ||
+ | Interval Partitioning Problem | ||
+ | |||
+ | algorithm on P149 | ||
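The partitioning algorithm referenced above can be sketched like this (my own sketch, not the book's pseudocode: intervals are (start, finish) tuples, resources are integer labels, and intervals that merely touch are treated as compatible):

```python
import heapq

def interval_partitioning(intervals):
    """Greedily label intervals in order of start time, reusing the
    lowest free label; the number of labels used equals the depth.

    Returns ({interval_index: label}, number_of_labels).
    """
    label = {}
    busy = []          # heap of (finish_time, label) for resources in use
    free = []          # heap of labels available for reuse
    next_label = 0
    order = sorted(range(len(intervals)), key=lambda i: intervals[i][0])
    for i in order:
        s, f = intervals[i]
        # Release every resource whose interval ended by this start time.
        while busy and busy[0][0] <= s:
            heapq.heappush(free, heapq.heappop(busy)[1])
        if free:
            lab = heapq.heappop(free)
        else:
            lab = next_label
            next_label += 1
        label[i] = lab
        heapq.heappush(busy, (f, lab))
    return label, next_label

labels, depth = interval_partitioning([(0, 3), (1, 4), (2, 5), (4, 7)])
print(depth)  # -> 3, since three intervals overlap at time 2
```

The depth (maximum number of intervals overlapping any point) is a structural lower bound, and the greedy meets it, which is the structural argument mentioned below.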
+ | |||
+ | Proof Approaches: | ||
+ | Greedy algorithm stays ahead | ||
+ | • Does better than any other algorithm at each step | ||
+ | Exchange argument | ||
+ | • Transform any solution into a greedy solution | ||
+ | Structural Argument | ||
+ | • Figure out some structural bound that all solutions | ||
+ | must meet | ||
+ | |||
+ | |||
+ | Readable/ | ||
===== 4.2 Scheduling to Minimize Lateness: An Exchange Argument =====
The Problem:
We have a single resource and a set of job requests, each with a length and a deadline. We are trying to schedule the jobs to minimize the maximum lateness.

Optimal algorithm:
Earliest Deadline First: sort the jobs in increasing order of deadline and schedule them in that order with no idle time.
+ | |||
+ | The main step in showing the optimality of our algorithm is to establish | ||
+ | that there is an optimal schedule that has no inversions and no idle time. | ||
+ | |||
+ | analysis: | ||
+ | Our plan here is to gradually | ||
+ | modify O while preserving its optimality at each step, but eventually transforming | ||
+ | it into a schedule that is identical to the schedule A found by the greedy | ||
+ | algorithm. We refer to this type of analysis as an **exchange argument** | ||
+ | |||
+ | There is an optimal schedule with no idle time. | ||
+ | All schedules with no inversions and no idle time have the same | ||
+ | maximum lateness. | ||
+ | Def. An **inversion** in schedule S is a pair of jobs i and j such that: | ||
+ | di < dj but j scheduled before i. | ||
+ | |||
+ | Greedy Exchange Proofs on slide 4.2 p7. | ||
+ | |||
+ | Running time: nlongn --- need to sort the deadlines first. | ||
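Earliest Deadline First is short enough to sketch directly (my own sketch; jobs are (length, deadline) tuples and all jobs are assumed available at time 0):

```python
def min_max_lateness(jobs):
    """Earliest Deadline First: run jobs (length, deadline) back to back
    in order of deadline; return the maximum lateness over all jobs.
    """
    t = 0
    max_late = 0
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        t += length                          # this job finishes at time t
        max_late = max(max_late, t - deadline)
    return max_late

# A 3-unit job due at 2 and a 1-unit job due at 1: EDF runs the
# second job first, and the first job finishes at time 4, 2 units late.
print(min_max_lateness([(3, 2), (1, 1)]))  # -> 2
```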
+ | |||
+ | |||
+ | Readable/ | ||
+ | ===== 4.4 Shortest Paths in a Graph ===== | ||
+ | problem: | ||
+ | |||
+ | Given a directed weighted graph, what is the shortest path? | ||
+ | |||
+ | Dijkstra' | ||
+ | The algorithm maintains a set S of vertices u for which we have determined a shortest-path distance d(u) | ||
+ | from s; this is the " | ||
+ | |||
+ | Now, for each node v ~ V-S, we determine the shortest path that can be constructed by traveling along a path through the explored part S to some u ~ $, followed by the single edge (u, v). That is, we consider the quantity d’(v) = mine=(a, | ||
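A heap-based sketch of the algorithm (my own; the adjacency-list dict format `{u: [(v, length), ...]}` is an assumption, not from the text):

```python
import heapq

def dijkstra(graph, s):
    """Shortest-path distances from s, assuming non-negative lengths.

    graph: {u: [(v, length), ...]} adjacency lists.
    """
    dist = {s: 0}
    pq = [(0, s)]                     # candidate values d'(v), smallest first
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue                  # stale entry; u was already finalized
        for v, w in graph.get(u, []):
            nd = d + w                # path through the explored part plus edge (u, v)
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(pq, (nd, v))
    return dist

g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2)], "b": []}
print(dijkstra(g, "s"))  # -> {'s': 0, 'a': 1, 'b': 3}
```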
+ | |||
+ | |||
+ | We always form shortest new s-v path from a path in S followed by a single edge | ||
+ | • Proof of optimality: Stays ahead of all other | ||
+ | solutions | ||
+ | Each time selects a path to a node v, that path is | ||
+ | shorter than every other possible path to v (stay ahead proof) | ||
+ | |||
+ | proof of correctness: | ||
+ | **inductive proof** | ||
+ | Inductive hypothesis: Assume true for |S| = k, k ≥ 1 | ||
+ | Let v be the next node added to S by Greedy, and let uv be | ||
+ | the chosen edge. | ||
+ | The shortest s->u path plus u->v is an s->v path of length π(v) | ||
+ | Consider any s->v path P. It's no shorter than π(v). | ||
+ | Let x->y be the first edge in P that leaves S,and let P' be the subpath to x. | ||
+ | P is already too long as soon as it leaves S. | ||
+ | |||
+ | |||
+ | Question: Still cannot see why the order matter. If we do a BFS and visit all vertices and update all distance labels, wouldn' | ||
+ | |||
+ | |||
+ | |||
+ | Readable/ | ||
+ | |||
+ | ===== 4.5 The Minimum Spanning Tree Proble ===== | ||
Section Summary

**Motivation of the problem:**
How do we minimize the cost of laying cables?
How do we find a minimum spanning tree of a connected weighted graph?

**Definitions:**
Minimum spanning tree:
  * Spanning: spans all nodes in the graph
  * Tree: connected and contains no cycles
  * Minimum: the total cost of its edges is as small as possible

Claim: an MST contains no cycle.
Proof: By contradiction. If it contained a cycle, we could remove the most expensive edge on the cycle and all nodes would still be connected, so the tree was not minimum.
+ | |||
+ | **Two Properties: | ||
+ | When is it safe to invclude an edge in the MST? | ||
+ | Cut property. Let S be any subset of nodes, and let e | ||
+ | be the min cost edge with exactly one endpoint in S. | ||
+ | Then MST contains e. | ||
+ | proof on P170 | ||
+ | |||
+ | When is it safe to remove an edge from the MST? | ||
+ | Cycle property. Let C be any cycle, and let f be the | ||
+ | max cost edge belonging to C. Then MST does not | ||
+ | contain f. | ||
+ | proof on P170 | ||
+ | |||
+ | cycle-cut intersection proof | ||
+ | |||
+ | **Three algorithms** | ||
+ | |||
+ | Prim: | ||
+ | Start with an arbitrary node s and add it to the tree T. Grow T outward by choosing the cheapest edge e with on endpoint in T and the other out of T. | ||
+ | Proof on P172 | ||
+ | Cut property | ||
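Prim's rule can be sketched with a heap of candidate edges (my own sketch; the undirected adjacency-list format `{u: [(v, cost), ...]}` is an assumption):

```python
import heapq

def prim_mst(graph, s):
    """Prim: grow the tree from s, always adding the cheapest edge with
    exactly one endpoint in the tree (the cut property in action).

    graph: {u: [(v, cost), ...]} undirected adjacency lists.
    Returns the list of MST edges as (u, v, cost).
    """
    in_tree = {s}
    edges = []
    pq = [(c, s, v) for v, c in graph[s]]    # edges leaving the tree
    heapq.heapify(pq)
    while pq:
        c, u, v = heapq.heappop(pq)
        if v in in_tree:
            continue                  # both endpoints inside: would form a cycle
        in_tree.add(v)
        edges.append((u, v, c))
        for w, cw in graph[v]:
            if w not in in_tree:
                heapq.heappush(pq, (cw, v, w))
    return edges

g = {"a": [("b", 1), ("c", 3)],
     "b": [("a", 1), ("c", 2)],
     "c": [("a", 3), ("b", 2)]}
print(prim_mst(g, "a"))  # -> [('a', 'b', 1), ('b', 'c', 2)]
```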
+ | |||
+ | Kruskal: | ||
+ | Sort the edges in G in ascending orders. pick the smallest edge that does not creat a cycle. | ||
+ | |||
+ | Reverse-delete: | ||
+ | It is the rever version of Kruskal | ||
+ | |||
+ | **Implementations of the two algorithms** | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | Questions: | ||
+ | I had some trouble understanding the proof for Cayley' | ||
+ | |||
+ | |||
+ | ===== 4.6 Implementing Kruskal’s Algorithm: The Union-Find Data Structure ===== | ||
+ | |||
+ | Section Summary: | ||
+ | |||
+ | Problem Motivation: | ||
+ | Try to implement the Kruskal' | ||
+ | |||
+ | pointer based implementation | ||
+ | |||
+ | Find Operations | ||
+ | Log(n) | ||
+ | |||
+ | Union Operations | ||
+ | Log(n) | ||
+ | |||
+ | |||
+ | The Kruskal Algorithm in detail | ||
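A pointer-based union-find plus Kruskal's scan can be sketched as follows (my own sketch; class and function names are assumptions, and I add path compression on Find, a common refinement beyond the plain pointer structure):

```python
class UnionFind:
    """Pointer-based union-find with union by size and path compression."""
    def __init__(self, elements):
        self.parent = {x: x for x in elements}
        self.size = {x: 1 for x in elements}

    def find(self, x):
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:        # path compression pass
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, x, y):
        """Merge the sets of x and y; return False if already merged."""
        rx, ry = self.find(x), self.find(y)
        if rx == ry:
            return False
        if self.size[rx] < self.size[ry]:    # attach smaller tree under larger
            rx, ry = ry, rx
        self.parent[ry] = rx
        self.size[rx] += self.size[ry]
        return True

def kruskal(nodes, edges):
    """Kruskal: scan edges in ascending cost, keeping an edge iff its
    endpoints are in different components (i.e. it creates no cycle)."""
    uf = UnionFind(nodes)
    return [(u, v, c)
            for c, u, v in sorted((c, u, v) for u, v, c in edges)
            if uf.union(u, v)]

print(kruskal(["a", "b", "c"],
              [("a", "b", 1), ("b", "c", 2), ("a", "c", 3)]))
# -> [('a', 'b', 1), ('b', 'c', 2)]
```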
+ | |||
+ | |||
+ | Intereting/ | ||
+ | |||
+ | |||
+ | |||
===== 4.7 Clustering =====

Section Summary
Definition:
Divide objects into clusters so that points in different clusters are far apart.

Motivation/Applications:
Identify patterns in gene expression.
K-Clustering of maximum spacing: partition the objects into k non-empty clusters so that the spacing (the minimum distance between any pair of points lying in different clusters) is as large as possible.

It is basically Kruskal's algorithm, stopped just before it would merge down to fewer than k connected components.

Proof on P184.
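The Kruskal connection can be sketched directly (my own sketch; the function name, the `dist` callback, and the use of a small inline union-find are assumptions):

```python
def k_clusters(points, k, dist):
    """Single-linkage k-clustering: run Kruskal on all pairwise
    distances and stop when k components remain (maximizes spacing).

    points: list of hashable items; dist: symmetric distance function.
    """
    parent = {p: p for p in points}

    def find(x):                         # union-find with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    pairs = sorted(((dist(p, q), p, q)
                    for i, p in enumerate(points) for q in points[i + 1:]),
                   key=lambda t: t[0])
    components = len(points)
    for d, p, q in pairs:
        if components == k:
            break                        # stop before merging below k clusters
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rp] = rq
            components -= 1
    clusters = {}
    for p in points:
        clusters.setdefault(find(p), []).append(p)
    return list(clusters.values())

print(k_clusters([0, 1, 10, 11], 2, lambda a, b: abs(a - b)))
# -> [[0, 1], [10, 11]]
```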
+ | |||
+ | Interesting/ | ||
+ | |||
+ | |||
===== 4.8 Huffman Codes and Data Compression =====

Summary

Motivation:
How do we encode data so that transmission is more efficient?
Answer: use fewer bits for the data that carries little differentiation (low-entropy data?).

We use Huffman codes.

If all letters occur with the same frequency, there is nothing to gain from encoding or compressing, but when they do not, which is often the case, we can always represent the more frequently used letters with fewer bits.

Watch out for prefix ambiguity! In a **prefix code**, no codeword is a prefix of another, so decoding is unambiguous.

**Variable-length encoding**

Goal: minimize the Average number of Bits per Letter (ABL):
ABL = sum over letters x of f(x) * (length of x's codeword), i.e. the expected codeword length.
+ | |||
+ | Algorithm | ||
+ | |||
+ | implemenation | ||
+ | * Binary tree for the prefix codes | ||
+ | * Priority queue (with heap) choosing the node with lowest frequency | ||
+ | Cost (nlogn) Why? logn inserting and dequeuing. do it n times. | ||
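The heap-driven merging can be sketched like this (my own sketch; the function name and the representation of subtrees as symbol-to-bitstring dicts are assumptions, and a counter is used only to break frequency ties in the heap):

```python
import heapq
from itertools import count

def huffman_code(freq):
    """Build a prefix code by repeatedly merging the two
    lowest-frequency subtrees (O(n log n) with a heap, as noted above).

    freq: {symbol: frequency}; returns {symbol: bitstring}.
    """
    tiebreak = count()                       # keeps heap entries comparable
    heap = [(f, next(tiebreak), {s: ""}) for s, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)      # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + b for s, b in c1.items()}   # left branch
        merged.update({s: "1" + b for s, b in c2.items()})  # right branch
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

code = huffman_code({"a": 0.5, "b": 0.3, "c": 0.2})
# The most frequent letter gets the shortest codeword.
print(sorted(code, key=lambda s: len(code[s]))[0])  # -> a
```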
Interesting/