**Two Properties:**
When is it safe to include an edge in the MST?
Cut property: Let S be any subset of nodes, and let e be the minimum-cost edge with exactly one endpoint in S. Then the MST contains e.
Proof on P170
When is it safe to remove an edge from the MST?
Cycle property: Let C be any cycle, and let f be the maximum-cost edge belonging to C. Then the MST does not contain f.
Proof on P170
Cycle-cut intersection proof
**Three algorithms**

Prim:
Start with an arbitrary node s and add it to the tree T. Grow T outward by choosing the cheapest edge e with one endpoint in T and the other outside of T.
Proof on P172 (uses the cut property).
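The growing step above can be sketched with a lazy heap-based version (a sketch in Python; the adjacency-list format and function name are my own, not from the text):

```python
import heapq

def prim_mst_cost(n, adj):
    """Grow a tree T from node 0; repeatedly attach the cheapest edge
    with one endpoint in T and the other outside (justified by the
    cut property). adj[u] = list of (cost, v) pairs, nodes 0..n-1."""
    in_tree = [False] * n
    heap = [(0, 0)]          # (edge cost, node); cost 0 to "attach" the start
    total = 0
    while heap:
        cost, u = heapq.heappop(heap)
        if in_tree[u]:
            continue         # stale entry: u was already attached more cheaply
        in_tree[u] = True
        total += cost
        for c, v in adj[u]:
            if not in_tree[v]:
                heapq.heappush(heap, (c, v))
    return total
```

Stale heap entries are skipped instead of decreased in place, which keeps the code short at the price of a slightly larger heap.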
Kruskal:
Sort the edges of G in ascending order; repeatedly pick the smallest edge that does not create a cycle.
Reverse-delete:
It is Kruskal's algorithm in reverse: consider edges in descending cost order and delete each one unless doing so would disconnect the graph.

**Implementations of the two algorithms**

I had some trouble understanding the proof for Cayley's formula.

===== 4.6 Implementing Kruskal’s Algorithm: The Union-Find Data Structure =====

Section Summary:
Problem Motivation:
Try to implement Kruskal's algorithm efficiently: we need a fast way to tell whether an edge's two endpoints already lie in the same component (in which case adding it would create a cycle).

Pointer-based implementation

Find operation:
O(log n): follow parent pointers up to the set's root; union-by-size keeps the trees shallow.

Union operation:
O(1) once the two roots are known: point the smaller set's root at the larger.

The Kruskal algorithm in detail

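As a sketch of how the pieces fit together (array-based union-find with path halving and union by size; the names and input format are mine, not the book's):

```python
def kruskal_mst_cost(n, edges):
    """Kruskal's algorithm: scan edges in ascending cost order and add an
    edge unless its endpoints are already in the same component, i.e.
    unless it would create a cycle. Cycle tests use union-find.
    edges: list of (cost, u, v) with nodes 0..n-1."""
    parent = list(range(n))
    size = [1] * n

    def find(x):
        # Follow parent pointers to the component's root, halving the
        # path along the way to keep future finds fast.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    def union(u, v):
        ru, rv = find(u), find(v)
        if ru == rv:
            return False          # already connected: (u, v) would close a cycle
        if size[ru] < size[rv]:   # union by size: hang smaller tree under larger
            ru, rv = rv, ru
        parent[rv] = ru
        size[ru] += size[rv]
        return True

    total = 0
    for cost, u, v in sorted(edges):
        if union(u, v):
            total += cost
    return total
```

Sorting dominates the cost, so the whole algorithm runs in O(m log m) time.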

Interesting/

===== 4.7 Clustering =====

Section Summary
Definition:
Divide objects into clusters so that points in different clusters are far apart.

Motivation/
Identify patterns in gene expression.

K-Clustering of maximum spacing:
Divide the points into k clusters so that the spacing (the minimum distance between points in different clusters) is as large as possible.

It is basically Kruskal's algorithm, stopped when only k connected components remain.

Proof on P184
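That connection can be sketched directly (a sketch; the function name and input format are my own):

```python
def max_spacing(n, edges, k):
    """Single-linkage k-clustering via Kruskal: repeatedly merge the two
    closest components until k clusters remain; the spacing achieved is
    the cheapest remaining cross-cluster distance.
    edges: list of (dist, u, v) over all pairs, nodes 0..n-1."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    clusters = n
    for dist, u, v in sorted(edges):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue                  # endpoints already in the same cluster
        if clusters == k:
            return dist               # first distance between two of the k clusters
        parent[rv] = ru
        clusters -= 1
    return None                       # fewer than k+1 components reachable
```

Stopping Kruskal's merging early is the entire algorithm; the proof on P184 shows no other k-partition can achieve a larger spacing.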

Interesting/

===== 4.8 Huffman Codes and Data Compression =====

Section Summary

Motivation:
How do we encode data so that transmission is more efficient?
Answer: use fewer bits on data with little differentiation (low-entropy data?).

We use Huffman codes.

If all letters occur at the same frequency, there is nothing to compress; but when they do not, which is often the case, we can represent the more frequently used letters with fewer bits.

Watch out for prefix ambiguity! With variable-length codewords, no codeword may be a prefix of another, or decoding becomes ambiguous.

**variable-length encoding**

Goal: minimize the Average number of Bits per Letter (ABL):
Calculate the ABL as an expected value: ABL = sum over letters x of (frequency of x) * (length of x's codeword).

Algorithm:
Repeatedly take the two lowest-frequency letters (or subtrees), merge them under a new node whose frequency is their sum, and repeat until one tree remains.

Implementation:

  * Binary tree for the prefix codes
  * Priority queue (with a heap) for choosing the node with the lowest frequency

Cost: O(n log n). Why? Each insert and extract-min costs O(log n), and we do them O(n) times.
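A heap-based sketch of that construction (Python; carrying partial codes in dictionaries instead of an explicit binary tree is my own shortcut, not the book's presentation):

```python
import heapq
from itertools import count

def huffman_codes(freq):
    """Build an optimal prefix code: keep subtrees in a min-heap keyed on
    total frequency, and merge the two least frequent until one remains.
    freq: {letter: frequency}. Returns {letter: bit string}."""
    tie = count()   # tie-breaker so equal frequencies never compare dicts
    heap = [(f, next(tie), {ch: ""}) for ch, f in freq.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two lowest-frequency subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in c1.items()}
        merged.update({ch: "1" + code for ch, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

def abl(freq, codes):
    """Average bits per letter: sum of frequency * codeword length."""
    total = sum(freq.values())
    return sum(f * len(codes[ch]) for ch, f in freq.items()) / total
```

On frequencies {a: 0.5, b: 0.25, c: 0.25}, for example, the result gives a a 1-bit code and b, c 2-bit codes, for an ABL of 1.5.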

Interesting/