Differences
This shows you the differences between two versions of the page.
| Both sides previous revisionPrevious revisionNext revision | Previous revision | ||
| courses:cs211:winter2011:journals:chen:chapter_4 [2011/02/27 04:08] – [4.5 The Minimum Spanning Tree Proble] zhongc | courses:cs211:winter2011:journals:chen:chapter_4 [2011/03/02 03:13] (current) – zhongc | ||
|---|---|---|---|
| Line 165: | Line 165: | ||
| I had some trouble understanding the proof for Cayley' | I had some trouble understanding the proof for Cayley' | ||
| + | |||
| + | ===== 4.6 Implementing Kruskal’s Algorithm: The Union-Find Data Structure ===== | ||
| + | |||
| + | Section Summary: | ||
| + | |||
| + | Problem Motivation: | ||
| + | Try to implement the Kruskal' | ||
| + | |||
| + | pointer based implementation | ||
| + | |||
| + | Find Operations | ||
| + | Log(n) | ||
| + | |||
| + | Union Operations | ||
| + | Log(n) | ||
| + | |||
| + | |||
| + | The Kruskal Algorithm in detail | ||
| + | |||
| + | |||
| + | Intereting/ | ||
| + | |||
| + | |||
| + | |||
| + | ===== 4.7 Clustering ===== | ||
| + | |||
| + | Section Summary | ||
| + | definition: | ||
| + | Divide objects into clusters so that points in | ||
| + | different clusters are far apart. | ||
| + | |||
| + | Motivation/ | ||
| + | Identify patterns in gene expression | ||
| + | |||
| + | K-Clustering: | ||
| + | farthest two elements in one cluster are not farther apart than any | ||
| + | elements outside of the cluster. | ||
| + | |||
| + | It is basically Kruskal' | ||
| + | |||
| + | proof on P184 | ||
| + | |||
| + | Interesting/ | ||
| + | |||
| + | |||
| + | ===== 4.8 Huffman Codes and Data Compression ===== | ||
| + | |||
| + | summary | ||
| + | |||
| + | Motivation: | ||
| + | How to encode data so that transmission could be more efficient? | ||
| + | Answer: use less bits on the data without much differentiation!(Low entropy data?) | ||
| + | |||
| + | We use Huffman codes. | ||
| + | |||
| + | If we use all letters in the same frequency, then there is nothing to encode or compress, but when we do not, which is often the case, | ||
| + | we can always represent the more frequently used words with less bits. | ||
| + | |||
| + | Watch out for prefix ambiguity! | ||
| + | |||
| + | **variable-length encoding** | ||
| + | |||
| + | Goal: minimize Average number of Bits per Letter (ABL): | ||
| + | calculate abl using the expected value. | ||
| + | |||
| + | Algorithm | ||
| + | |||
| + | implemenation | ||
| + | |||
| + | * Binary tree for the prefix codes | ||
| + | * Priority queue (with heap) choosing the node with lowest frequency | ||
| + | |||
| + | Cost (nlogn) Why? logn inserting and dequeuing. do it n times. | ||
| + | |||
| + | Interesting/ | ||
