Differences

This shows you the differences between two versions of the page.

--- courses:cs211:winter2011:journals:charles:chapter4 [2011/02/19 07:27] – [4.7 Clustering] gouldc
+++ courses:cs211:winter2011:journals:charles:chapter4 [2011/02/20 02:45] (current) – [4.8 Huffman Codes and Data Compression] gouldc
@@ Line 40: / Line 40: @@
   - Union(A,B): merges two components (A,B) into one component; O(log n).
-EFFICIENCY(4.25): Kruskal's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m logn) time.
+Efficiency (4.25): //Kruskal's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m logn) time.//
 ===== 4.7 Clustering =====
-We want to group a collection of objects. We will be thinking about minimum spanning trees when we do this, specifically Kruskal's Algorithm.
+Situation: Given a collection of objects, we want k clusters (but not any clustering, we want the "clustering of maximum spacing"). Brute force approach to finding the best clustering is terribly inefficient, since there are "exponentially many different k-clusterings of a set U." The answer is related to minimum spacing trees.
-DECIDED GOAL: We want to find a "clustering of maximum spacing." Suppose we want k clusters... "There are exponentially many different k-clustering of a set U; how can we efficiently find the one that has maximum spacing?"
+A variation of Kruskal's Algorithm solves it (4.26) //The components C1, C2, ..., Ck formed by deleting the k - 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.//
-How is this related to minimum spacing trees?  We are doing Kruskal's Algorithm exactly. The only difference is that we stop when there are k clusters.
-(4.26) //The components C1, C2, ..., Ck formed by deleting the k - 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.//
 ===== 4.8 Huffman Codes and Data Compression =====
+Huge research area. Think languages. If there are n letters in a language, it's possible to encode the letters in log(n) bits. But we know that letters come up with different frequencies. Suppose we are given the individual frequencies of each letter; is it possible to encode the more frequent letters with fewer bits so as to reduce the average bits per letter (ABL)?
+Yeah, Huffman coding is one way.
 ===== 4.9 Minimum-Cost Arborescences: A Multi-Phase Greedy Algorithm =====
 Under construction...