Efficiency (4.25): //Kruskal's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m log n) time.//
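To see where the bound in (4.25) comes from, here is a minimal Python sketch of Kruskal's Algorithm (my own illustration, not code from class or the book; the name kruskal_mst and the edge-list format are assumptions). Sorting the m edges gives the O(m log n) term, and the union-find operations afterward are nearly constant time each.

<code python>
# Sketch only: standard Kruskal's Algorithm with a simple union-find.
# Sorting the m edges is the O(m log m) = O(m log n) part; the find/union
# operations after that are close to constant time each.

def kruskal_mst(n, edges):
    """n nodes labeled 0..n-1; edges is a list of (cost, u, v). Returns the MST edges."""
    parent = list(range(n))

    def find(x):                          # representative of x's component
        while parent[x] != x:
            parent[x] = parent[parent[x]] # path compression
            x = parent[x]
        return x

    mst = []
    for cost, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                      # edge joins two different components
            parent[ru] = rv
            mst.append((cost, u, v))
    return mst

# tiny example: 4 nodes, 5 edges
print(kruskal_mst(4, [(1, 0, 1), (4, 0, 2), (3, 1, 2), (2, 2, 3), (5, 1, 3)]))
</code>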
===== 4.7 Clustering =====
Situation: Given a collection of objects, we want k clusters (not just any clustering, but the "clustering of maximum spacing"). A brute-force search for the best clustering is hopelessly inefficient, since there are "exponentially many different k-clusterings of a set U." The answer is related to minimum spanning trees, specifically Kruskal's Algorithm: we run Kruskal's Algorithm exactly, except that we stop once there are k clusters.

A variation of Kruskal's Algorithm solves it. (4.26) //The components C1, C2, ..., Ck formed by deleting the k - 1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing.//
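To make (4.26) concrete, here is a rough Python sketch (my own; the name k_clustering and the input format are just assumptions for illustration). It runs Kruskal's Algorithm and simply stops as soon as only k components remain, which is the same as deleting the k - 1 most expensive MST edges.

<code python>
# Sketch of (4.26): Kruskal's Algorithm, stopped early at k components.

def k_clustering(n, edges, k):
    """n points 0..n-1; edges is a list of (distance, u, v). Returns a list of k clusters."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    for dist, u, v in sorted(edges):          # same order Kruskal's Algorithm uses
        if components == k:                   # stop early: k clusters formed
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv                   # merge the two closest clusters
            components -= 1

    clusters = {}
    for x in range(n):
        clusters.setdefault(find(x), []).append(x)
    return list(clusters.values())

# example: 5 points, ask for 2 clusters
edges = [(1, 0, 1), (2, 1, 2), (9, 2, 3), (1, 3, 4), (8, 0, 4)]
print(k_clustering(5, edges, 2))              # -> [[0, 1, 2], [3, 4]]
</code>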
===== 4.8 Huffman Codes and Data Compression =====
Huge research area. Think languages. If there are n letters in a language, it's possible to encode the letters in log(n) bits. But we know that letters come up with different frequencies. Suppose we are given the individual frequencies of each letter; is it possible to encode the more frequent letters with fewer bits so as to reduce the average bits per letter (ABL)?

Yes, Huffman coding is one way.
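Here is a small Python sketch of Huffman's greedy merging (my own code, not the book's pseudocode; the letter frequencies are made up for illustration and heapq stands in for the priority queue). It repeatedly merges the two lowest-frequency subtrees, reads the codes off the tree, and then computes the ABL, which comes out below the 3 bits a fixed-length code would need for 5 letters.

<code python>
import heapq

def huffman_codes(freqs):
    """freqs: dict letter -> frequency (summing to 1). Returns dict letter -> bitstring."""
    heap = [(f, i, letter) for i, (letter, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)                       # tie-breaker so tuples always compare
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)     # two least frequent subtrees
        f2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (left, right)))
        counter += 1
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix):                   # assign 0/1 labels down the tree
        if isinstance(node, tuple):
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:
            codes[node] = prefix or "0"       # single-letter alphabet edge case
        return codes
    return walk(tree, "")

freqs = {"a": 0.32, "b": 0.25, "c": 0.20, "d": 0.18, "e": 0.05}   # made-up example
codes = huffman_codes(freqs)
abl = sum(freqs[x] * len(codes[x]) for x in freqs)                # average bits per letter
print(codes, abl)                             # ABL is 2.23 here, below 3 bits per letter
</code>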
===== 4.9 Minimum-Cost Arborescences: A Multi-Phase Greedy Algorithm =====
Under construction...