The readability is 7.
===== Section 6: Implementing Kruskal's Algorithm: The Union-Find Data Structure =====
The main goal of this section is to introduce the Union-Find data structure, which lets us quickly determine whether two nodes are in the same connected component; this is exactly what Kruskal's algorithm needs.

There are three operations in the data structure: MakeUnionFind(S), which initializes each element of S as its own singleton set; Find(u), which returns the name of the set containing u; and Union(A, B), which merges the two sets A and B into one.
+ | |||
+ | If we use an array to implement the Union-Find, we use the location of each element of the array to represent each node and fill in the number to represent the set the node belongs in. Therefore, we can easily see that the running time for Find is O(1), and MakeUnionFind(S) O(n). Any sequence of k Union operations takes at most O(klogk). Because at most 2k elements are involved in any Union at all and the process happens at most log2(2k) times. | ||
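As a rough sketch of the array implementation with the relabel-the-smaller-set rule (the class and variable names here are my own, not the book's):

```python
class ArrayUnionFind:
    def __init__(self, n):
        # MakeUnionFind: element i starts in its own set, named i. O(n).
        self.label = list(range(n))          # label[v] = name of v's set
        self.size = [1] * n                  # size of the set named i
        self.members = [[i] for i in range(n)]

    def find(self, u):
        # O(1): just read the label.
        return self.label[u]

    def union(self, a, b):
        # Union takes two set names. Relabel the smaller set with the
        # larger set's name, so each element is relabeled at most
        # O(log k) times over any sequence of k unions.
        if self.size[a] < self.size[b]:
            a, b = b, a
        for v in self.members[b]:
            self.label[v] = a
        self.members[a].extend(self.members[b])
        self.size[a] += self.size[b]
```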
+ | |||
+ | There is actually a better data structure than arrays to implement Union-Find, especially for the Union operation. The key is using a pointer. MakeUnionFind(S) does similar things as above and has each set pointing to the element inside. When union two sets, we simply have the smaller set's pointer points at the larger set's pointer. Obviously, MakeUinonFind takes O(n) time and Union constant time. The Find(v) ultimately does is try to find what element is the set containing v points to. Since the set containing has at least 1 elements, at most n, and its size doubles most logn time, the running time is O(logn). | ||
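A minimal sketch of the pointer-based version (names are mine; for convenience this `union` accepts elements and calls `find` itself, whereas the text's Union takes set names directly and is O(1)):

```python
class PointerUnionFind:
    def __init__(self, n):
        # Each element points to itself; a set is named by its representative.
        self.parent = list(range(n))
        self.size = [1] * n

    def find(self, v):
        # Follow pointers to the representative. The chain has length
        # O(log n): a set's size at least doubles each time the name
        # of an element's set changes.
        while self.parent[v] != v:
            v = self.parent[v]
        return v

    def union(self, a, b):
        # Point the smaller set's representative at the larger one's.
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
```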
+ | |||
+ | We can even further improve the algorithm to make Find(v) constant. However, setting up the data structure takes over more time, because we need to go though the pointing path for each element in the sets as we did in the Find above. | ||
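The path-compression idea can be sketched as a standalone helper (my own function, operating on a parent array like the one above):

```python
def find_compress(parent, v):
    # Path compression: first locate the root, then repoint every node
    # on the path directly at it, so later Finds on these nodes are
    # nearly constant time (amortized).
    root = v
    while parent[root] != root:
        root = parent[root]
    while parent[v] != root:
        parent[v], v = root, parent[v]   # repoint v, then step upward
    return root
```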
+ | |||
+ | After looking into the data structure, it is not hard to implement Kruskal' | ||
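Putting the pieces together, a sketch of Kruskal's algorithm on top of union-find (edge format and names are my own assumptions):

```python
def kruskal(n, edges):
    # edges: list of (weight, u, v) tuples; nodes are 0..n-1.
    parent = list(range(n))
    size = [1] * n

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    mst = []
    for w, u, v in sorted(edges):           # O(m log m) = O(m log n)
        ru, rv = find(u), find(v)
        if ru != rv:                        # endpoints in different components
            if size[ru] < size[rv]:
                ru, rv = rv, ru
            parent[rv] = ru                 # union by size
            size[ru] += size[rv]
            mst.append((u, v, w))
    return mst
```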
+ | |||
+ | This section talks in details of the construction of Union-Find, interesting. The readability is 9. | ||
+ | |||
+ | ===== Section 7: Clustering ===== | ||
This section discusses another problem, Clustering, and how to solve it with a greedy algorithm. Clustering of maximum spacing can be stated precisely as partitioning a collection of objects U into k clusters so that the spacing is as large as possible (that is, the minimum distance between any pair of points lying in different clusters is maximized).
+ | |||
+ | The algorithm is exactly the Kruskal' | ||
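A sketch of this stop-early Kruskal's run (function name and edge format are my own assumptions):

```python
def k_clusters(n, edges, k):
    # edges: list of (distance, u, v); nodes are 0..n-1.
    # Run Kruskal's algorithm but stop once only k components remain.
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            v = parent[v]
        return v

    components = n
    for w, u, v in sorted(edges):
        if components == k:
            break
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru                 # merge the two closest clusters
            components -= 1
    # Group elements by their representative.
    clusters = {}
    for v in range(n):
        clusters.setdefault(find(v), []).append(v)
    return list(clusters.values())
```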
+ | |||
+ | Practical applications can be finding out the clusters and using low-cost network within the cluster and using more robust network between each cluster. | ||
+ | |||
+ | This section is quite short but the application of the minimum spanning tree is interesting. | ||
+ | |||
+ | ===== Section 8: Huffman Code and Data Compression ===== | ||
+ | |||
This section introduces Huffman codes and how a greedy algorithm constructs them.
+ | |||
+ | First, the book gradually introduces the origin of Huffman coding from using 1,0 bits representing letters, the different frequency of usage of letters in English, the Morse code which is similar but easier with the pausing, Prefix codes and the way to calculate the optimal prefix codes (minimize the sum of frequency*bit-length). | ||
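As a tiny numeric check of that objective (the five-letter frequencies and code lengths below are illustrative choices of mine, with the lengths taken from a valid prefix code's leaf depths):

```python
# Expected bits per letter: sum over letters of frequency * codeword length.
freqs   = {'a': 0.32, 'b': 0.25, 'c': 0.20, 'd': 0.18, 'e': 0.05}
lengths = {'a': 2, 'b': 2, 'c': 2, 'd': 3, 'e': 3}   # leaf depths in a full binary tree
abl = sum(freqs[x] * lengths[x] for x in freqs)
print(round(abl, 2))  # 2.23
```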
+ | |||
+ | To begin design the algorithm, intuitively the Binary Tree comes to mind because if we only use the leaves to represent letters then it is gonna be prefix code since no parent is involved. It is not hard to see that the optimal prefix code corresponds with some full binary tree. Now we attempt to construct the algorithm to build such a full binary tree. The Top-Down Approach as in the S-F code is the first try, unfortunately the example used earlier is already a counter example. To perfectly modify this approach needs to borrow knowledge from dynamic programming, | ||
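The bottom-up merging rule can be sketched with a priority queue; this version (my own, not the book's pseudocode) tracks only each letter's depth, i.e. its codeword length:

```python
import heapq
from itertools import count

def huffman_lengths(freqs):
    # Huffman's greedy rule: repeatedly merge the two lowest-frequency
    # items; each merge pushes the merged letters one level deeper in
    # the tree. Returns the codeword length (leaf depth) per letter.
    tiebreak = count()   # breaks frequency ties without comparing dicts
    heap = [(f, next(tiebreak), {letter: 0}) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, d1 = heapq.heappop(heap)
        f2, _, d2 = heapq.heappop(heap)
        merged = {x: depth + 1 for x, depth in {**d1, **d2}.items()}
        heapq.heappush(heap, (f1 + f2, next(tiebreak), merged))
    return heap[0][2]

lengths = huffman_lengths({'a': 0.32, 'b': 0.25, 'c': 0.20, 'd': 0.18, 'e': 0.05})
```

On these illustrative frequencies the two rarest letters, d and e, end up deepest in the tree, exactly as the greedy rule intends.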
+ | |||
+ | Compression is not always as easy as the Huffman coding. Tow major challenges are the " | ||
+ | |||
+ | I like the structure of this section because it tends to " | ||
+ | |||
+ | Readability is 8. | ||