This algorithm was enlightening, as it reinforced the lesson in class and built upon it.  8/10 for readability and clarity, as well as interesting content.


===== 4.5: The Minimum Spanning Tree Problem =====
In the Minimum Spanning Tree Problem, we are looking to build a subgraph that connects all the nodes as cheaply as possible.  It also cannot contain any cycles, or else the solution would be using unnecessary edges: any edge on a cycle could be deleted and every node would still be reachable, at lower cost.

Curiously enough, a minimum spanning tree can be found a number of ways; three, to be exact.  The first begins with an empty set of edges and inserts edges in order of increasing cost, building the tree one edge at a time.  If adding a certain edge would result in a cycle, then that edge is simply discarded.  This is //Kruskal's Algorithm//.

The next is called //Prim's Algorithm//.  It is a spin-off of Dijkstra's in that you start with a root node and "greedily" grow the tree out from this starting node, by successively adding the node that can be attached at the cheapest cost.

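A minimal sketch of this greedy growth in Python, using the standard-library heapq as the priority queue (the function name and graph representation are my own assumptions; this "lazy" variant pushes candidate edges and skips stale ones, rather than using a ChangeKey operation):

```python
import heapq

def prim_mst(n, adj):
    """Grow a minimum spanning tree out from node 0, always attaching
    the cheapest-to-reach new node (lazy heap version; hypothetical names).
    n   -- number of nodes, labeled 0..n-1
    adj -- adjacency list: adj[u] is a list of (cost, v) pairs
    Returns the MST edges as (cost, u, v) tuples; assumes a connected graph."""
    visited = [False] * n
    mst = []
    heap = [(0, 0, -1)]  # (attachment cost, node, parent); the root has no parent
    while heap:
        cost, u, parent = heapq.heappop(heap)
        if visited[u]:
            continue  # stale entry: u was already attached more cheaply
        visited[u] = True
        if parent != -1:
            mst.append((cost, parent, u))
        for c, v in adj[u]:
            if not visited[v]:
                heapq.heappush(heap, (c, v, u))
    return mst
```

Since the heap holds at most O(m) entries and each push or pop costs O(log n), this matches the O(mlogn) bound discussed below.
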
The last method is where we "run a sort of backward version of Kruskal's Algorithm."  Starting with the full graph, we delete edges in order of decreasing cost, unless doing so would disconnect the graph.  This is usually called the //Reverse-Delete Algorithm//.

Prim's and Kruskal's can be run in O(mlogn) time, but Reverse-Delete is not as easy to implement within this bound.  For Prim's, this is because, like Dijkstra's, it uses a heap-based priority queue holding at most //m// entries, and ExtractMin and ChangeKey each run in O(logn) time, which is how we get the desired runtime.  Kruskal's and Reverse-Delete's runtimes will be discussed in the following sections.

Overall, this section was pretty drawn out, with large proofs and simple ideas that went on and on, just to be put in more understandable terms in the following paragraph.  I give it a 3/10.


===== 4.6: Implementing Kruskal's Algorithm: The Union-Find Data Structure =====

In order to implement this algorithm, we will use a structure called the Union-Find structure.  As we consider each edge, it lets us check which set each endpoint belongs to.  If the two endpoints are in the same set, a path between them already exists, so the edge is not added; if they are in different sets, the edge is added and the two sets are merged.  This works great as we add edges, because it amounts to a check with an if statement, but the same cannot be said for deletion, which is not supported.  With a simple array recording the set name of each element, Find takes O(1) time, and a sequence of k Union operations (always relabeling the smaller set) takes O(klogk) time in total.

We can actually have a better data structure, though, that uses pointers.  This is useful because a Union just updates one pointer: a set's root goes from pointing at itself to pointing at the set it is merged with.  The trade-off is that the Find operation now takes O(logn) time.

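A sketch of the pointer-based structure in Python, together with Kruskal's Algorithm using it (class and function names are my own; the "union by size" rule is what keeps Find at O(logn)):

```python
class UnionFind:
    """Pointer-based Union-Find with union by size.
    Pointing the smaller tree at the larger keeps every tree
    shallow, so find() costs O(log n)."""

    def __init__(self, n):
        self.parent = list(range(n))  # each element starts as its own root
        self.size = [1] * n

    def find(self, x):
        while self.parent[x] != x:  # follow pointers up to the root
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False  # same set: this edge would close a cycle
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra  # point the smaller tree at the larger
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
        return True

def kruskal(n, edges):
    """Kruskal's Algorithm: take edges cheapest-first, keeping an
    edge only when its endpoints lie in different sets.
    edges is a list of (cost, u, v); returns the MST edges."""
    uf = UnionFind(n)
    return [(c, u, v) for c, u, v in sorted(edges) if uf.union(u, v)]
```

Sorting the edges costs O(mlogm) = O(mlogn), and the Find/Union work fits inside that bound, matching the runtime claimed below.
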
Whichever one we use, Kruskal's Algorithm runs in O(mlogn) time overall; the difference between the two is just a matter of slight improvements that aren't //too// noticeable in the long run.

This section was shorter but a duller read, as it was basically review.  5/10.


===== 4.7: Clustering =====

This problem has to do with examples such as a collection of photos, letters, etc. that we would try to classify into groups, or clusters.  To start, we grow a graph by repeatedly connecting the pair of points with the smallest distance.  The connection here is that we are just using Kruskal's Algorithm, stopping once we obtain the specified number of connected components.  So, we basically do everything in Kruskal's, but leave out the "k - 1 most expensive edges" of the spanning tree.  This gives us clusters in which the most closely related points end up grouped together.

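The idea above can be sketched as Kruskal's Algorithm with an early stop (a hypothetical function name and a simple inline Union-Find; assumptions of mine, not the book's code):

```python
def k_clusters(num_points, edges, k):
    """Single-linkage clustering: run Kruskal's Algorithm on the
    distance edges, but stop once only k connected components remain
    (equivalently, leave out the k-1 most expensive tree edges).
    edges is a list of (distance, u, v); returns clusters as lists
    of point indices."""
    parent = list(range(num_points))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # shortcut pointers as we go
            x = parent[x]
        return x

    components = num_points
    for dist, u, v in sorted(edges):  # closest pairs first
        if components == k:
            break  # the remaining, most expensive edges are never added
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[rv] = ru
            components -= 1

    clusters = {}
    for x in range(num_points):
        clusters.setdefault(find(x), []).append(x)
    return list(clusters.values())
```

Because the loop stops before the k-1 priciest merges, the spacing between the resulting clusters is as large as possible.
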
This section was actually pretty interesting and extremely short, so 8/10.


===== 4.8: Huffman Codes and Data Compression =====

The whole deal with Huffman codes was that they were a way to encode symbols using as few bits as possible.  So, the book discussed how to go about constructing such codes and then, in turn, decoding them.

The goal is to produce a "labeled binary tree in which the leaves are as close to the root as possible," because this minimizes the average leaf depth, and with it the average number of bits per letter.  What we end up doing is constructing a prefix code for a given alphabet.  We pick out the two letters that occur least frequently and merge them: we make a new alphabet excluding these letters, but replacing them with a new letter whose frequency is the sum of the two least-occurring letters' frequencies.  Then we recursively construct a prefix code on this new alphabet.  The way we get a prefix code for the original alphabet is by taking the tree built for our newest alphabet, finding the letter we replaced the least-occurring letters with, and giving it two children: the original least-occurring letters.  This final tree gives us our desired prefix code, which is defined as a //Huffman code//.

This algorithm runs in O(k<sup>2</sup>) time, because it iterates k-1 times, and a naive implementation does a full scan over the (up to size-k) alphabet on each iteration to find the two least frequent letters.  If we use a priority queue, however, we can get it to run in O(klogk) time, as the priority queue supports insertions and extractions in O(logk) time.

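The priority-queue version can be sketched with Python's heapq (the function name, tiebreaker trick, and tuple-based tree representation are assumptions of mine, not the book's):

```python
import heapq
from itertools import count

def huffman_code(freqs):
    """Build a Huffman code with a priority queue: repeatedly merge
    the two least frequent letters into a new pseudo-letter whose
    frequency is the sum of theirs, then read the bit strings off
    the finished tree. freqs maps letter -> frequency."""
    order = count()  # tiebreaker so equal-frequency entries never compare letters
    heap = [(f, next(order), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)   # least frequent
        f2, _, right = heapq.heappop(heap)  # second least frequent
        heapq.heappush(heap, (f1 + f2, next(order), (left, right)))
    _, _, tree = heap[0]

    codes = {}
    def walk(node, prefix):
        if isinstance(node, tuple):  # internal node: 0 goes left, 1 goes right
            walk(node[0], prefix + "0")
            walk(node[1], prefix + "1")
        else:                        # leaf: an original letter
            codes[node] = prefix or "0"  # guard for a one-letter alphabet
    walk(tree, "")
    return codes
```

Each of the k-1 merges does a constant number of O(logk) heap operations, which is where the O(klogk) bound comes from.
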
This was a very interesting section, talking about encoding and the uses for it.  The reason why the algorithm works and how to use it was interesting too, so I give it a 9/10.
  