courses:cs211:winter2014:journals:shannon:chapter4 (last revised 2014/03/03 21:10 by nollets)
=====Section 4.4: Shortest Paths in a Graph=====
The problem of finding the shortest path in a graph can be solved by using a greedy algorithm. We use this algorithm on directed graphs where each of the edges has a specific weight. The algorithm was designed by Edsger Dijkstra, and at each point we choose the node with the shortest path from the source. The algorithm, on page 138 of the text, seems a little confusing at points, but does manage to produce an optimal solution every time. We just have to be careful not to grow the set S by the wrong node, which would give us an incorrect shortest path. Thus at any point in our algorithm, the path that we have chosen to each explored node is a shortest path.

This algorithm does not work for negative edge lengths and is essentially an extension of breadth-first search.

Now looking at the run time of the algorithm: a straightforward implementation that scans the edges to find the next node in each iteration runs in O(mn) time. If we instead keep the nodes in a priority queue and use its ChangeKey and ExtractMin operations, we are able to run this algorithm in O(m log n) time.

The motivation behind this problem is to determine how quickly we can get from one point to another, such as from one website to another.

We went over this algorithm pretty extensively in class and that was much more helpful than the book. Trying to read about the algorithm was extremely confusing.

I would give this section a 6/10 as it was confusing to read and class was much more helpful.
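The algorithm above can be sketched in Python. This is a minimal sketch, not the book's pseudocode: it assumes the graph is given as a dictionary mapping each node to a list of (neighbor, weight) pairs, and it approximates ChangeKey by leaving stale entries in the heap and skipping them when they are popped.

```python
import heapq

def dijkstra(graph, source):
    """Dijkstra's shortest-path algorithm with a binary heap.

    graph: dict mapping node -> list of (neighbor, weight) pairs,
    with non-negative weights (the algorithm fails on negative edges).
    Returns a dict of shortest distances from source.
    """
    dist = {source: 0}
    heap = [(0, source)]           # (distance, node) pairs
    explored = set()               # the set S from the text
    while heap:
        d, u = heapq.heappop(heap)
        if u in explored:
            continue               # stale entry; a shorter path was found already
        explored.add(u)
        for v, w in graph[u]:
            if v not in dist or d + w < dist[v]:
                dist[v] = d + w
                heapq.heappush(heap, (dist[v], v))
    return dist
```

With n nodes and m edges, each edge causes at most one push, so the heap operations give the O(m log n) bound mentioned above.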
=====Section 4.5: The Minimum Spanning Tree Problem=====
The minimum spanning tree problem is looking for a connection between nodes that is built as cheaply as possible. A graph G can have exponentially many spanning trees, and a subset T of the edges is a spanning tree if (V,T) is a tree as well. There are three greedy algorithms that are able to solve this spanning tree problem: Kruskal's algorithm, Prim's algorithm, and the Reverse-Delete algorithm.

The section discusses the Cut Property, which is discussed on page 145 of the text. We then use this Cut Property to determine the optimality of the different algorithms. Prim's and Kruskal's algorithms only add edges justified by the Cut Property, so each of them produces a minimum spanning tree.

There is another property called the Cycle Property, discussed on page 147 of the text. We can use the Cycle Property to prove that the Reverse-Delete algorithm is optimal, as it only deletes edges that the Cycle Property shows cannot belong to any minimum spanning tree.

Prim's and Kruskal's algorithms can both be implemented in O(m log n) time: Prim's with a priority queue, much like Dijkstra's algorithm, and Kruskal's by sorting the edges and keeping track of the connected components.

I missed the class where we discussed this section, so it was a little confusing to read. This summary is a tad short, but that is because I am still processing what I have read. I think I understand what is happening, but I will probably go through and add more to this after reading the section again.

I give this section a 7/10 as I am still working on understanding it.
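As a sketch of how the priority-queue version of Prim's algorithm might look (the adjacency-list representation and function name are my own choices, not from the text):

```python
import heapq

def prim(graph, root):
    """Prim's algorithm: grow a minimum spanning tree outward from root.

    graph: dict node -> list of (neighbor, weight); assumed connected and
    undirected (each edge listed in both directions).
    Returns the set of tree edges as (u, v) pairs.
    """
    in_tree = {root}
    tree_edges = set()
    heap = [(w, root, v) for v, w in graph[root]]   # candidate edges out of the tree
    heapq.heapify(heap)
    while heap:
        w, u, v = heapq.heappop(heap)
        if v in in_tree:
            continue                 # this edge would create a cycle
        in_tree.add(v)
        tree_edges.add((u, v))       # cheapest edge across the cut, per the Cut Property
        for x, wx in graph[v]:
            if x not in in_tree:
                heapq.heappush(heap, (wx, v, x))
    return tree_edges
```

Each edge enters the heap at most twice, giving the O(m log n) bound.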
=====Section 4.6: Implementing Kruskal's Algorithm: The Union-Find Data Structure=====

We did not have a class on this section so it is taking me a little longer than I was expecting to go through the section and understand it.

In this section we look at graphs that change as edges are added over time, and at maintaining the connected components with the Union-Find data structure, which is what makes Kruskal's algorithm efficient to implement. Union-Find supports a Find operation, which names the component containing a node, and a Union operation, which merges two components when an edge is added.

I think I may need to come in to speak with you about this section as I am pretty confused trying to read it from the chapter. I am having difficulty discerning what the pertinent information is. I am going to revisit this later in the week and figure out what exactly is going on!
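A minimal sketch of Union-Find and how Kruskal's algorithm uses it. The names are illustrative; union by size and path compression are standard optimizations that make each operation nearly constant time.

```python
class UnionFind:
    """Union-Find with union by size and path compression."""

    def __init__(self, elements):
        self.parent = {e: e for e in elements}
        self.size = {e: 1 for e in elements}

    def find(self, x):
        """Return the representative of x's component."""
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:        # path compression
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b):
        """Merge the components of a and b; return False if already merged."""
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return False
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra
        self.parent[rb] = ra                 # attach smaller tree under larger
        self.size[ra] += self.size[rb]
        return True

def kruskal(nodes, edges):
    """Kruskal's algorithm: add edges cheapest first, skipping any edge
    whose endpoints are already in the same component.

    edges: list of (weight, u, v) tuples. Returns the tree edges in order added.
    """
    uf = UnionFind(nodes)
    return [(u, v) for w, u, v in sorted(edges) if uf.union(u, v)]
```

Sorting the edges dominates the running time, giving O(m log n) overall.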

=====Section 4.7: Clustering=====

Clustering is the solution to the problem where we want to organize objects into coherent groups, where the closer two objects are, the more they have in common. We want to find a clustering with maximum spacing: given a value k, we want a solution with k clusters in which the clusters are as far apart from each other as possible.

In order to find this maximum spacing we use a modified version of Kruskal's algorithm: we add edges cheapest first as usual, but stop just before the number of components drops below k. Equivalently, we delete the k-1 most expensive edges of the minimum spanning tree; the resulting k connected components are the clusters of maximum spacing.

This section was pretty straightforward and we went over it pretty well in class. It helps that we were using an algorithm that we had already proved and discussed. I would give this section an 8/10. It was helpful and easy to follow.
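A sketch of the modified Kruskal's algorithm for clustering, assuming the caller supplies the objects and a distance function (all names here are illustrative, not from the text):

```python
def k_clusters(points, dist, k):
    """Single-linkage clustering via Kruskal's algorithm: repeatedly merge
    the two closest clusters, stopping once only k clusters remain.

    points: list of hashable items; dist: function giving the distance
    between two items. Returns a list of clusters (as sets).
    """
    parent = {p: p for p in points}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path halving
            x = parent[x]
        return x

    # All pairwise "edges", sorted by distance, cheapest first.
    edges = sorted((dist(p, q), p, q)
                   for i, p in enumerate(points) for q in points[i + 1:])
    clusters = len(points)
    for d, p, q in edges:
        if clusters == k:
            break                           # stop before merging below k clusters
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rp] = rq                 # merge the two clusters
            clusters -= 1
    groups = {}
    for p in points:
        groups.setdefault(find(p), set()).add(p)
    return list(groups.values())
```

The first edge not added would be the spacing of the clustering, which is what the section proves is maximized.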

=====Section 4.8: Huffman Codes and Data Compression=====

When communicating with computers, we must encode English characters as binary digits, and we want to minimize the number of bits needed to encode each letter. The problem with encoding is that, as in Morse code, the codes could be ambiguous: the code for one letter could be a prefix of the code for another. A prefix code is a set of encodings in which no codeword is a prefix of any other. Once we have created such a code we are able to translate a string of bits without any ambiguity.

This section also looks at optimal prefix codes, which have the smallest average number of bits required per letter.

A prefix code corresponds to a binary tree, and we can move between the two representations in either direction. Given a tree, each time the path to a leaf branches left we write down a zero, and each time it branches right we write down a one; conversely, from an encoding we can build the corresponding binary tree. We say that a binary tree is full if every internal node has two children; an optimal prefix code always corresponds to a full tree.

When we finally look at creating the algorithm, we take a bottom-up approach rather than the top-down approach, which would have to fall back on brute force. The algorithm, discussed on page 172 of the text, starts by placing all of the letters, with their frequencies, in some order. Then we merge the two letters with the lowest frequencies under a meta-letter whose frequency is the sum of the two letters' frequencies. We repeat this process until all of the nodes are matched up and the frequency of the final meta-letter is one. This algorithm is called Huffman's Algorithm.

Prefix codes have some drawbacks, including an inability to adapt as more text is added and the fact that you cannot use partial bits. However, we choose prefix codes as a trade-off between the power of compression and the computational cost.
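Huffman's algorithm can be sketched with a priority queue. As a simplification (the book builds an explicit tree), this sketch represents each meta-letter simply as the string of single-character letters it contains, and prepends a bit to every contained letter's code at each merge:

```python
import heapq
from itertools import count

def huffman(freqs):
    """Huffman's algorithm: repeatedly merge the two lowest-frequency
    letters into a meta-letter until one tree remains.

    freqs: dict mapping single-character letter -> frequency.
    Returns a dict mapping letter -> bit string.
    """
    if len(freqs) == 1:                    # a lone letter still needs one bit
        return {next(iter(freqs)): '0'}
    tie = count()                          # tiebreaker so heap tuples always compare
    heap = [(f, next(tie), letter) for letter, f in freqs.items()]
    heapq.heapify(heap)
    codes = {letter: '' for letter in freqs}
    while len(heap) > 1:
        f1, _, left = heapq.heappop(heap)  # two lowest-frequency (meta-)letters
        f2, _, right = heapq.heappop(heap)
        for letter in left:                # going left prepends a 0 ...
            codes[letter] = '0' + codes[letter]
        for letter in right:               # ... going right prepends a 1
            codes[letter] = '1' + codes[letter]
        heapq.heappush(heap, (f1 + f2, next(tie), left + right))
    return codes
```

Higher-frequency letters end up nearer the root and so get shorter codes, which is what makes the average bits per letter optimal.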

This was a little confusing to read, but the explanations in class were very helpful.
I would give this section an 8/10 as I never thought of this problem, and I think it is pretty cool that they have created a solution to it!