Differences

This shows you the differences between two versions of the page.

Link to this comparison view

Both sides previous revisionPrevious revision
Next revision
Previous revision
courses:cs211:winter2018:journals:patelk:chapter4 [2018/03/10 16:56] – [4.8 Huffman Codes and Data COmpression] patelkcourses:cs211:winter2018:journals:patelk:chapter4 [2018/03/10 20:29] (current) – [4.8 Huffman Codes and Data COmpression] patelk
Line 345: Line 345:
 Readability: 7 Interesting: 7 Readability: 7 Interesting: 7
  
-===== 4.8 Huffman Codes and Data COmpression =====+===== 4.8 Huffman Codes and Data Compression =====
  
   * Greedy rule to shrink the size of the problem instance so that an equivalent smaller problem can then be solved by recursion   * Greedy rule to shrink the size of the problem instance so that an equivalent smaller problem can then be solved by recursion
Line 405: Line 405:
       * can lock them together because we know they end up as sibling leaves below a common parent       * can lock them together because we know they end up as sibling leaves below a common parent
       * Replace y* and z* with meta-letter, obtaining an alphabet that is one letter smaller. Recursively find a prefix code for the smaller alphabet, and then "open up" the meta-letter back into y* and z* to obtain a prefix code for S:       * Replace y* and z* with meta-letter, obtaining an alphabet that is one letter smaller. Recursively find a prefix code for the smaller alphabet, and then "open up" the meta-letter back into y* and z* to obtain a prefix code for S:
 +//Huffman's Algorithm//
  
 {{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings.png?nolink&400|}} {{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings.png?nolink&400|}}
 +
 {{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings2.png?nolink&400|}} {{:courses:cs211:winter2018:journals:patelk:optimalprefixcodesiblings2.png?nolink&400|}}
  
 +  * bottom-up approach: focuses on the leaves representing the two lowest-frequency letters, and then continues by recursion.
 +
 +**Analyzing the Algorithm**
 +  * //The Optimality of the Algorithm//
 +    * Recursive algorithm: proof by induction
 +    * Base: optimal for all two letter alphabets (one bit per letter)
 +    * By induction: optimal for all alphabets of size k-1
 +    * Consider: alphabet S of size k
 +    * The Huffman code for a given alphabet achieves the minimum average number of bits per letter of any prefix code
 +  * //Implementation and Running Time//
 +    * Can be polynomial time in k, the number of letters in the alphabet
 +      * recursive calls: sequence of k-1 iterations
 +      * each iteration consists simply of identifying the two lowest-frequency letters -> O(k)
 +      * combined: O(k^2)
 +    * Use a priority queue to maintain a set of k elements
 +      * allows for insertion and extraction 
 +      * Each iteration: extract the minimum twice; insert a new letter whose key is the sum of these two minimum frequencies
 +      * Using priority queues via heaps: each insertion and extraction: O(logk) -> over k iterations, O(klogk)
 +
 +**Extensions**
 +  * Data Compression: black and white images already use one bit per letter
 +    * need to be highly compressed: use a "fraction of a bit" for each white bixel
 +  * Cannot adapt to changes in text: frequency is stable for half the text, but then changes radically
 +    * Halfway into the sequence insert some kind of instruction to completely change the encoding
 +    * This will pay off in the long run: called adaptive compression schemes
 +
 +==== Personal Thoughts ==== 
 +
 +While arguably one of the more interesting topics of this textbook, this section was long and very dense. There was a lot of information that was hard to digest all at once. Our discussion in class was helpful, but not long enough to help make this section very clear. I will most likely need to review this chapter section again.
 +Readability: 5 Interesting: 8
  
courses/cs211/winter2018/journals/patelk/chapter4.1520700982.txt.gz · Last modified: by patelk
CC Attribution-Noncommercial-Share Alike 4.0 International
Driven by DokuWiki Recent changes RSS feed Valid CSS Valid XHTML 1.0