Chapter 4
Greedy Algorithm: builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.
4.1 Interval Scheduling: The Greedy Algorithm Stays Ahead
Designing a Greedy Algorithm
Potential Designs:
Always Select the Available Request that Starts Earliest: a very long request that starts earliest can block many other requests, so this is not optimal
Accept the Request that Requires the Smallest Interval of Time: accepting a short interval in the middle may prevent the acceptance of two other requests
For each request, count the number of requests that are not compatible: accept the request that has the fewest incompatible requests; more complicated, but could work
Accept First the Request That Finishes First: maximizes the time left to satisfy other requests
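The last rule is the one the book adopts. A minimal sketch, assuming requests are (start, finish) pairs; the function name is illustrative:

```python
# Earliest-finish-time greedy rule: sort by finish time, then accept
# every request compatible with what has been accepted so far.
def interval_schedule(requests):
    accepted = []
    last_finish = float("-inf")
    for start, finish in sorted(requests, key=lambda r: r[1]):
        if start >= last_finish:   # compatible with all accepted requests
            accepted.append((start, finish))
            last_finish = finish
    return accepted
```

Sorting once up front lets a single left-to-right pass implement "accept the request that finishes first" among the remaining compatible ones.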
Analyzing the Algorithm
R: set of requests that we have neither accepted nor rejected yet
A: the set of accepted requests so far
O: an optimal set of intervals
A is a compatible set of requests
Proofs:
We want to show that the greedy algorithm is doing better in a step-by-step fashion
i1,…,ik - set of requests in A in the order they were added to A
j1,…,jm - set of requests in O ordered in the natural left-to-right order of the corresponding intervals (start to finish points)
For all indices r ≤ k, we have f(ir) ≤ f(jr)
r = 1: the greedy algorithm selects the request i1 with minimum finish time, so f(i1) ≤ f(j1)
for r > 1: assume the induction hypothesis is true for r-1, prove for r; by the induction hypothesis, we can assume that f(ir-1) ≤ f(jr-1). The greedy algorithm always has the option of choosing jr rather than a later-finishing interval, so f(ir) ≤ f(jr).
For each r, the rth interval it selects finishes at least as soon as the rth interval in O
The greedy algorithm returns an optimal set A
Proof by contradiction
If A is not optimal, O must have more requests (m > k). Then there is a request jk+1 in O. This request starts after jk ends, hence after ik ends (since f(ik) ≤ f(jk)). So jk+1 is still compatible, and R still contains it, but the greedy algorithm stops only when R is empty: contradiction
Implementation and Running Time:
Extensions
Requestors may leave if the scheduler waits too long
Online Algorithms: make decisions as time proceeds, without knowledge of future input
Each request carries a certain value/weight
A Related Problem: Scheduling All Intervals
Many identical resources available: goal is to schedule all the requests using as few resources as possible
Depth of a set of intervals: the maximum number of intervals that pass over any single point on the time-line; the number of resources needed is always at least the depth
Designing the Algorithm
d = depth of the set of intervals
Analyzing the Algorithm
If you have d labels at your disposal, then as you sweep through the intervals from left to right, assigning an available label to each interval you encounter, you can never reach a point where all the labels are currently in use
Personal Thoughts
We reviewed many of these concepts in class before Feb Break, so the section made a lot of sense. However, because there are a lot of symbols and equations, it was harder to understand than when it was taught in class. Overall, the concepts were pretty clear though.
Readability: 6.5
Interesting: 7
4.2 Scheduling to Minimize Lateness: An Exchange Argument
The Problem
Single resource, set of n requests to use the resource for an interval
Resource is available at start time s
Request i has a deadline di and requires a contiguous time interval of length ti
Different requests must be assigned nonoverlapping intervals
Goal: satisfy each request, but we are allowed to let certain requests run late – want the smallest maximum lateness
Request i is late if it misses its deadline: f(i) > di
Lateness of request i: li = f(i) - di if i is late, and li = 0 otherwise
Designing the Algorithm
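The rule the book develops here is Earliest Deadline First: order the jobs by deadline and schedule them contiguously with no idle time. A minimal sketch, assuming jobs are (length, deadline) pairs and the function name is illustrative:

```python
# Earliest Deadline First: sort jobs by deadline, run them back to back
# starting at the resource's start time, and track the maximum lateness.
def min_max_lateness(jobs, start=0):
    time = start
    max_late = 0
    schedule = []
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        finish = time + length
        schedule.append((time, finish))            # assigned interval for this job
        max_late = max(max_late, finish - deadline)  # lateness is 0 if on time
        time = finish
    return schedule, max_late
```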
Analyzing the Algorithm
Extensions
Personal Thoughts
Similar to the section above, we reviewed many of these concepts in class before Feb Break, so the section made a lot of sense. For some reason, I did find this Greedy algorithm easier to follow than the previous one. We have reviewed the example of the classroom scheduling multiple times in class, so that is very clear to me.
Readability: 8.5
Interesting: 8
4.4 Shortest Paths in a Graph
The Problem
Graphs are used to model networks in which one travels from one point to another
A problem: determine the shortest path between nodes in a graph
s has a path to every node in G; each edge e has a positive length ℓe indicating the time (or distance, or cost, etc.) it takes to traverse e
For a path P, the length of P is the sum of the lengths of all edges in P.
Designing the Algorithm
As each node v is added to the set S, we simply record the edge (u,v) on which it achieved the value d'(v) = min over edges e=(u,v), u in S, of d(u) + ℓe
if (u,v) is the edge we have stored for v, then Pv is just (recursively) the path Pu followed by the single edge (u,v).
Start at v, follow the edge we have stored for v in the reverse direction to u, then follow the edge we have stored for u in the reverse direction to its predecessor, and so on until we reach s
Analyzing the Algorithm
Greedy because it always forms the shortest new s-v path we can make from a path in S followed by a single edge
Each time it selects a path to a node v, that path is at least as short as every other possible path to v
Implementation and Running Time
n-1 iterations for a graph with n nodes
each iteration adds a new node v to S
for a graph with m edges, computing all these minima can take O(m) time per iteration → O(mn) overall
Data Structures to Improve Runtime
Using a priority queue, Dijkstra's Algorithm can be implemented on a graph with n nodes and m edges to run in O(m) time, plus the time for n ExtractMin and m ChangeKey operations; with a heap-based priority queue this gives O(m log n) overall
Maintain the values of the minima rather than recomputing each iteration
Keep the nodes V-S in a priority queue with d'(v) as their keys
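A compact sketch of the heap-based version, assuming the graph maps each node to a list of (neighbor, edge_length) pairs; instead of ChangeKey, this sketch pushes duplicate heap entries and skips stale ones, a common substitute when the queue lacks a decrease-key operation:

```python
import heapq

# Dijkstra's algorithm with a priority queue keyed on d'(v).
def dijkstra(graph, s):
    dist = {s: 0}
    pq = [(0, s)]
    explored = set()                    # the set S of explored nodes
    while pq:
        d, u = heapq.heappop(pq)
        if u in explored:
            continue                    # stale entry: u already finalized
        explored.add(u)
        for v, length in graph[u]:
            if v not in dist or d + length < dist[v]:
                dist[v] = d + length    # maintain the minima incrementally
                heapq.heappush(pq, (dist[v], v))
    return dist
```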
Personal Thoughts
This section provided more clarity to the class discussion on Dijkstra's algorithm. However, I did still struggle to understand the whole algorithm in action. I think with more examples and practice, the algorithm will actually be very simple to understand. Further, running through the algorithm with an example will help solidify why we want to use a priority queue.
Readability: 6.5
Interesting: 6
4.5 The Minimum Spanning Tree Problem
The Problem
Find a subset of the edges T in E, so that the graph (V,T) is connected and the total cost is as small as possible.
Let T be a minimum-cost solution to the network design problem defined above. Then (V,T) is a tree.
If some edge costs are 0, then there can be a minimum-cost solution with extra edges that could be deleted without increasing the cost
Subset T from E is a spanning tree of G if (V,T) is a tree
GOAL: Find the cheapest spanning tree
Designing Algorithms
Kruskal's Algorithm: start without any edges, insert edges from E in order of increasing cost; only insert an edge if it does not create a cycle. Otherwise, discard.
Prim's Algorithm: start with a root node s, try to grow greedily from s, each time we simply add the node that can be attached as cheaply as possible to the partial tree we already have; grow S by one node, adding the node v that minimizes the "attachment cost"
Reverse-Delete Algorithm: start with the full graph, delete edges in order of decreasing cost. Delete an edge as long as doing so does not disconnect the graph.
Analyzing the Algorithms
Make the assumption that all edge costs are distinct from one another
Cut property: Assume that all edge costs are distinct. Let S be any subset of nodes that is neither empty nor equal to all of V, and let edge e = (v,w) be the minimum-cost edge with one end in S and the other in V-S. Then every minimum spanning tree contains the edge e.
Suppose T is a spanning tree that doesn't contain e
On the path in T from v to w, identify an edge e' that also crosses from S to V-S; since e is the minimum-cost edge across the cut, e' is more expensive than e
Exchange e' for e to get another spanning tree T'
Cost of T' is less than that of T, contradicting the minimality of T, as desired.
Kruskal's Algorithm produces a minimum spanning tree of G
Consider any edge e = (v,w) added by the algorithm; let S be the set of all nodes to which v has a path at the moment just before e is added; only one end of e is in S, since adding e does not create a cycle; no cheaper edge from S to V-S has been encountered yet (edges are considered in increasing cost order); so e is the cheapest edge across the cut and by the Cut Property belongs to the minimum spanning tree.
Prim's Algorithm produces a minimum spanning tree of G
Cycle Property: Assume all edge costs are distinct. Let C be any cycle in G, and let edge e=(v,w) be the most expensive edge belonging to C. Then e does not belong to any minimum spanning tree of G.
The Reverse-Delete Algorithm produces a minimum spanning tree of G
Consider any edge e = (v,w) removed by Reverse-Delete. If edge e is removed, it lies on a cycle; since edges are considered in decreasing cost order, e is the most expensive edge on that cycle, so by the Cycle Property it is not in any minimum spanning tree. As long as the output is connected, it is a spanning tree, and the proof is done.
Combination of the Cut Property and the Cycle Property implies that any algorithm that builds a spanning tree by repeatedly including and removing edges when justified, will end up with a minimum spanning tree.
If edge costs are not distinct, perturb them by extremely small amounts to make them distinct. This creates tie-breakers to resolve comparisons.
Implementing Prim's Algorithm
Runtime: O(m log n)
Keep the nodes in a priority queue with these attachment costs a(v) as the keys
Select a node with extract min; update the attachment costs using change key
n-1 iterations in which we perform extract min
Perform change key at most once for each edge
Can be implemented with n nodes and m edges to run in O(m) time, plus the time for n extract min and m change key operations
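The steps above can be sketched as follows, assuming the graph maps nodes to lists of (neighbor, cost) pairs with positive costs; as in the Dijkstra sketch, stale heap entries are skipped in place of ChangeKey:

```python
import heapq

# Prim's algorithm: grow the tree from a root, always attaching the node
# with the smallest attachment cost a(v), kept in a min-heap.
def prim(graph, root):
    in_tree = set()
    tree_edges = []
    pq = [(0, root, None)]      # (attachment cost, node, tree endpoint)
    total = 0
    while pq:
        cost, v, u = heapq.heappop(pq)
        if v in in_tree:
            continue            # stale entry: v already attached
        in_tree.add(v)
        total += cost
        if u is not None:
            tree_edges.append((u, v))
        for w, c in graph[v]:
            if w not in in_tree:
                heapq.heappush(pq, (c, w, v))   # candidate attachment cost for w
    return tree_edges, total
```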
Extensions
Point-to-point distances: willing to reduce these even if we pay more for the set of edges
Congestion on the edges: one could seek a spanning tree in which no single edge carries more than a certain amount of traffic
Is a spanning tree even the right solution? Resilience as an explicit goal.
Personal Thoughts
This section was not overly hard to understand. A lot of the concepts were straightforward and we discussed them thoroughly in class. Not surprisingly, the part I struggled with was understanding the various proofs in this section. This section was more proof-heavy than some of the other sections, so that made the section a little more dense.
Readability: 6
Interesting: 6
4.6 Implementing Kruskal's Algorithm: The Union-Find Data Structure
Consider a scenario in which a graph evolves through the addition of edges: Fixed population of nodes that grows over time by having edges appear between certain pairs of nodes.
Data structure called Union-Find to store a representation of the components in a way that supports rapid searching and expansion.
The Problem
Union-Find data structure allows us to maintain disjoint sets
If we add edge (u,v) to the graph, first test if u and v are already in the same connected component. If not, Union(Find(u),Find(v)) to merge the two components
MakeUnionFind(S): where all elements are in separate sets
Find(u): return the name of set containing u
Union(A,B): change the data structure by merging the sets A and B into a single set. → O(log n)
A Simple Data Structure for Union-Find
Explicitly maintain the list of elements in each set, so we don't have to look through the whole array to find the elements that need updating.
Maintain an additional array size of length n, where size[A] is the size of set A, and when a Union(A,B) operation is performed, we use the name of the larger set for the union.
After these optimizations: a single Union operation can still take O(n) time in the worst case
Consider the array implementation of Union-Find data structure for some set S of size n, where unions keep the name of the larger set. The Find operation takes O(1) time, MakeUnionFind(S) takes O(n) time, and any sequence of k Union operations takes at most O(klogk) time.
Consider a sequence of k Union operations
Updating the array Component is the only part that takes more than O(1) time.
Component[v] gets updated at most log₂(2k) times, since the set containing v at least doubles in size each time its name changes. At most 2k elements are involved in any Union operations, giving a bound of O(k log k)
A Better Data Structure for Union-Find
Use pointers
each node contained in a record with an associated pointer to the name of the set that contains v
Two sets A and B
Pointer-based data structure implements Union in O(1) time
Find operation: the number of steps needed is exactly the number of times the set containing node u had to change its name; without further care, the number of times the pointer for u is updated can be as large as O(n)
To optimize, we keep the name of the larger set as the name of the union.
We will maintain an additional field with the nodes: the size of the corresponding set.
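A minimal sketch of this structure with the union-by-size rule; the representation (a parent dict in place of records and pointers, no path compression yet) and the class name are my own choices, not the book's:

```python
# Union-Find with union by size: each element points toward the name
# (root) of its set; Union always keeps the larger set's name.
class UnionFind:
    def __init__(self, elements):
        self.parent = {x: x for x in elements}   # MakeUnionFind: singleton sets
        self.size = {x: 1 for x in elements}

    def find(self, u):
        while self.parent[u] != u:               # follow pointers to the set's name
            u = self.parent[u]
        return u

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return                               # already in the same set
        if self.size[ra] < self.size[rb]:
            ra, rb = rb, ra                      # keep the larger set's name
        self.parent[rb] = ra
        self.size[ra] += self.size[rb]
```

Because a set's name only changes when it merges into a set at least as large, any element's set at least doubles per name change, which is what bounds Find at O(log n).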
Further Improvements
Implementing Kruskal's Algorithm
Use Union-Find data structure
Sort edges by cost: O(m log m)
At most one edge between any pair of nodes, so m ≤ n² and O(m log m) = O(m log n)
Union-Find structure is used to maintain the connected components as edges are added
For each edge considered, compute Find(u) and Find(v). Check equality to see if they belong to different components
Merge if the algorithm decides to include edge e in the tree T
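Putting the pieces together, a self-contained sketch (edges as (cost, u, v) triples; the inlined Union-Find mirrors the structure above, names illustrative):

```python
# Kruskal's algorithm on top of a minimal Union-Find: consider edges in
# increasing cost order, keep an edge only if its endpoints lie in
# different components, and merge the components when it is kept.
def kruskal(nodes, edges):
    parent = {v: v for v in nodes}               # MakeUnionFind
    size = {v: 1 for v in nodes}

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    tree = []
    for cost, u, v in sorted(edges):             # O(m log m) sort by cost
        ru, rv = find(u), find(v)
        if ru != rv:                             # different components: no cycle
            if size[ru] < size[rv]:
                ru, rv = rv, ru                  # union by size
            parent[rv] = ru
            size[ru] += size[rv]
            tree.append((cost, u, v))
    return tree
```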
Personal Thoughts
Just like the last section, this section was pretty reasonable to understand. We did go through this material in class, so that helped. The data operations were pretty simple, though there was a lot of variability in implementation style, which was a little confusing to keep track of. Overall, not too bad though!
Readability: 6.5
Interesting: 6.5
4.7 Clustering
The Problem
Clusterings of Maximum Spacing: Given a set U of n objects; for each pair, there is a numerical distance; distances must be greater than zero and symmetric (d(pi,pj) = d(pj,pi)); we want a k-clustering with k groups; the spacing of a k-clustering is the minimum distance between any pair of points lying in different clusters
Designing the Algorithm
Consider growing a graph on the vertex set U
Grow a graph H on U edge by edge
Procedure: Kruskal's Algorithm
Analyzing the Algorithm
The components C1, C2, …, Ck formed by deleting the k-1 most expensive edges of the minimum spanning tree T constitute a k-clustering of maximum spacing
Spacing of C: the length d* of the (k-1)st most expensive edge in T
Some other k-clustering C' partitions U into nonempty sets.
The two clusterings C and C' are not the same
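The procedure can be sketched by running Kruskal's merging process and stopping once only k components remain, which is equivalent to deleting the k-1 most expensive MST edges; representation and names here are illustrative:

```python
# Max-spacing k-clustering: merge clusters in increasing distance order
# (Kruskal's process) and stop when exactly k components are left.
def k_clustering(points, dists, k):
    parent = {p: p for p in points}              # minimal Union-Find

    def find(u):
        while parent[u] != u:
            u = parent[u]
        return u

    components = len(points)
    for d, p, q in sorted(dists):                # dists: (distance, point, point)
        if components == k:
            break                                # k clusters reached: stop
        rp, rq = find(p), find(q)
        if rp != rq:
            parent[rq] = rp                      # merge two clusters
            components -= 1
    clusters = {}
    for p in points:
        clusters.setdefault(find(p), set()).add(p)
    return list(clusters.values())
```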
Personal Thoughts
This section was not bad, because it was very general information. The nitty gritty details that I'm sure are related to clustering were not really covered. The overall idea of clustering makes a lot of sense and the basic algorithm was not complicated to follow (especially since it built off of the last section's Kruskal's Algorithm). I am looking forward to seeing more examples/applications of clustering.
Readability: 7
Interesting: 7
4.8 Huffman Codes and Data Compression
Greedy rule to shrink the size of the problem instance so that an equivalent smaller problem can then be solved by recursion
Data Compression: area that forms part of the foundations for digital communication.
The Problem
Designing the Algorithm
Representing Prefix Codes Using Binary Trees
Rooted binary tree T: number of leaves is equal to the size of the alphabet S; each leaf has a distinct letter in S
For each letter x in S, we follow the path from the root to the leaf labeled x; each time the path goes from a node to its left child, we write down a 0, and each time the path goes from a node to its right child, we write down a 1.
The encoding of S constructed from T is a prefix code.
Given a prefix code γ, we can build the corresponding binary tree recursively
A binary tree is full if each node that is not a leaf has two children: there are no nodes with exactly one child.
Binary tree corresponding to the optimal prefix code is full
Proof using an exchange argument
T = binary tree corresponding to the optimal prefix code
Suppose it contains a node u with exactly one child v
Convert T into T' by replacing node u with v.
if u = root, delete u and use v as root
otherwise, let w be the parent of u in T, delete u, and make v be a child of w in place of u
A First Attempt: The Top-Down Approach
What If We Knew The Tree Structure of the Optimal Prefix Code?
Assume we have the binary tree T* corresponding to an optimal prefix code - need to figure out the labeling
Suppose that u and v are leaves of T*, such that depth(u) < depth(v). Further suppose that in a labeling of T* corresponding to an optimal prefix code, leaf u is labeled with y in S and leaf v is labeled with z in S. Then fy >= fz.
Proof using an exchange argument
if fy < fz, then exchanging the labels y and z has the following effect: the multiplier on fy increases [from depth(u) to depth(v)] and the multiplier on fz decreases by the same amount [from depth(v) to depth(u)], so the total encoding length decreases – contradicting optimality.
So an optimal solution never places a lower-frequency letter at a strictly smaller depth than some higher-frequency letter
Take all leaves of the smallest depth, label them with the highest-frequency letters; do the same with each level of increasing depth → leads to an optimal labeling of T*
w is a leaf of T*
An Algorithm to Construct an Optimal Prefix Code
Huffman's Algorithm
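Huffman's merging rule can be sketched compactly: repeatedly pull the two lowest-frequency subtrees from a priority queue and merge them. The representation (nested tuples for trees, a counter to break frequency ties) is my own choice, not the book's pseudocode:

```python
import heapq

# Huffman's algorithm: merge the two lowest-frequency trees until one
# tree remains, then read codes off the root-to-leaf paths (left=0, right=1).
def huffman_codes(freqs):
    heap = [(f, i, letter) for i, (letter, f) in enumerate(freqs.items())]
    heapq.heapify(heap)
    counter = len(heap)                     # tie-breaker so trees are never compared
    while len(heap) > 1:
        f1, _, t1 = heapq.heappop(heap)     # two lowest-frequency subtrees
        f2, _, t2 = heapq.heappop(heap)
        heapq.heappush(heap, (f1 + f2, counter, (t1, t2)))
        counter += 1
    codes = {}
    def walk(tree, prefix):
        if isinstance(tree, tuple):
            walk(tree[0], prefix + "0")     # left child: write down a 0
            walk(tree[1], prefix + "1")     # right child: write down a 1
        else:
            codes[tree] = prefix            # leaf: a letter of the alphabet
    walk(heap[0][2], "")
    return codes
```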
Analyzing the Algorithm
Extensions
Data Compression limits: black-and-white images already use one bit per pixel, so this encoding cannot be improved further
Cannot adapt to changes in the text: frequencies may be stable for half the text, but then change radically
Personal Thoughts
While arguably one of the more interesting topics of this textbook, this section was long and very dense. There was a lot of information that was hard to digest all at once. Our discussion in class was helpful, but not long enough to help make this section very clear. I will most likely need to review this chapter section again.
Readability: 5 Interesting: 8