Chapter 4 – Greedy Algorithms

My notes on the assigned sections of Chapter 4 of Algorithm Design by Jon Kleinberg and Éva Tardos. This chapter details greedy algorithms. A greedy algorithm is an algorithm that “builds up a solution in small steps, choosing a decision at each step myopically to optimize some underlying criterion.”

4.1 – Interval Scheduling: The Greedy Algorithm Stays Ahead

The main challenge in designing a good greedy algorithm is deciding which rule to use for the selection process. For interval scheduling, rules that might seem good are earliest start time and shortest interval length, but both fail on simple counterexamples. The rule that works is to pick the request that finishes first. This way, we maximize the time left over to service the other requests. To prove that our greedy solution, A, is optimal, we show that A contains the same number of intervals as an optimal solution, O. We do this by comparing partial solutions step by step: for every r, the r-th interval in A finishes no later than the r-th interval in O, so the greedy algorithm "stays ahead."

Interval Scheduling Algorithm:

Sort jobs by finish times so that f1 ≤ f2 ≤ ... ≤ fn
G = {}
for j = 1 to n
    if job j is compatible with G
        G = G ∪ {j}
return G

This algorithm runs in O(n log n) time. Sorting the requests takes O(n log n), constructing an array of the sorted requests takes O(n), and the single pass through the n intervals takes O(n), since each interval only needs its start time compared against the finish time of the last job added to G. The sort dominates, so the overall running time is O(n log n).
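As a rough illustration of the earliest-finish-time rule, here is a minimal Python sketch. The function name `schedule_intervals` and the representation of a request as a `(start, finish)` tuple are my own assumptions, not something from the book, and I treat a request as compatible when it starts no earlier than the last selected finish time.

```python
def schedule_intervals(requests):
    """Greedy interval scheduling sketch: requests are (start, finish) tuples."""
    selected = []
    last_finish = float("-inf")
    # Sort by finish time; this is the O(n log n) step.
    for start, finish in sorted(requests, key=lambda r: r[1]):
        # Assumption: a request is compatible if it starts at or after
        # the finish time of the most recently selected request.
        if start >= last_finish:
            selected.append((start, finish))
            last_finish = finish
    return selected


if __name__ == "__main__":
    jobs = [(0, 6), (1, 4), (3, 5), (5, 7), (3, 8), (5, 9), (6, 10), (8, 11)]
    print(schedule_intervals(jobs))
```

Because only the last selected finish time is tracked, the compatibility check is a single comparison per request, which matches the constant-time-per-interval claim above.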

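A similar problem is the Interval Partitioning Problem, in which we must partition all intervals across multiple resources (e.g., scheduling classes in classrooms) so that no two overlapping intervals share a resource. The depth of a set of intervals is the maximum number of intervals that pass over any single point on the timeline. In any instance of Interval Partitioning, the number of resources needed is at least the depth of the set of intervals, because the intervals overlapping at that point must all go to different resources.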
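To make the depth bound concrete, here is a small Python sketch that computes the depth with a sweep over the endpoints. The name `depth` and the `(start, finish)` tuple format are assumptions for illustration. I also assume that an interval ending at time t does not overlap one starting at t.

```python
def depth(intervals):
    """Maximum number of (start, finish) intervals covering any single point."""
    events = []
    for start, finish in intervals:
        events.append((start, 1))    # an interval opens
        events.append((finish, -1))  # an interval closes
    # Tuples sort by time first; at equal times -1 sorts before +1,
    # so touching intervals are not counted as overlapping (an assumption).
    events.sort()
    best = current = 0
    for _, delta in events:
        current += delta
        best = max(best, current)
    return best


if __name__ == "__main__":
    print(depth([(0, 3), (1, 4), (2, 5), (5, 6)]))
```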

Interval Partitioning Algorithm:

Sort intervals by starting time so that s1 ≤ s2 ≤ ... ≤ sn
d = 0
for j = 1 to n
    if lecture j is compatible with some classroom k
        schedule lecture j in classroom k
    else
        allocate a new classroom d + 1
        schedule lecture j in classroom d + 1
        d = d + 1

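This algorithm runs in O(n log n) time. Sorting by start time takes O(n log n). For each classroom k, we maintain the finish time of the last job added, and we keep the classrooms in a priority queue keyed by that finish time. To check whether lecture j is compatible with some classroom, we only need to look at the classroom with the smallest key, and each priority queue operation takes O(log n).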
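The priority queue idea can be sketched in Python using the standard `heapq` module. The function name `partition_intervals`, the `(start, finish)` tuple format, and the zero-based classroom numbering are assumptions of this sketch, and a classroom is treated as free when its last lecture finishes at or before the new start time.

```python
import heapq


def partition_intervals(lectures):
    """Assign each (start, finish) lecture to a classroom index.

    Returns (schedule, room_count), where schedule is a list of
    (start, finish, room) tuples in order of start time.
    """
    schedule = []
    rooms = []  # min-heap of (finish time of last lecture, room index)
    for start, finish in sorted(lectures):
        if rooms and rooms[0][0] <= start:
            # The classroom that frees up earliest is compatible: reuse it.
            _, room = heapq.heappop(rooms)
        else:
            # No classroom is free, so allocate a new one.
            room = len(rooms)
        heapq.heappush(rooms, (finish, room))
        schedule.append((start, finish, room))
    return schedule, len(rooms)


if __name__ == "__main__":
    print(partition_intervals([(0, 3), (1, 4), (2, 5), (3, 6), (4, 7)]))
```

Every allocated classroom stays in the heap, so `len(rooms)` at the end is the number of classrooms used.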

I'd give this section an 8/10 on both readability and interestingness.

4.2 – Scheduling to Minimize Lateness: An Exchange Argument

In this problem, we have a single resource and a set of n requests to use this resource. Instead of a fixed start and finish time, each request j has a deadline, dj, and requires a contiguous time interval of length tj; it can be scheduled at any time, but it is late if it finishes after its deadline. The lateness of a request is its finish time minus its deadline, or 0 if it finishes on time. The goal is to schedule all requests without overlap so as to minimize the maximum lateness over all requests. The correct greedy approach is to sort the jobs in increasing order of their deadlines and schedule them in that order with no idle time. This way, jobs with earlier deadlines get completed earlier.

Minimize Lateness:

Sort n jobs by deadline so that d1 ≤ d2 ≤ … ≤ dn
t = 0
for j = 1 to n
    Assign job j to interval [t, t + tj]
    sj = t
    fj = t + tj
    t = t + tj
output intervals [sj, fj]
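A minimal Python sketch of the earliest-deadline-first rule follows. The function name `minimize_lateness` and the `(length, deadline)` tuple format are assumptions for illustration, and the job data in the usage example is made up.

```python
def minimize_lateness(jobs):
    """Schedule (length, deadline) jobs in order of increasing deadline.

    Returns (schedule, max_lateness), where schedule is a list of
    (start, finish, deadline) tuples.
    """
    t = 0
    schedule = []
    max_lateness = 0  # starting at 0 means on-time jobs count as lateness 0
    for length, deadline in sorted(jobs, key=lambda j: j[1]):
        start, finish = t, t + length
        schedule.append((start, finish, deadline))
        max_lateness = max(max_lateness, finish - deadline)
        t = finish  # no idle time between consecutive jobs
    return schedule, max_lateness


if __name__ == "__main__":
    jobs = [(3, 6), (2, 8), (1, 9), (4, 9), (3, 14), (2, 15)]
    print(minimize_lateness(jobs))
```

In this example the only late job is the one with length 4 and deadline 9, which finishes at time 10.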

I'd give this section a 10/10 on readability and a 7/10 on interestingness.

4.4 – Shortest Paths in a Graph

This section examines a greedy algorithm that computes the shortest paths from a designated starting node, s, to every other node, assuming that edge lengths are nonnegative and that s has a path to every other node in the graph. The algorithm was proposed by Edsger Dijkstra in 1959 and became known as Dijkstra's Algorithm.

Dijkstra's Algorithm:

Dijkstra's Algorithm (G, l):
Let S be the set of explored nodes
    For each u in S, we store a distance d(u)
Initially S = {s} and d(s) = 0
While S != V
    Select a node v not in S with at least one edge from S for which
        d'(v) = min(e=(u,v):u in S) d(u) + le is as small as possible
    Add v to S and define d(v) = d'(v)
EndWhile

Dijkstra's Algorithm is greedy because “we always form the shortest new s-v path we can make from a path in S followed by a single edge”. Its correctness can be proved using a Greedy Stays Ahead proof: each time the algorithm selects a path to a node v, that path is at least as short as any other possible s-v path. At first glance the runtime looks like O(mn), since each of the n iterations may examine every edge, but using the right data structure drastically improves efficiency. We keep the nodes in V - S in a priority queue with d'(v) as their key. The algorithm then runs in O(m) time, “plus the time for n ExtractMin and m ChangeKey operations”. With a heap-based priority queue each of these operations takes O(log n), so the overall runtime is O(m log n), where m is the number of edges and n is the number of nodes.
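Here is a rough Python sketch of the priority queue version. Python's `heapq` module has no ChangeKey operation, so this sketch swaps in a common workaround: it pushes a new entry whenever a shorter tentative distance is found and skips stale entries when they are popped. The function name `dijkstra` and the adjacency-list format (a dict mapping each node to a list of `(neighbor, length)` pairs) are assumptions for illustration.

```python
import heapq


def dijkstra(graph, s):
    """Shortest-path distances from s, assuming nonnegative edge lengths.

    graph: dict mapping node -> list of (neighbor, edge_length) pairs.
    """
    dist = {}  # plays the role of S: explored nodes with final d(u)
    heap = [(0, s)]  # entries are (tentative distance d'(v), v)
    while heap:
        d, u = heapq.heappop(heap)
        if u in dist:
            continue  # stale entry: u was already explored with a smaller key
        dist[u] = d
        for v, length in graph.get(u, []):
            if v not in dist:
                # Stand-in for ChangeKey: push a fresh entry instead.
                heapq.heappush(heap, (d + length, v))
    return dist


if __name__ == "__main__":
    g = {"s": [("a", 1), ("b", 4)], "a": [("b", 2), ("c", 6)], "b": [("c", 3)], "c": []}
    print(dijkstra(g, "s"))
```

With this workaround the heap can hold up to m entries, so the bound is O(m log m), which is still O(m log n) because m is at most n².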

This section was very readable; I'd give it an 8/10 on readability and a 7/10 on interestingness.
