Summary: Section 3.3 is Implementing Graph Traversal Using Queues and Stacks. This section finally addresses using lists vs arrays to represent graphs, and whether a queue or a stack is better for implementing BFS and DFS. There are two ways to represent a graph: adjacency matrix or adjacency list. While the authors introduce both, they choose to use the adjacency list. Before weighing the pros and cons of an adjacency matrix vs a list, the authors comment on using both input parameters (number of nodes n and number of edges m) to state runtimes, because it is difficult to say, for example, whether O(m^2) or O(n^3) is better; it depends on the relationship between the number of nodes and edges. Since polynomial runtime is the general goal, the target in this section is O(m+n). In terms of space requirements, the adjacency list wins: the adjacency matrix requires O(n^2) space, whereas the adjacency list requires only O(m+n) space. In terms of accessing information, the authors also argue in favor of the adjacency list. They say that "if the algorithm is currently looking at a node u, it can read the list of neighbors in constant time per neighbor" (p 89), which leads to a natural sense of "exploring" the graph. Before moving on to the implementation details of BFS and DFS, the section reviews queues and stacks: queues are FIFO, stacks are LIFO, and both can be implemented with a doubly-linked list that keeps pointers to both the head and the tail. BFS will be implemented using a queue, and DFS with a stack.
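To make the space trade-off concrete for myself, here are both representations of the same tiny graph. This is my own toy example, not one from the book:

```python
# The same 4-node graph in both representations (my own toy example).
# Edges: 1-2, 1-3, 3-4  (n = 4 nodes, m = 3 edges)

# Adjacency matrix: n x n entries regardless of m, so O(n^2) space.
matrix = [
    [0, 1, 1, 0],   # node 1's row
    [1, 0, 0, 0],   # node 2's row
    [1, 0, 0, 1],   # node 3's row
    [0, 0, 1, 0],   # node 4's row
]

# Adjacency list: one entry per edge endpoint plus one list per node,
# so O(m + n) space, and the neighbors of a node can be scanned in
# constant time per neighbor.
adj = {
    1: [2, 3],
    2: [1],
    3: [1, 4],
    4: [3],
}

# Checking whether a specific edge exists is O(1) in the matrix but
# requires scanning a neighbor list in the list representation:
has_edge_matrix = matrix[0][1] == 1   # is 1-2 an edge?
has_edge_list = 2 in adj[1]           # same check via the list
```

Note that each undirected edge appears twice in the adjacency list (once per endpoint), which is why the total list length is 2m.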
About the Algorithms: There are two algorithms in this section: implementing BFS and implementing DFS.
In BFS, we need to scan edges and only add a node to a layer if it has not yet been discovered. As such, we create an array Discovered to maintain knowledge about which nodes have been discovered. Recall, BFS builds layers based on which nodes are connected to each other. We maintain the layers in an array: L[0] contains the starting node; L[1] contains the nodes in layer 1. It actually does not matter how the lists of nodes are implemented (as queues or stacks) because "the algorithm is allowed to consider the nodes in a layer L_i in any order" (p 91). To set up, the algorithm sets Discovered[s] to true for the starting node s and false for all other nodes, initializes L[0] to include only node s, sets the layer counter i to 0, and sets the BFS tree to an empty tree. The algorithm runs by checking whether the current layer L[i] is empty. If it's nonempty, it initializes an empty list L[i+1]. Then for each node u in L[i], it considers each edge (u, v) incident to u. If the other endpoint v is not already discovered, it marks v as discovered, adds the edge (u, v) to the tree, and adds v to L[i+1]. Because the graph is represented as an adjacency list, it can consider the next neighbor of a node in constant time (as mentioned in Section 3.2). If the graph is represented using an adjacency list, BFS is O(m+n). Using a queue in the algorithm is mostly a side note, where the authors say that it doesn't change anything if you use a single list L implemented as a queue rather than an array L of lists L[i]. The subsection closes with a remark about the runtime for finding the set of all connected components of a graph, introduced in the previous section. That process is O(m+n).
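The layered BFS described above can be sketched in Python. This follows the section's outline (Discovered array, layers L[i], tree edges), but the function and variable names are my own, and I'm assuming the graph is given as a dict mapping each node to its neighbor list:

```python
def bfs_layers(adj, s):
    """Layered BFS per the section's outline. adj maps each node to a
    list of its neighbors (adjacency-list representation); s is the
    starting node. Returns (layers, tree_edges)."""
    discovered = {u: False for u in adj}   # the Discovered array
    discovered[s] = True                   # Discovered[s] = true
    layers = [[s]]                         # L[0] contains only s
    tree = []                              # edges of the BFS tree
    i = 0
    while layers[i]:                       # while layer L[i] is nonempty
        next_layer = []                    # initialize an empty L[i+1]
        for u in layers[i]:
            for v in adj[u]:               # each edge (u, v) incident to u
                if not discovered[v]:
                    discovered[v] = True
                    tree.append((u, v))    # add edge (u, v) to the tree
                    next_layer.append(v)   # add v to L[i+1]
        layers.append(next_layer)
        i += 1
    return layers[:-1], tree               # drop the final empty layer
```

Each edge (u, v) is examined at most twice (once from u, once from v) and each node is appended to exactly one layer, which is where the O(m+n) bound comes from.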
The last section defined DFS as a recursive process, but in this section we see that we're really maintaining the to-be-processed nodes in a stack. There are two main differences between BFS and DFS. DFS is "more impulsive" than BFS, so when "it explores a node u, it scans the neighbors of u until it finds the first not-yet-explored node v (if any), and then it immediately shifts attention to exploring v" (p 92). Contrast that with BFS, where a node's neighbors aren't explored until all the nodes in the current layer have been discovered. BFS keeps track of Discovered nodes, whereas DFS keeps track of Explored nodes. The other difference is that we have to keep track of each node's parent and only add the edge (u, parent[u]) to the tree when u becomes Explored. To set up (including making the tree, which the authors failed to include in the algorithm), the algorithm initializes the stack to contain the starting node s, sets Explored to false for all nodes, initializes an array for parents, and sets the DFS tree to an empty tree. The algorithm runs by checking if the stack is nonempty. While it's nonempty, it pops a node u from the stack. If that node is not yet marked as Explored, it sets Explored[u] to true, adds edge (u, parent[u]) to the tree (unless u = s, which is the root and has no parent edge), and then adds each neighboring node v to the stack. When a node v is added to the stack, parent[v] gets set to u (this can and will get overwritten by later pushes). If the graph is represented using an adjacency list, DFS is also O(m+n).
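Here's the stack-based DFS sketched in Python, with the tree-building steps (which the book's pseudocode omits) filled in per the section's prose. The names are my own, and I again assume the adjacency list is a dict of neighbor lists:

```python
def dfs_stack(adj, s):
    """Stack-based DFS per the section's outline, including the
    tree-building steps the book's pseudocode leaves out. adj is an
    adjacency list (dict of neighbor lists); s is the starting node.
    Returns the list of DFS tree edges."""
    explored = {u: False for u in adj}    # the Explored array
    parent = {}                           # parent[v], may be overwritten
    tree = []                             # edges of the DFS tree
    stack = [s]                           # stack initially contains s
    while stack:                          # while the stack is nonempty
        u = stack.pop()                   # pop a node u
        if not explored[u]:
            explored[u] = True
            if u != s:                    # s is the root: no parent edge
                tree.append((parent[u], u))
            for v in adj[u]:              # push each neighbor of u
                if not explored[v]:
                    parent[v] = u         # overwritten by later pushes
                    stack.append(v)
    return tree
```

The overwriting of parent[v] is what makes this correct: only the value in place when v is actually popped and marked Explored determines the tree edge, which matches the "immediately shift attention" behavior.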
My Questions: The authors note: “a running time of O(m+n) is the same as O(m), since m >= n-1” (p 87). Why, then, do we bother saying runtimes are O(m+n) when, according to the authors, it's the same as O(m)? Also, why did the authors decide to only write out a partial DFS algorithm? The algorithm on page 93 does not include the processes involved with adding edges to the tree, which is annoying.
Second Time Around: I don't think I was aware the first time talking about BFS in class that we're representing the layers like an adjacency list: an array of lists. That's clear to me now.
Note to Self: The subsection about finding the set of all connected components made a comment that BFS and DFS "spend work only on edges and nodes in the connected component containing the starting node. (they never see any of the other nodes or edges)" (p 94). That's why finding the set of all connected components is still O(m+n): each run of BFS or DFS only touches its own component's nodes and edges, and since every node and edge belongs to exactly one component, the costs of all the runs sum to O(m+n) total.
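To convince myself of that summing argument, here's a sketch of the all-components process: loop over every node, and restart a BFS only from nodes not yet discovered. This is my own code, not the book's:

```python
def connected_components(adj):
    """Find all connected components by restarting BFS from each
    not-yet-discovered node. Every node and edge is touched by exactly
    one BFS run, so total work is O(m + n). adj is a dict mapping each
    node to its neighbor list."""
    discovered = {u: False for u in adj}
    components = []
    for s in adj:
        if discovered[s]:
            continue                  # s lies in an earlier component
        discovered[s] = True
        component, frontier = [s], [s]
        while frontier:               # plain BFS inside s's component
            nxt = []
            for u in frontier:
                for v in adj[u]:
                    if not discovered[v]:
                        discovered[v] = True
                        component.append(v)
                        nxt.append(v)
            frontier = nxt
        components.append(component)
    return components
```

The outer loop's O(n) scan over all nodes is absorbed into the O(m+n) bound, since the per-component BFS work never repeats.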
Readability: 6. I don't think the authors did a great job of making the algorithm implementations clear. I had to read a few things multiple times. For example, it's not clear whether the sentence, "The adjacency list data structure is ideal for implementing breadth-first search" (p 90), refers to having the graph in that form, making the layer data structure an adjacency list, or both. Also, the sentence, "each node occurs on at most one list" (p 91), confused me at first because I was thinking of the graph's adjacency lists, where each node is actually represented twice, but it's talking about the layers. It's true that each node appears in only one layer. Other things are stated without much explanation. "Note that there are at most n lists L[i]" doesn't have an explanation, but I presume it's because in the worst case the graph is a line and there is only one node per layer. Also, "we spend O(1) time considering each edge" doesn't have an explanation, but I presume that's true because we're assuming the graph is represented using an adjacency list and you can get to the next edge in a node's list in constant time.