Chapter 3

Chapter 3

Section 3.1 Definitions and Applications for graphs

A graph is a structure that represents relationships between two elements. They consists of nodes or vertices V and edges E which join the nodes to one another. An edge can be represented as a two-element subset of V: e = {u,v} where u and v are nodes at either end of e. The edges of some graphs have an order, meaning that the relationship moves in one direction but not the other. The edges in such a graph are represented by and ordered pair (u,v) where u is the head and v is the tail.

Graphs serve as important models for lots of different types of networks. Some examples from the book are listed below:

Transportation networks with airports as nodes and non-stop flights as edges
Lan networks with computers as nodes and physical links as edges
Networks with all the machines using one ISP as nodes and ISPs with peering relationships as edges
The internet, with webpages as nodes and hyperlinks as edges
Social networks with people as nodes and relationships between them as edges

Essential to the concept of graphs is the idea of paths and connectivity. A path is a sequence of connected nodes from one vertex to another. A cycle is a special case of a path that occurs when the path begins and ends at the same vertex. An undirected graph is connected if there is a path from every vertex u to v. A directed graph is strongly connected if for every node u and v there is an edge (u,v) and an edge (v,u). The distance of a path is the number of edges in the path. A tree is a special case of a graph that occurs when the undirected graph is connected and there are no cycles. Since there are no cycles in a tree, it is a good way to represent a hierarchy. Additionally, since there are no cycles, each vertex is only connected to its parent and therefore in a tree with n nodes, there are n-1 edges.

This chapter was a breeze to read since it was mostly just term definitions and little uppper-level conceptual analysis.

Section 3.2 Graph Connectivity and Traversal

Breadth-First Search is a method of search through a graph to assemble the connected component. It does this by adding a “layer,” or the set of nodes connected to the root node s, and then recursively calling itself on each node in the layer. A few interesting properties of the breadth-first search and its resultant tree are that for any nodes x and y in layers L_i and L_j joined by an edge (x,y), i and j can differ by at most 1. Additionally, the nodes returned by a BFS started at node n are precisely those in the connected component of the graph containing n.

The Depth-First Search is another method for finding all the nodes reachable from the starting node. Instead of visiting all the nodes layer by layer, however, this algorithm follows edges to unexplored nodes along one path until it reaches a dead end, at which point it will backtrack through all of its explored nodes until it finds one with an edge to an unexplored node. Now it will take that advance along the unexplored nodes in that direction until it again reaches a dead end and has to repeat the process. The algorithm will continue this until all the nodes are visited. As opposed to a Breadth-First Search where all the nodes connected by an edge must be no more than one layer apart, the Depth-First search can create trees with whose nodes are separated by more than one edge, but one of these must be a descendant of the other.

Qualitatively, these two algorithms will have roughly the same running times and efficiency since they will both be traversing the same number of nodes and edges. This is due to the fact that for two nodes in a graph, their connected components are either completely identical or disjoint (KT p. 86).

I found this chapter easily readable although a few of the proofs took a few minutes of digestion to fully comprehend.

Section 3.3 Implementing Graph Traversal Using Queues and Stacks

Breadth-First Search and Depth-First search often produce quite different trees, but their mechanics are very similar and in fact their essential difference is one's using a queue versus the other's using a stack.

Breadth-First Search works in O(m+n) time to assemble the Breadth-First search tree. It does this by making a List L[i] with i = 0, so tjis only contains the root node s. It then makes an empty list L[i + 1] and adds the undiscovered nodes enumerated in adjacency lists of each node in L[i] and marks those nodes as discovered. Finally, it increments i to advance the loop. This runs in O(n+m) because besides setting up the lists in n time, a for loop runs in m time considering each edge. It runs in m time because it must consider to degree of each node, or 2m.

Depth-First Search makes a stack which initially only contains the root node s. It then enters a loop which takes the first item off the stack, marks it as explored, adds all its adjacent nodes to the stack and repeats the loop until the stack is empty. This algorithm also runs in O(n+m) time since everything within the loop runs in constant time and the loop executes as many times as the sum of the degree of all the nodes in the graph, which is 2m.

The one thing I didn't understand about this chapter is the how a BFS could be implemented using a queue. The chapter promised that this was true but didn't really explain how, at least in my eyes.

Section 3.4 Testing Bipartiteness: an Application of BFS

A bipartite graph is one that can be separated into two sets such that every edge has one edge in X and the other in Y. This can be represented by a BFS tree by labelling all the odd layers as the layers which contain nodes in the X partition and the even layers contain nodes in the Y partition. Then it is easy to see that there can not be an edge that begins and ends in the same layer, and by the definition of a BFS, there can not be an edge that spans more than one layer difference from the layer in which it has one end. It follows from this then that a bipartite graph can not contain any odd cycles, since an odd cycle must contain an edge that begins and ends in the same layer of the BFS tree.

This lays down the basis for a very easy addition to be made to the BSF algorithm that will help determine whether a graph is bipartite or not. Simply add an array of size n called Partition and assign all the nodes n in an odd layer to be Partition[n] = X and all the nodes in even layers to be Partition[n] = Y. If any edge has both ends being Partition[n] = X or Partition[n] = Y, then the graph is bipartite. Since this all occurs in constant time, it does not bump up the O(n+m) running time of BFS.

This section was fascinating in that it demonstrated that determining bipartiteness of a graph is reliant on the same algorithmic skeleton as the BFS algorithm.

Section 3.5 Connectivity in Directed Graphs

First, directed graphs require two adjacency lists in order to be properly represented since an edge may go from node u to node v but not return from v to u. One lists represents the edges that leave a node and the nodes that those edges connect to, and the other list represents the edges which lead to a node and from which node they come. They are called the from list and the to list, respectively.

The graph search algorithms BFS and DFS perform almost the same with directed graphs as they do with undirected graphs. The major difference being that they use a node's from list to traverse through the nodes which the current node has an edge from itself to them. It is important to note that in a directed graph, a node n can have a path to node r, but node r is not required to have a path back to node n. The graph traversals are discovering nodes to which n has a path; these paths are not necessarily reciprocated.

It is interesting that these traversals are finding all the nodes to which node n has a path, but to find all the nodes with paths to n, the same traversal can be run on an identical graph with the directions of the edges reversed. This leads into a discussion of strong connectivity and mutual reachability.

A graph is strongly connected if all nodes u and v which have paths (u,v) also have paths (v,u). In other words, if all nodes are mutually reachable to one another. An interesting property of mutual reachability is that if two nodes are mutually reachable and a third node is mutually reachable to one of those two, then it is mutually reachable to the other also.

The strong component is an analog to the connected component in an undirected graph. The strong component in a graph G is the set of nodes n which are mutually reachable from the nodes s. The a set of nodes is a strong component if that set is returned by a BFS started at node n on graph G and also returned by a BFS started at node n on a graph G^rev (the graph G with the directions of the edges reversed.) The strong component also has an analogous property to the connected component of undirected graphs that it is identical or disjoint for two nodes.

Section 3.6 DAGS and Topological Orderings

A Directed Acyclic Graph (DAG) is just what it sounds like, a directed graph containing no cycles. These are useful for representing precedence relations and dependencies because they don't contain cycles. A cycle in a dependency relation would mean that one process could not start until another had taken place, which would never happen because none could start first.

These types of graphs have can also be represented by a structure known as a topological ordering, which the book defines as “an ordering of the nodes v₁,…,v_n such that for every edge (v_i,v_j), i<j”. Every topological ordering represents a DAG since a topological ordering must have directed edges of course, and these edges can not be in a cycle because eventually that would violate the condition that for every edge (v_i,v_j), i<j.

It is also true that every DAG has a topological ordering, since there must be a node that has no incoming edges due to the fact that there are no cycles in a DAG. Once this node has been removed and removing a node cannot create a cycle, then there is a DAG remaining with all the nodes that were in the previous DAG, except the removed node. This truth can be used to devise an algorithm that runs in O(n²) time:

1.find the node with no incoming edges

2.delete this node

3.recursively compute the topological ordering of the DAG without the removed node and add this after the deleted node.

There is yet a faster, linear-time method to computing the topological ordering of the DAG, but I could not quite grasp what this was. It dealt with active nodes and edges coming from those nodes into the current node.

Table of Contents