Chapter 7: Network Flow
En route to developing a polynomial-time algorithm for finding the largest matching in a bipartite graph, we will formulate a general class of problems, network flow problems, from which we can narrow down to bipartite matching.
7.1: The Max-Flow Problem and the Ford-Fulkerson Algorithm
First, we define a flow network to be a directed graph in which:
- each edge e has an associated capacity, a nonnegative number denoted ce
- there exists a single source node s
- and a single sink node t.
All other nodes will be called internal nodes. Initially, we assume that no edges enter the source or leave the sink, and that all capacities are integers.
Now, we define an s-t flow to be a function f that maps each edge to a nonnegative real number, so f(e) is the flow carried by edge e. Then, we have the
- capacity condition: 0 ≤ f(e) ≤ ce,
- conservation condition: for all internal nodes, the sum of flows into the node must be equal to the sum of flows out.
The value of a flow, v(f), is defined as the amount of flow generated at the source (the sum of the flows of all edges leaving s). By the conservation condition, this must equal the flow going into the sink. Now, we define the maximum-flow problem to be that of finding a flow of maximum possible value on the network.
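As a quick sanity check, the two conditions and the value v(f) can be written out in code. The dict-of-edges representation and the function names here are my own choices, not from the book.

```python
# Sketch: verify the capacity and conservation conditions for a candidate
# s-t flow, and compute its value v(f). Edges are (u, v) tuples mapped to
# numbers; this encoding is just one convenient option.

def is_valid_flow(capacity, flow, s, t):
    """capacity, flow: dicts mapping (u, v) edges to numbers."""
    # Capacity condition: 0 <= f(e) <= c_e on every edge.
    if any(not (0 <= flow.get(e, 0) <= c) for e, c in capacity.items()):
        return False
    # Conservation condition: flow in == flow out at every internal node.
    nodes = {u for e in capacity for u in e}
    for v in nodes - {s, t}:
        into = sum(f for (a, b), f in flow.items() if b == v)
        out = sum(f for (a, b), f in flow.items() if a == v)
        if into != out:
            return False
    return True

def flow_value(flow, s):
    # v(f): total flow on the edges leaving the source.
    return sum(f for (a, b), f in flow.items() if a == s)

capacity = {('s', 'a'): 3, ('a', 't'): 2, ('s', 'b'): 2, ('b', 't'): 3}
flow = {('s', 'a'): 2, ('a', 't'): 2, ('s', 'b'): 2, ('b', 't'): 2}
print(is_valid_flow(capacity, flow, 's', 't'), flow_value(flow, 's'))  # True 4
```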
Designing the Algorithm
A general approach to figuring out how to maximize flow is to push as much flow forward as possible, then backtrack to where the flow could be rerouted, then push forward again, and so on. To do this, we define a residual graph Gf of graph G:
- the node sets are the same
- for every edge e in G on which f(e)<ce, there are ce-f(e) leftover units of capacity that we can push forward. So, we include edge e in Gf, pointing forward, with capacity ce-f(e); we will refer to this as the residual capacity.
- for every edge e in G on which f(e)>0, we also define an edge e' in Gf that points in the opposite, backward, direction, with capacity f(e), the amount of flow that could be undone.
So, letting P be a simple path in Gf (one that does not visit a node twice) from s to t, and defining bottleneck(P,f) to be the minimum residual capacity of any edge along P with respect to flow f, the algorithm below yields a new flow:
augment(P,f):
- let b=bottleneck(P,f)
- for each edge e=(u,v) in P:
- if e is a forward edge:
- increase f(e) in G by b
- else (e is a backward edge):
- decrease f(e) in G by b
- return f
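The augment routine above can be sketched as runnable Python. The edge-dict representation and the path encoding (each step tagged as forward or backward) are my own assumptions.

```python
# Sketch of the residual graph and augment(P, f) from the bullets above.
# capacity/flow are dicts keyed by directed edge (u, v); a path is a list
# of (u, v, is_forward) steps in G_f. Representation choices are mine.

def residual_capacity(capacity, flow, u, v, is_forward):
    if is_forward:                      # leftover capacity c_e - f(e)
        return capacity[(u, v)] - flow.get((u, v), 0)
    return flow.get((v, u), 0)          # a backward edge can undo f(e)

def augment(capacity, flow, path):
    # bottleneck(P, f): minimum residual capacity along the path.
    b = min(residual_capacity(capacity, flow, u, v, fwd)
            for u, v, fwd in path)
    new_flow = dict(flow)
    for u, v, fwd in path:
        if fwd:
            new_flow[(u, v)] = new_flow.get((u, v), 0) + b   # push forward
        else:
            new_flow[(v, u)] = new_flow.get((v, u), 0) - b   # undo backward
    return new_flow

capacity = {('s', 'a'): 1, ('a', 't'): 1}
flow = augment(capacity, {}, [('s', 'a', True), ('a', 't', True)])
print(flow)   # each edge on the path now carries 1 unit
```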
So, we can define a new flow f' in G by performing this operation on the residual graph, and we have that this new f' is indeed a flow in G. Now, we can use this in the Ford-Fulkerson algorithm for max-flow:
Max-Flow
- initially f(e)=0 for all e in G
- while there is an s-t path in the residual graph Gf:
- let P be a simple s-t path in Gf
- f'=augment(P, f)
- update f to be f'
- update the residual graph Gf to be Gf'
- return f
Assuming all our capacities are integers, the flow values remain integers throughout, and the value of the flow increases by at least 1 on each iteration. Therefore, if C is the sum of the capacities of the edges out of s, the algorithm terminates after at most C iterations of the while loop. Since each iteration (finding an augmenting path in Gf and augmenting along it) takes O(m) time, the entire algorithm runs in O(mC) time.
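Putting the pieces together, here is a minimal Ford-Fulkerson sketch along the lines of the pseudocode above, assuming integer capacities; the dict-of-edges representation and the DFS path search are my own choices.

```python
# Compact Ford-Fulkerson sketch for integer capacities: while the residual
# graph has an s-t path, augment along it. Returns the max-flow value.

def max_flow(capacity, s, t):
    flow = {e: 0 for e in capacity}
    nodes = {x for e in capacity for x in e}

    def res(u, v):
        # residual capacity: leftover forward room plus cancellable flow
        return (capacity.get((u, v), 0) - flow.get((u, v), 0)
                + flow.get((v, u), 0))

    def find_path():
        # DFS for an s-t path of positive residual capacity
        stack, prev = [s], {s: None}
        while stack:
            u = stack.pop()
            if u == t:
                path = []
                while prev[u] is not None:
                    path.append((prev[u], u))
                    u = prev[u]
                return path[::-1]
            for v in nodes:
                if v not in prev and res(u, v) > 0:
                    prev[v] = u
                    stack.append(v)
        return None

    while (path := find_path()) is not None:
        b = min(res(u, v) for u, v in path)      # bottleneck(P, f)
        for u, v in path:
            undo = min(b, flow.get((v, u), 0))   # cancel backward flow first
            if undo:
                flow[(v, u)] -= undo
            if b > undo:
                flow[(u, v)] = flow.get((u, v), 0) + b - undo
    return sum(f for (a, _), f in flow.items() if a == s)

caps = {('s', 'a'): 3, ('s', 'b'): 2, ('a', 'b'): 1,
        ('a', 't'): 2, ('b', 't'): 3}
print(max_flow(caps, 's', 't'))   # -> 5
```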
This was a good section and well explained. 9/10.
7.2: Max Flows and Min Cuts in a Network
Now, we define a cut (A,B) to be any partition of the vertex set V of G such that the source is in A and the sink is in B. Then, the capacity of a cut, c(A,B), we define to be the sum of the capacities of all edges out of A. For any flow f and any cut, the value of the flow satisfies v(f)=fout(A)-fin(A), where fout(A) and fin(A) are the total flow on edges leaving and entering A. Of course, the same follows for the respective in and out values of B, and these facts all come from flow not being able to accumulate anywhere. We then have the upper bound:
- v(f) ≤ c(A,B)
That is, the value of every flow is upper bounded by the capacity of every cut.
We then examine what happens at the point where the algorithm terminates, when there is no s-t path for flow f in the residual graph:
- if f is an s-t flow such that there is no s-t path in the residual graph, then there is an s-t cut (A*,B*) in G for which v(f)=c(A*,B*). Consequently, f has the maximum value of any flow in G, and (A*,B*) has the minimum capacity of any s-t cut in G.
Then, we have that:
- the flow returned by the FF algorithm is a maximum flow, and
- given a flow of max value, we can compute an s-t cut of min capacity in O(m) time.
All of this put together yields the max-flow min-cut theorem:
- in every flow network, the max value of an s-t flow is equal to the min capacity of an s-t cut.
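The O(m) min-cut extraction mentioned above can be sketched as follows: given a maximum flow, A* is the set of nodes reachable from s in the residual graph, and (A*, B*) is a minimum cut. The small example network and its hard-coded max flow are my own.

```python
# Sketch: extract a minimum s-t cut from a maximum flow by BFS over the
# residual graph; the reachable set from s is the A* side of the cut.
from collections import deque

def min_cut_side(capacity, flow, s):
    def res(u, v):
        return (capacity.get((u, v), 0) - flow.get((u, v), 0)
                + flow.get((v, u), 0))
    nodes = {x for e in capacity for x in e}
    A, queue = {s}, deque([s])          # BFS over positive residual edges
    while queue:
        u = queue.popleft()
        for v in nodes - A:
            if res(u, v) > 0:
                A.add(v)
                queue.append(v)
    return A

caps = {('s', 'a'): 2, ('a', 't'): 1, ('s', 'b'): 1, ('b', 't'): 2}
maxf = {('s', 'a'): 1, ('a', 't'): 1, ('s', 'b'): 1, ('b', 't'): 1}
A = min_cut_side(caps, maxf, 's')
cut_cap = sum(c for (u, v), c in caps.items() if u in A and v not in A)
print(A, cut_cap)   # cut capacity equals the max-flow value, here 2
```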
The book further discusses real-valued capacities and flows, but we seem to veer away from that in class, so we shall not dive into that rabbit hole here. A good section, 8/10.
7.5: First Application: Bipartite Matching
Given a bipartite graph G with vertex sets X and Y (the two sides of the bipartition), we can use the algorithm previously discussed to find the size of a maximum matching. So, creating a new graph G', we first direct all edges in G from X to Y, then add a node s with an edge to each node in X, and a node t to which each node in Y has an edge. Then, upon giving each edge in G' a capacity of 1, we can run the FF algorithm.
So, we have that the size of the max matching in G is equal to the value of the max flow in G', and the edges in such a matching are the edges that carry flow from X to Y in G'. So, it is shown in the book that we can use the FF algorithm to find a max matching in a bipartite graph in O(mn) time.
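The reduction can be sketched end to end: build G', run a max-flow routine, and read the matching off the X-to-Y edges carrying flow. The compact Ford-Fulkerson helper and all names here are my own choices.

```python
# Sketch of bipartite matching via max flow: s -> X, X -> Y, Y -> t,
# all edges with capacity 1; the matched edges are those with flow 1.

def max_flow_edges(capacity, s, t):
    # tiny Ford-Fulkerson (DFS augmenting paths, integer capacities)
    flow = {e: 0 for e in capacity}
    nodes = {x for e in capacity for x in e}
    def res(u, v):
        return (capacity.get((u, v), 0) - flow.get((u, v), 0)
                + flow.get((v, u), 0))
    def path():
        stack, prev = [s], {s: None}
        while stack:
            u = stack.pop()
            if u == t:
                p = []
                while prev[u] is not None:
                    p.append((prev[u], u))
                    u = prev[u]
                return p[::-1]
            for v in nodes:
                if v not in prev and res(u, v) > 0:
                    prev[v] = u
                    stack.append(v)
    while (p := path()):
        b = min(res(u, v) for u, v in p)
        for u, v in p:
            undo = min(b, flow.get((v, u), 0))
            if undo:
                flow[(v, u)] -= undo
            if b > undo:
                flow[(u, v)] = flow.get((u, v), 0) + b - undo
    return flow

def bipartite_matching(X, Y, edges):
    # build G': direct X -> Y, attach source s and sink t, capacity 1
    capacity = {('s', x): 1 for x in X}
    capacity.update({(x, y): 1 for x, y in edges})
    capacity.update({(y, 't'): 1 for y in Y})
    flow = max_flow_edges(capacity, 's', 't')
    return [(x, y) for x, y in edges if flow.get((x, y), 0) == 1]

M = bipartite_matching(['x1', 'x2'], ['y1', 'y2'],
                       [('x1', 'y1'), ('x1', 'y2'), ('x2', 'y1')])
print(M)   # a maximum matching of size 2
```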
However, what if the bipartite graph does not have a perfect matching? Well, with a lovely long proof in the book, we show that there is then a subset A of X whose neighbor set Gamma(A) in Y is smaller than A. Such a set can also be found in O(mn) time.
This was a much shorter section, but it touched quite well on this important connection to the matching problems we worked with so early on. 9/10 without a doubt.
7.7: Extensions to the Max-Flow Problem
Just as we were able to apply the FF algorithm to a problem type that seemed radically different, there are many more ways in which a combinatorial search space can be taken advantage of by this method. A great extension of what we just worked with is moving to sets of sources and sinks, S and T. Each node then has an associated demand dv which, if positive, indicates that the node wants to receive that many more units of flow than it sends out (it is a sink), and, if negative, means it wants to send out that many more than it receives (it is a source).
We define a circulation to be the analog of a flow f in the previous version: values assigned to each edge, the amount of stuff going through it. Each edge still has a capacity condition, and the conservation condition becomes a demand condition: the flow into a node minus the flow out of it must equal its demand dv. A circulation satisfying both conditions is called feasible, and the question becomes whether a given instance admits one. We have that if there exists a feasible circulation, then the sum of the demands of all nodes must equal 0. Then, we can define the analog of a graph flow value to be D, the sum of all the positive demands (or, equivalently, the absolute value of the sum of the negative demands, because they are equal).
We then have that:
- there is a feasible circulation in G if and only if the max s*-t* flow in G' has value D, where G' is obtained from G by adding a super-source s* with an edge of capacity -dv to each node v with dv<0, and a super-sink t* with an edge of capacity dv from each node v with dv>0. If all capacities and demands are integers and there is a feasible circulation, then there is one that is integer-valued.
- G has a feasible circulation if and only if, for all cuts (A,B), the sum of the demands of the nodes in B is less than or equal to the capacity of the cut (same definition as before).
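The cut condition in the second bullet can be checked by brute force on tiny instances; this exponential enumeration over all cuts is purely illustrative and my own, not an algorithm from the book.

```python
# Brute-force sketch of the cut condition: G has a feasible circulation
# iff for every cut (A, B), the total demand in B is at most the capacity
# of the cut. Exponential in the node count, fine for tiny graphs.
from itertools import combinations

def cut_condition_holds(capacity, demand):
    nodes = sorted({x for e in capacity for x in e} | set(demand))
    for r in range(len(nodes) + 1):
        for A in map(set, combinations(nodes, r)):
            B = set(nodes) - A
            cap = sum(c for (u, v), c in capacity.items()
                      if u in A and v in B)
            if sum(demand.get(v, 0) for v in B) > cap:
                return False
    return True

# u is a source (demand -2), w is a sink (demand +2), routed via v
caps = {('u', 'v'): 3, ('v', 'w'): 3}
print(cut_condition_holds(caps, {'u': -2, 'v': 0, 'w': 2}))   # -> True
print(cut_condition_holds(caps, {'u': -4, 'v': 0, 'w': 4}))   # -> False
```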
Circulations with lower bounds
An added twist that we can toss in is requiring each edge e to carry at least a certain amount of flow, a lower bound le. As it happens, we can reduce this back to the version without lower bounds: pre-route the mandatory le units along each edge, reduce its capacity to ce-le, and adjust the demands of its endpoints to account for the pre-routed flow. There exists a feasible circulation in the transformed version if and only if there exists one in the lower-bound version.
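This reduction can be sketched as a pure instance transformation, with no max-flow computation needed. The function name and dict representation are my own.

```python
# Sketch of the lower-bound reduction: pre-route the mandatory l_e units
# along each edge, leaving capacity c_e - l_e, and shift each endpoint's
# demand by the pre-routed amount (the head already receives l_e units,
# the tail already sends l_e units).

def strip_lower_bounds(capacity, lower, demand):
    new_cap = {e: c - lower.get(e, 0) for e, c in capacity.items()}
    new_dem = dict(demand)
    for (u, v), l in lower.items():
        new_dem[v] = new_dem.get(v, 0) - l   # v already receives l units
        new_dem[u] = new_dem.get(u, 0) + l   # u already sends l units
    return new_cap, new_dem

caps = {('u', 'v'): 3}
low = {('u', 'v'): 2}
dem = {'u': -3, 'v': 3}
# after pre-routing 2 units: 1 unit of capacity left, 1 unit of demand left
print(strip_lower_bounds(caps, low, dem))
```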
This section was a tad shorter but I also greatly appreciated the variety and extent of the conversation on extensions. 9/10.
