====== Chapter 7: Network Flow ====== En route to develop a polynomial time algorithm for finding the largest matching in a bipartite graph, we will formulate a generate class of problems--network flow problems--from which we can narrow down to bipartite matching. ===== 7.1: Max-Flow Prob. and Ford-Fulkerson Algorithm ===== First, we define a flow network to be a directed graph in which: * each edge //e// has an associated capacity, a nonnegative number denoted //c_e// * there exists a single source node //s// * and a single sink node //t//. All other nodes will be called internal nodes. Initial assumptions we make are that edges only lead out of the source and into the sink, and that capacities are only integers. Now, we define an //s-t flow// to be a function //f// that maps each edge to a nonnegative real number, so //f(e)// is the flow carried by edge //e//. Then, we have the * capacity condition: //0=e//, * conservation condition: for all internal nodes, the sum of flows into the node must be equal to the sum of flows out. The value of a flow, //v(f)// is defined as the amount of flow generated at the source (the sum of the flows of all edges leaving //s//). This must be equal to the flow going into the sink by the conservation condition. Now, we define the maximum-flow problem to be finding how much flow to assign to each edge to have the maximal total flow value on the network. ==== Designing Algorithm ==== A general approach to figuring out to maximize flow is to push as much flow forward as possible, then backtrack to where there could be some split, then forward again, and so on. To do this, we define a residual graph //G_f// of graph //G//: * node sets are same * for every edge //e// in //G//, on which //f(e)e//, there are //c_e-f(e)// leftover units of capacity we can push forward. So, we include that edge //e// with capacity of //c_e-f(e)// going forward in //G_f// we will refer to this as the residual capacity. * we also define then edge //e'// in //G_f// that points in the opposite, backward, direction, with the capacity that the flow was, //f(e)//. So, letting //P// be a simple path in //G_f// (not visit a node twice) from //s// to //t//, and defining bottleneck(P,f) to be the minimum residual capacity along P with respect to flow f the algorithm below yields a new flow: augment(P,f): * let b=bottleneck(P in //G_f//,f) * for each edge (u,v) in P: * if e=(u,v) is a forward edge: * increase f(e) in G by b * else ((u,v) is backward edge, let e=(v,u)): * decrease f(e) in G by b * return (f) So, we can define new flows in G by performing this operation on the residual graph. We have that this new f' is a flow in G. Now, we can use this in the Ford-Fulkerson algorithm for max-flow: Max-Flow * initially f(e)=0 for all e in G * while there is an s-t path in the residual graph //G_f//: * let P be a simple s-t path in //G_f// * f'=augment(P, f) * update f to be the f' * update the residual graph //G_f// to be //G_f'// * return f Assuming all our capacities are integers, they remain integers, and always the value of the flow increases by at least 1 on each iteration. Therefore, if //C// is the sum of all of the max flows out of //s//, then the algorithm terminates after at most //C// iterations of the while loop. Then, due to the max number of edges, and we have that one must be found each time (O(m)), we have that the entire algorithm runs in //O(mC)// time. This was a good section and well explained. 9/10. ===== 7.2: Max Flows and Min Cuts in a Network ===== Now, we define a cut (A,B) to be any partition of the vertex set V of G such that the source is in A and the sink in B. Then, the capacity of a cut, c(A,B), we define to be the sum of the capacities of all edges out of A. Consequently, the value of the flow, //v(f)=f^out(A)-fⁱⁿ(A)//. Of course, the same follows for the in and out respective values of B, and these all come from the fact that we cannot have flow accumulating somewhere. We then have the upper bound: * //v(f)=v//, which, if positive, indicates that the node wants to receive that many more units of flow than it sends out (is a sink), and if negative, means it wants to sent out that many more than it receives (is a source). We define a circulation to be the analog of a given flow f in the previous version: certain values assigned to each edge, the amount of stuff going through it at a certain time. Each edge still has a capacity, and there cannot be any more flow accumulating or leaving a node than its demand states. So, these are simply updated capacity and conservation (demand) conditions. We call a circulation feasible if there indeed exist flow values on each edge for which these conditions can be met. We have that if there exists a feasible circulation, then the sum of all demands of all nodes must be equal to 0. Then, we can define the analog of a graph flow value to be //D//, the sum of all the positive demands (or the absolute value of the sum of the negative demands, because they are equal). We then have that: * there is a feasible circulation in G if and only if the max s*-t* flow in G' has value D. If all capacities and demands are integers, and there is a feasible circulation, then the feasible circulation is integer valued. * G has a feasible circulation if and only if for all cuts (A,B), the sum of all demands of nodes in B in less than or equal to the capacity of the cut (same definition as before). === Circulations with lower bounds === An added twist that we can toss in, is that each edge must have //at least// a certain amount of flow across it. As it happens, we can simply delete the lower bounds out of consideration and then consider it in the same manner as we did before. If there does not exist a feasible circulation in the new version, then there did not exist one in the lower bound version. This section was a tad shorter but I also greatly appreciated the variety and extent of the conversation on extensions. 9/10.