Minimum Weight Spanning Tree
Given and a weight function , find a minimum cost spanning tree of . Recall the following lemma for Kruskal’s algorithm:
Lemma: Every edge is a minimum weight edge across some cut of .
Proof: Let . Initially and are in separate components. Since the algorithm outputs a tree, at some point it must have added the edge that connected the components containing and . Let be the component containing and the component containing in the step before was added. Clearly is a minimum weight edge connecting and by the definition of the greedy algorithm.
Theorem: Let be a minimum spanning tree and let be the tree output by the greedy algorithm. Then .
Proof: Let be an edge in that is not in . If such an edge doesn’t exist, then there is nothing to proof. Consider . Let be the unique cycle in . We claim that there is one edge in the cycle that is atleast as heavy as . By previous lemma there exists a cut such that is a minimum weight edge connecting . Clearly the cycle crosses at least twice and so there must be an edge such that . Let . Then . Thus we can keep replacing every edge of by the edges of , so that finally we obtain . Since was optimal the claim follows.
Caveat: What if at some point we remove an edge that we added from at some earlier iteration?
We claim that there exists a cut in for which is the minimum connecting edge in and moreover is the only edge in connecting with . This can be seen for example by setting to be the two components in . Thus when we remove an edge from , we know that the edge (the second cycle edge crossing ) cannot have come from .
- Initialize for some .
- While do:
- Let be the minimum cost edge with one endpoint in and the other endpoint in .
- Set to
Notice that the previous lemma that we proved holds for Prim’s algorithm, since if edge was added, then clearly it must be the minimum cost edge connecting .
Running Time for Kruskal’s Algorithm
Sorting the edges takes . If we naively check for cycles by doing , then each iteration would take and we do iterations. This would give us a running time bound of . Can we do better?
We can maintain a list of sets of vertices that form the components at each period of time. Thus checking whether adding an edge forms a cycle reduces to looking up the component numbers of and and checking whether they are the same (in which case you get a cycle) or different (in which case you don’t get a cycle). How much time does it take to modify the components when an edge is added? We only need to modify the entries of the smaller list. Observe that in this case, every time we do this modification step the set which we modified became at least twice in size. Thus the entry for a particular vertex can only be modified at most times, since the component can grow to at most size . Thus the total time to update the sets is .