Feb 4. Kruskal’s and Prim’s algorithms for MST

Minimum Weight Spanning Tree

Given G=(V,E) and a weight function w: E \rightarrow \mathbb{R}, find a minimum cost spanning tree of G. Recall the following lemma for Kruskal’s algorithm:

Lemma: Every edge e added by Kruskal’s algorithm is a minimum weight edge across some cut (S,\bar{S}) of G.

Proof: Let e=(u,v). Initially u and v lie in separate components. Since the algorithm outputs a tree, at some point it must have added the edge e, connecting the components containing u and v. Let S be the component containing u, and let \bar{S} = V \setminus S, in the step just before e was added. Then e is a minimum weight edge crossing (S,\bar{S}): any lighter edge crossing the cut would have been examined earlier by the greedy algorithm and, since it could not close a cycle at that point, would already have been added, merging S with the other side before e was considered.

Theorem: Let T^* be a minimum spanning tree and let T_G be the tree output by the greedy algorithm. Then w(T_G) = w(T^*).

Proof: Let e=(u,v) be an edge in T_G that is not in T^*. If no such edge exists, there is nothing to prove. Consider T^* \cup \{e\}; since T^* is a spanning tree, T^* \cup \{e\} contains a unique cycle C through e. We claim that C contains an edge e' \in T^* that is at least as heavy as e. By the previous lemma there exists a cut (S,\bar{S}) such that e is a minimum weight edge crossing (S,\bar{S}). The cycle C crosses (S,\bar{S}) an even number of times, hence at least twice, so besides e there must be another edge e' \in T^* on C crossing the cut, and w(e') \ge w(e) since e is a minimum weight edge across the cut. Let T' = (T^* \cup \{e\}) \setminus \{e'\}; removing e' from the cycle leaves T' a spanning tree with w(T') \le w(T^*). Repeating this exchange for every edge of T_G not in T^*, we finally obtain w(T_G) \le w(T^*). Since w(T^*) was optimal, the claim follows.

Caveat: What if at some point we remove an edge e' that is itself an edge of T_G, swapped in at an earlier iteration of this exchange?

We claim that there exists a cut (S,\bar{S}) in G for which e is not merely a minimum weight edge of T_G crossing it, but the only edge of T_G crossing it. This can be seen by taking S and \bar{S} to be the two components of T_G \setminus \{e\}. Thus when we remove an edge from the cycle in T^* \cup \{e\}, the edge e' (the second cycle edge crossing (S,\bar{S})) cannot be an edge of T_G.

Prim’s Algorithm

  • Initialize T = (\{v\}, \emptyset) for some v \in V.
  • While T has fewer than n-1 edges do:
    • Let e be the minimum cost edge with exactly one endpoint in T.
    • Add e and its other endpoint to T.
  • EndWhile
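The loop above can be realized with a binary heap of candidate edges, so that the minimum cost crossing edge is found in O(\log{m}) time per operation. Here is an illustrative sketch in Python; the adjacency-list format (each adj[u] a list of (weight, neighbor) pairs) and the function name are our own choices, not part of the notes:

```python
import heapq

def prim_mst(n, adj):
    """Prim's algorithm on a connected graph with vertices 0..n-1.

    adj[u] is a list of (weight, v) pairs. Returns a list of MST
    edges (u, v, w). A sketch; input format is an assumption.
    """
    in_tree = [False] * n
    in_tree[0] = True                        # start the tree at vertex 0
    heap = [(w, 0, v) for (w, v) in adj[0]]  # candidate edges leaving T
    heapq.heapify(heap)
    mst = []
    while heap and len(mst) < n - 1:
        w, u, v = heapq.heappop(heap)        # lightest candidate edge
        if in_tree[v]:
            continue                         # both endpoints in T: skip
        in_tree[v] = True
        mst.append((u, v, w))
        for (w2, x) in adj[v]:
            if not in_tree[x]:
                heapq.heappush(heap, (w2, v, x))
    return mst
```

Stale heap entries (edges whose far endpoint has meanwhile joined the tree) are simply skipped when popped, which keeps the code short at the cost of a slightly larger heap.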

Notice that the lemma we proved above also holds for Prim’s algorithm: if edge e was added, then by construction it is a minimum cost edge crossing the cut (T,\bar{T}) between the vertices already in the tree and the remaining vertices.

Running Time for Kruskal’s Algorithm

Sorting the edges takes O(m \log{m}) time. If we naively check whether an edge closes a cycle by running a DFS on the current forest, each check takes O(n) time (the forest has at most n-1 edges), and we perform up to m such checks. This gives a running time bound of O(mn). Can we do better?

We can maintain, for each vertex, the label of the component containing it at each point in time. Checking whether adding an edge e=(u,v) creates a cycle then reduces to looking up the component labels of u and v and checking whether they are the same (in which case you get a cycle) or different (in which case you don’t). How much time does it take to update the labels when an edge e is added? We only relabel the vertices of the smaller of the two components. Observe that every time a vertex is relabelled, the size of its component at least doubles. Since a component can grow to size at most n, the entry for a particular vertex v is modified at most \log_2{n} times. Thus the total time to update the components is O(n\log{n}), and the overall running time is dominated by the sort: O(m\log{m}).
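The relabelling scheme above can be sketched in Python. This is an illustrative implementation, with the edge format (a list of (weight, u, v) triples) and all names chosen for the sketch:

```python
def kruskal_mst(n, edges):
    """Kruskal's algorithm with smaller-component relabelling.

    edges is a list of (w, u, v) triples; vertices are 0..n-1.
    comp[v] is v's component label; members[c] lists label c's vertices.
    Returns a list of MST edges (u, v, w). A sketch, not canonical code.
    """
    comp = list(range(n))                 # each vertex starts alone
    members = [[v] for v in range(n)]
    mst = []
    for w, u, v in sorted(edges):         # O(m log m) sort
        cu, cv = comp[u], comp[v]
        if cu == cv:
            continue                      # same component: edge closes a cycle
        # Relabel the smaller component. Each vertex is relabelled at
        # most log2(n) times, since its component at least doubles.
        if len(members[cu]) < len(members[cv]):
            cu, cv = cv, cu
        for x in members[cv]:
            comp[x] = cu
        members[cu].extend(members[cv])
        members[cv] = []
        mst.append((u, v, w))
    return mst
```

The cycle check is an O(1) label comparison, and the total relabelling work is O(n\log{n}) by the doubling argument, so sorting dominates.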
