# Feb 2. Strongly connected components, Minimum Spanning Tree

Strongly Connected Components

Recall the algorithm that we proposed last time for finding the strongly connected components of a graph $D=(V,A)$. Let $\mathcal{C} = \{C_1, C_2, \dots , C_k\}$ be the strongly connected components of $D$. Let $D'=(V',A')$ be the graph with vertex set $V'= \mathcal{C}$ and $(C_i,C_j) \in A'$ if there exists $u \in C_i$ and $v \in C_j$ such that $(u,v) \in A$.

We claim that $D'$ has no directed cycles, in other words $D'$ is a directed acyclic graph (often shortened to dag). Suppose not and there is a directed cycle from $C_i$ to $C_j$. Notice that this means that there is a directed path in $D$ from every vertex in $C_i$ to every vertex in $C_j$ and vice versa. This contradicts the hypothesis that $C_i$ and $C_j$ are distinct strongly connected components.

Let us consider the algorithm again:

• Call $DFS$ on $D$ from an arbitrary vertex.
• Number vertices according to the order in which $DFS$ finishes examining all out edges from that vertex. This number for vertex $v$ is also referred to as $post(v)$.
• Let $D^R$ be $D$ with all arcs reversed. Call $DFS$ on $D^R$ in decreasing order of their $post$ numbers.
• Output the explored vertices whenever $DFS$ has to start at a new vertex.

We first observe that the strongly connected components of $D^R$ is precisely $\mathcal{C} = \{C_1, C_2, \dots, C_k\}$, i.e. the same as the strongly connected components of $D$.

We also observe that if $(C_i,C_j) \in A'$ then the highest $post$ number in $C_i$ is larger than the highest $post$ number in $C_j$. If the $DFS$ visits a vertex of $C_i$ before visiting any vertex of $C_j$  then clearly every vertex of $C_j$ will be explored before returning. Thus the highest $post$ number in $C_i$ will be larger than the highest $post$ number in $C_j$. If the $DFS$ visits a vertex of $C_j$ first before visiting any vertex of $C_i$ then clearly all of $C_j$ will be explored and no vertex of $C_i$ will be explored when $DFS$ returns. Thus we need to call $DFS$ on a vertex of $C_i$ at some later point in time and so the claim follows.

Thus the vertex with the highest $post$ number must be a source strongly connected component in $D$ and so a sink strongly connected component in $D^R$. Thus if we start $DFS$ in the order of decreasing $post$ number then from the above observation we will find exactly the strongly connected components of $D$.

Note:  At first glance it might appear that we could also run the $DFS$ algorithm on $D$ in the order of the lowest $post$ numbers. However it is not true that if $(C_i, C_j) \in A'$ then the lowest post number in $C_j$ is smaller than the lowest $post$ number in $C_i$. An easy counter example is when $C_i$ is a cycle $\{1,2, \cdots, k\}$ and $C_j$ is a single vertex $v$. Let $(i,v) \in A$ where $1 \le i < k$. Suppose $DFS$ was called on $1$ and the order of traversal was $1, 2, 3, \cdots, k$ before returning back to vertex $i$ and then exploring $v$. Since $i < k$, clearly vertex $i+1$ has post number strictly smaller than $v$.

Minimum Spanning Tree

Given a weighted un-directed graph $G=(V,E)$ with weights $w : E \rightarrow \mathbb{R}$. We wish to compute a minimum weight spanning tree of $G$. Here is a greedy algorithm that solves this problem:

• Initialize $T = \phi$.
• Order the edges as $e_1, e_2, \cdots, e_m$ such that $w(e_1) \le w(e_2) \le \cdots \le w(e_m)$.
• Pick the lowest weight edge $e_i$ such that $T\cup \{e_i\}$ has no cycles.
• Set $T = T \cup \{e_i\}$.
• Stop when $|T| = n-1$.

This algorithm is called Kruskal’s algorithm and it produces a minimum weight spanning tree.

Lemma: Every edge $e$ is a minimum weight edge in some cut $(S, \bar{S})$ in $G$.