# Feb 6. Greedy algorithms, Matroids.

When does greedy work?

Observe the following property of forests:

Lemma: Forests $F_1, F_2$ on a graph $G = (V, E)$ such that $|F_1| < |F_2|$ then $\exists e \in F_2$ such that $F_1 \cup \{e\}$ is a forest.

Proof: Notice that $F_1$ cannot be a tree and so it has atleast two components. Let $\{C_1, \cdots, C_k\}$ be the components of $F_1$. We claim that all the edge of $F_2$ cannot be contained in $\{ C_1, \cdots, C_k\}$ since each $C_i$ is a tree on $V(C_i)$ vertices. Since $|F_2| > |F_1|$, there must be an edge $e = (u,v) \in F_2$ and two components of $F_1$,  $C_i$ and $C_j$ such that $u \in C_i$ and $v \in C_j$. Thus $F_1 \cup \{e\}$ is still a forest.

Matroids

Let $U = \{e_1, \cdots, e_m\}$ be a universe and $\mathcal{I}$ be a collection of subsets of $U$. We say that $(U, \mathcal{I})$ is a matroid if:

1. $\phi \in \mathcal{I}$.
2. $A \in \mathcal{I}$ and $B \subseteq A$ $\Rightarrow$  $B \in \mathcal{I}$.
3. $A, B \in \mathcal{I}$ and $|A| < |B|$ $\Rightarrow$ $\exists e \in B$ such that $A \cup \{e\} \in \mathcal{I}$.

A set $I \in \mathcal{I}$ is called independent. Notice that the set $(E, \mathcal{F})$ where $\mathcal{F}$ is the set of all forests $F$ such that $F \subseteq E$ is a matroid.

Suppose $(U, \mathcal{I})$ is a matroid and we have a weight function $w : U \rightarrow \mathbb{R}$ and we want to find a maximum weight independent set. Consider the following greedy algorithm:

• Sort the elements of $U$ in decreasing order of their weight, so that $w(e_1) \ge w(e_2) \ge \cdots \ge w(e_m)$.
• Initialize $I = \{e_i\}$ where $e_i$ is the first element such that $I \in \mathcal{I}$.
• Iterate till you can no longer add any element while still remaining in $\mathcal{I}$.

Theorem: This algorithm gives us the maximum weight independent set in $(U, \mathcal{I})$.

Proof: We first observe that the size of the set $I$ returned by the algorithm is the same as the size of $I^*$, where $I^*$ is a maximum weight independent set in $\mathcal{I}$. Let the elements picked by greedy be $\{x_1, x_2, \cdots, x_k \}$ in order. Let the elements of $I^*$ (in decreasing order of weight) be $\{y_1, y_2, \cdots, y_k\}$. Suppose greedy fails; then there exists a $i$ such that $w(x_i) < w(y_i)$ since otherwise $w(I) \ge w(I^*)$ since $|I| = |I^*|$. We know that $X = \{x_1, \cdots, x_{i-1}\} \in \mathcal{I}$ and $Y = \{y_1, \cdots, y_i\}$. By the property of matroids, we get that there must be an element $y_j \in Y$ such that $X \cup \{y_j\} \in \mathcal{I}$. Since $w(y_j) \ge w(y_i) > w(x_i)$ which contradicts the choice of the greedy algorithm!

Notice that this same proof would have worked for proving the correctness of the greedy algorithm for finding a minimum (or maximum) weight spanning tree. Another example of matroid is $M = (U, \mathcal{I})$, where $U$ is a set of vectors in $\mathbb{R}^n$ and $\mathcal{I}$ is a collection of all linearly independent subsets of $U$. Thus the greedy algorithm would also work for finding a maximum weight linearly independent set of vectors.