NLA MT25, Courant-Fischer minmax theorem
Flashcards
@State the Courant-Fischer minmax theorem.
Suppose:
- $A \in \mathbb R^{n \times n}$ is a symmetric matrix
then:
\[\begin{aligned} \lambda _ i(A) &= \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x} \\ &= \min _ {\dim \mathcal S = n-i + 1} \max _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x} \end{aligned}\]where the eigenvalues are ordered $\lambda _ 1 \ge \cdots \ge \lambda _ n$, and analogously, if
- $A \in \mathbb C^{m \times n}$ where $m \ge n$
then:
\[\begin{aligned} \sigma _ i(A) &= \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} \\ &= \min _ {\dim \mathcal S = n - i + 1} \max _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} \end{aligned}\](here, $\max _ {\dim \mathcal S = i}$ means that we’re taking the maximum over all possible subspaces $\mathcal S$ of dimension $i$).
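A quick numerical sanity check of the eigenvalue statement (a sketch using NumPy; the matrix size, seed and index $i$ are arbitrary choices): over the optimal subspace $\mathcal S = \text{span}(q _ 1, \ldots, q _ i)$, every Rayleigh quotient should be at least $\lambda _ i$, with equality attained at $q _ i$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
B = rng.standard_normal((n, n))
A = (B + B.T) / 2                        # arbitrary symmetric test matrix

# np.linalg.eigh returns eigenvalues in ascending order; flip to descending
lam, Q = np.linalg.eigh(A)
lam, Q = lam[::-1], Q[:, ::-1]

i = 3                                    # check lambda_i (1-indexed)
S = Q[:, :i]                             # optimal subspace: span(q_1, ..., q_i)

# sample random vectors in S and evaluate the Rayleigh quotient x^T A x / x^T x
X = S @ rng.standard_normal((i, 10_000))
rq = np.einsum("ij,ij->j", X, A @ X) / np.einsum("ij,ij->j", X, X)

q_i = Q[:, i - 1]
assert rq.min() >= lam[i - 1] - 1e-9          # min over S bounded below by lambda_i
assert np.isclose(q_i @ A @ q_i, lam[i - 1])  # ...and q_i attains it
```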
@Prove the ∆courant-fischer-minmax-theorem, i.e. that if
- $A \in \mathbb R^{n \times n}$ is a symmetric matrix
then:
\[\begin{aligned}
\lambda _ i(A) &= \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x} & (1) \\
&= \min _ {\dim \mathcal S = n-i + 1} \max _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x}
\end{aligned}\]
and furthermore, if
- $A \in \mathbb C^{m \times n}$ where $m \ge n$
then:
\[\begin{aligned}
\sigma _ i(A) &= \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} &\quad (2) \\
&= \min _ {\dim \mathcal S = n - i + 1} \max _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2}
\end{aligned}\]
Proof of (1): Let an orthogonal diagonalisation of $A$ be $Q \Lambda Q^\top$, with eigenvalues ordered $\lambda _ 1 \ge \cdots \ge \lambda _ n$. Fix any subspace $\mathcal S$ of dimension $i$ and let $Q _ i = [q _ i, \ldots, q _ n]$. Then
\[\dim(\mathcal S) + \dim(\text{span}(Q _ i)) = i + (n - i + 1) = n + 1\]Since this quantity is bigger than $n$, it follows there exists some non-zero $w \in \mathcal S \cap \text{span}(Q _ i)$, and in particular we may choose $ \vert \vert w \vert \vert _ 2 = 1$. Since $w$ is a combination of $q _ i, \ldots, q _ n$, for this $w$,
\[w^\top A w \le \lambda _ i w^\top w\]and thus
\[\lambda _ i \ge \min _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x}\]Since $\mathcal S$ was an arbitrary subspace of dimension $i$, $\lambda _ i \ge \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{x^\top A x}{x^\top x}$. For the reverse inequality, take $\mathcal S = \text{span}(q _ 1, \ldots, q _ i)$: every $x \in \mathcal S$ satisfies $x^\top A x \ge \lambda _ i x^\top x$, with equality at $x = q _ i$, so the minimum over this $\mathcal S$ is exactly $\lambda _ i$.
Proof of (2):
Let the SVD of $A$ be $A = U \Sigma V^\top$, with singular values ordered $\sigma _ 1 \ge \cdots \ge \sigma _ n$. Fix any subspace $\mathcal S$ of dimension $i$ and let $V _ i = [v _ i, \ldots, v _ n]$. Then
\[\dim(\mathcal S) + \dim(\text{span}(V _ i)) = i + (n - i + 1) = n+1\]Since this quantity is bigger than $n$, it follows there exists some non-zero $w \in \mathcal S \cap \text{span}(V _ i)$, and in particular we may choose $ \vert \vert w \vert \vert _ 2 = 1$.
Then for this $w$,
\[ \vert \vert Aw \vert \vert _ 2 \le \sigma _ i\]by ∆2-norm-bounded-above-by-smallest-singular-value. Hence
\[\sigma _ i \ge \min _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2}\]and since $\mathcal S$ was an arbitrary subspace of dimension $i$, this bounds the max-min by $\sigma _ i$ from above. To see the reverse inequality, take $\mathcal S = \text{span}(v _ 1, \ldots, v _ i)$: the minimum over this $\mathcal S$ is $\sigma _ i$, attained at $x = v _ i$.
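The singular-value case can be sanity-checked the same way (a sketch using NumPy; sizes, seed and index are arbitrary choices): over the optimal subspace $\mathcal S = \text{span}(v _ 1, \ldots, v _ i)$, every ratio $\vert \vert Ax \vert \vert _ 2 / \vert \vert x \vert \vert _ 2$ should be at least $\sigma _ i$, with equality at $v _ i$.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 8, 5
A = rng.standard_normal((m, n))          # arbitrary test matrix with m >= n

U, s, Vt = np.linalg.svd(A)              # singular values come back descending
i = 2                                    # check sigma_i (1-indexed)
S = Vt[:i].T                             # optimal subspace: span(v_1, ..., v_i)

# sample random vectors in S and evaluate ||Ax|| / ||x||
X = S @ rng.standard_normal((i, 10_000))
ratios = np.linalg.norm(A @ X, axis=0) / np.linalg.norm(X, axis=0)

v_i = Vt[i - 1]
assert ratios.min() >= s[i - 1] - 1e-9                # min over S bounded below by sigma_i
assert np.isclose(np.linalg.norm(A @ v_i), s[i - 1])  # ...and v_i attains it
```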
@Prove that for a matrix $A$,
\[\sigma _ 1 = \max _ {x} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2}\]
and
\[\sigma _ n = \min _ {x} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} = \min _ { \vert \vert x \vert \vert _ 2 = 1} \vert \vert Ax \vert \vert _ 2\]
by appealing to the ∆courant-fischer-minmax-theorem (recall here that $\sigma _ 1$ is the largest singular value, and $\sigma _ n$ is the smallest).
Recall the CF theorem states that
\[\begin{aligned} \sigma _ i(A) &= \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} &\quad (2) \\ &= \min _ {\dim \mathcal S = n - i + 1} \max _ {x \in \mathcal S} \frac{ \vert \vert Ax \vert \vert _ 2}{ \vert \vert x \vert \vert _ 2} \end{aligned}\]then plug in $i = 1$ and $i = n$.
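These two extremal characterisations are easy to check numerically (a sketch using NumPy; sizes and seed are arbitrary choices): every ratio $\vert \vert Ax \vert \vert _ 2 / \vert \vert x \vert \vert _ 2$ should lie in $[\sigma _ n, \sigma _ 1]$, with the endpoints attained at $v _ 1$ and $v _ n$.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 7, 4
A = rng.standard_normal((m, n))

U, s, Vt = np.linalg.svd(A)              # s[0] = sigma_1, s[-1] = sigma_n

# every ratio ||Ax|| / ||x|| lies in [sigma_n, sigma_1] ...
X = rng.standard_normal((n, 10_000))
ratios = np.linalg.norm(A @ X, axis=0) / np.linalg.norm(X, axis=0)
assert ratios.max() <= s[0] + 1e-9
assert ratios.min() >= s[-1] - 1e-9

# ... and v_1, v_n attain the endpoints
assert np.isclose(np.linalg.norm(A @ Vt[0]), s[0])
assert np.isclose(np.linalg.norm(A @ Vt[-1]), s[-1])
```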
Applications
See also [[Notes - NLA MT25, Weyl’s inequality]].
Suppose $A _ 1, A _ 2$ are matrices. @State a result about $\sigma _ i\left(\begin{bmatrix}A _ 1 \\ A _ 2\end{bmatrix}\right)$ and $\sigma _ i(\begin{bmatrix}A _ 1 & A _ 2\end{bmatrix})$.
\[\sigma _ i\left(\begin{bmatrix}A _ 1 \\ A _ 2\end{bmatrix}\right) \ge \max(\sigma _ i(A _ 1), \sigma _ i(A _ 2))\]and
\[\sigma _ i(\begin{bmatrix}A _ 1 & A _ 2\end{bmatrix}) \ge \max(\sigma _ i(A _ 1), \sigma _ i(A _ 2))\]@Prove that if $A _ 1, A _ 2$ are matrices, then
\[\sigma _ i\left(\begin{bmatrix}A _ 1 \\ A _ 2\end{bmatrix}\right) \ge \max(\sigma _ i (A _ 1), \sigma _ i(A _ 2)) \quad\quad (1)\]
and
\[\sigma _ i(\begin{bmatrix}A _ 1 & A _ 2\end{bmatrix}) \ge \max(\sigma _ i(A _ 1), \sigma _ i(A _ 2)) \quad\quad (2)\]
Proof of (1):
The left-hand side is:
\[\text{LHS} = \max _ {\dim \mathcal S = i} \min _ {x \in \mathcal S, \vert \vert x \vert \vert _ 2 = 1} \left \vert \left \vert \begin{bmatrix}A _ 1 \\ A _ 2\end{bmatrix} x\right \vert \right \vert _ 2\]and for any $x$,
\[\left \vert \left \vert \begin{bmatrix}A _ 1 \\ A _ 2\end{bmatrix} x\right \vert \right \vert _ 2 \ge \max( \vert \vert A _ 1 x \vert \vert _ 2, \vert \vert A _ 2 x \vert \vert _ 2)\]Taking the minimum over $x \in \mathcal S$ and then the maximum over $\mathcal S$ preserves this inequality, so $\text{LHS} \ge \sigma _ i(A _ 1)$ and $\text{LHS} \ge \sigma _ i(A _ 2)$, i.e. $\text{LHS} \ge \max(\sigma _ i(A _ 1), \sigma _ i(A _ 2))$.
Proof of (2):
The left-hand side is:
\[\text{LHS} = \max _ {\dim \mathcal S = i} \min _ {\begin{bmatrix}x _ 1 \\ x _ 2\end{bmatrix} \in \mathcal S, \left \vert \left \vert \begin{bmatrix}x _ 1 \\ x _ 2\end{bmatrix} \right \vert \right \vert _ 2 = 1} \left \vert \left \vert \begin{bmatrix}A _ 1 & A _ 2\end{bmatrix} \begin{bmatrix}x _ 1 \\ x _ 2\end{bmatrix}\right \vert \right \vert _ 2\]while
\[\sigma _ i(A _ 1) = \max _ {\substack{\dim \mathcal S = i, \\ \mathcal S \subseteq \mathrm{range}\begin{bmatrix} I \\ 0 \end{bmatrix}}} \min _ {\substack{\begin{bmatrix} x _ 1 \\ x _ 2 \end{bmatrix} \in \mathcal S, \\ \left\| \begin{bmatrix} x _ 1 \\ x _ 2 \end{bmatrix} \right\| _ 2 = 1}} \left\| \begin{bmatrix} A _ 1 & A _ 2 \end{bmatrix} \begin{bmatrix} x _ 1 \\ x _ 2 \end{bmatrix} \right\| _ 2,\]since restricting to $x _ 2 = 0$ gives $\begin{bmatrix}A _ 1 & A _ 2\end{bmatrix}\begin{bmatrix}x _ 1 \\ 0\end{bmatrix} = A _ 1 x _ 1$. The latter takes the maximum over a restricted set of subspaces $\mathcal S$, so the former is at least as big; the same argument applied to $A _ 2$ gives $\text{LHS} \ge \sigma _ i(A _ 2)$.
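Both block inequalities can be verified numerically for random matrices (a sketch using NumPy; the shapes and seed are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
A1 = rng.standard_normal((5, 4))
A2 = rng.standard_normal((5, 4))

def sv(M):
    """Singular values of M in descending order."""
    return np.linalg.svd(M, compute_uv=False)

s1, s2 = sv(A1), sv(A2)
s_stack = sv(np.vstack([A1, A2]))        # sigma_i of the stacked block [A1; A2]
s_concat = sv(np.hstack([A1, A2]))       # sigma_i of the concatenated block [A1  A2]

# check both inequalities for every shared index i
for i in range(4):
    assert s_stack[i] >= max(s1[i], s2[i]) - 1e-12
    assert s_concat[i] >= max(s1[i], s2[i]) - 1e-12
```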