NLA MT25, Rayleigh-Ritz algorithm
Flashcards
@Describe the Rayleigh-Ritz @algorithm. State the algorithm, the inputs and outputs, why it works, and what it costs.
Setup: Given a symmetric $A \in \mathbb R^{n \times n}$ and an orthonormal matrix $Q \in \mathbb R^{n \times k}$, the Rayleigh-Ritz algorithm finds approximate eigenpairs of $A$ within $Q$ (i.e. the subspace spanned by the columns of $Q$). Note that $Q$ is supplied. The closer the subspace $Q$ is to the subspace containing the true eigenpairs, the better the approximation is.
Algorithm:
- Project: Compute $Q^\top A Q \in \mathbb R^{k \times k}$.
- Solve the small eigenproblem: Find the decomposition $Q^\top A Q = V\hat \Lambda V^\top$, where $V \in \mathbb R^{k \times k}$ is orthogonal and $\hat \Lambda = \text{diag}(\hat \lambda _ 1, \ldots, \hat \lambda _ k)$.
- Lift back:
- Return the Ritz values $\hat \lambda _ 1, \ldots, \hat \lambda _ k$ on the diagonal of $\hat \Lambda$, the approximate eigenvalues of $A$.
- Return the Ritz vectors: the columns of $QV \in \mathbb R^{n \times k}$, the approximate eigenvectors of $A$.
Given a symmetric $A \in \mathbb R^{n \times n}$ and an orthonormal matrix $Q \in \mathbb R^{n \times k}$, the Rayleigh-Ritz algorithm finds approximate eigenpairs of $A$ within $Q$ via:
- Project: Compute $Q^\top A Q \in \mathbb R^{k \times k}$.
- Solve the small eigenproblem: Find the decomposition $Q^\top A Q = V\hat \Lambda V^\top$, where $V \in \mathbb R^{k \times k}$ is orthogonal and $\hat \Lambda = \text{diag}(\hat \lambda _ 1, \ldots, \hat \lambda _ k)$.
- Lift back:
- Return the Ritz values $\hat \lambda _ 1, \ldots, \hat \lambda _ k$ on the diagonal of $\hat \Lambda$, the approximate eigenvalues of $A$.
- Return the Ritz vectors: the columns of $QV \in \mathbb R^{n \times k}$, the approximate eigenvectors of $A$.
@Describe the variational characterisation of these eigenpairs as somehow the best eigenpairs living in $Q$.
- Return the Ritz values $\hat \lambda _ 1, \ldots, \hat \lambda _ k$ on the diagonal of $\hat \Lambda$, the approximate eigenvalues of $A$.
- Return the Ritz vectors: the columns of $QV \in \mathbb R^{n \times k}$, the approximate eigenvectors of $A$.
By ∆courant-fischer-minmax-theorem applied to $Q^\top A Q$, we have
\[\hat \lambda _ i = \max _ {\mathcal T \subseteq \mathrm{range}(Q), \dim \mathcal T = i} \min _ {x \in \mathcal T \setminus \{0\}} \frac{x^\top A x}{x^\top x}.\]In particular, the largest Ritz value satisfies
\[\hat \lambda _ 1 = \max _ {x \in \mathrm{range}(Q) \setminus \{0\}} \frac{x^\top A x}{x^\top x}\]so the Ritz eigenpairs are the best Rayleigh-quotient maximiser within $\mathrm{range}(Q)$.
Bite-sized
For the Rayleigh-Ritz output to be exact (rather than approximate), $\mathrm{range}(Q)$ must be $A$-invariant. In that case the Ritz pairs are true eigenpairs of $A$ — this is what happens in the “happy breakdown” of Arnoldi/Lanczos.
Cauchy interlacing for Ritz values from a $k$-dim subspace: $\lambda _ {n - k + i}(A) \le \hat \lambda _ i \le $ $\lambda _ i(A)$. So Ritz values always sandwich between corresponding edges of $A$’s spectrum.
What is the Ritz residual $\|A \hat v _ i - \hat \lambda _ i \hat v _ i\| _ 2$ used for?
As a stopping criterion: when this residual falls below a tolerance, the Ritz pair $(\hat \lambda _ i, \hat v _ i)$ is accepted as a converged approximate eigenpair. For symmetric $A$, the residual is also a backward error for the eigenvalue: there exists a symmetric perturbation $E$ with $\|E\| _ 2 = \|A \hat v - \hat \lambda \hat v\| _ 2$ such that $\hat \lambda$ is an exact eigenvalue of $A + E$. Combined with Weyl’s inequality, $ \vert \hat \lambda - \lambda \vert \le \|E\| _ 2$.