NLA MT25, Rayleigh-Ritz algorithm


Flashcards

@Describe the Rayleigh-Ritz @algorithm. State the algorithm, the inputs and outputs, why it works, and what it costs.

Setup: Given a symmetric $A \in \mathbb R^{n \times n}$ and an orthonormal matrix $Q \in \mathbb R^{n \times k}$, the Rayleigh-Ritz algorithm finds approximate eigenpairs of $A$ within $Q$ (i.e. the subspace spanned by the columns of $Q$). Note that $Q$ is supplied. The closer the subspace $Q$ is to the subspace containing the true eigenpairs, the better the approximation is.

Algorithm:

  1. Project: Compute $Q^\top A Q \in \mathbb R^{k \times k}$.
  2. Solve the small eigenproblem: Find the decomposition $Q^\top A Q = V\hat \Lambda V^\top$, where $V \in \mathbb R^{k \times k}$ is orthogonal and $\hat \Lambda = \text{diag}(\hat \lambda _ 1, \ldots, \hat \lambda _ k)$.
  3. Lift back:
    • Return the Ritz values $\hat \lambda _ 1, \ldots, \hat \lambda _ k$ on the diagonal of $\hat \Lambda$, the approximate eigenvalues of $A$.
    • Return the Ritz vectors: the columns of $QV \in \mathbb R^{n \times k}$, the approximate eigenvectors of $A$.

Given a symmetric $A \in \mathbb R^{n \times n}$ and an orthonormal matrix $Q \in \mathbb R^{n \times k}$, the Rayleigh-Ritz algorithm finds approximate eigenpairs of $A$ within $Q$ via:

  1. Project: Compute $Q^\top A Q \in \mathbb R^{k \times k}$.
  2. Solve the small eigenproblem: Find the decomposition $Q^\top A Q = V\hat \Lambda V^\top$, where $V \in \mathbb R^{k \times k}$ is orthogonal and $\hat \Lambda = \text{diag}(\hat \lambda _ 1, \ldots, \hat \lambda _ k)$.
  3. Lift back:
    • Return the Ritz values $\hat \lambda _ 1, \ldots, \hat \lambda _ k$ on the diagonal of $\hat \Lambda$, the approximate eigenvalues of $A$.
    • Return the Ritz vectors: the columns of $QV \in \mathbb R^{n \times k}$, the approximate eigenvectors of $A$.

@Describe the variational characterisation of these eigenpairs as somehow the best eigenpairs living in $Q$.

By ∆courant-fischer-minmax-theorem applied to $Q^\top A Q$, we have

\[\hat \lambda _ i = \max _ {\mathcal T \subseteq \mathrm{range}(Q), \dim \mathcal T = i} \min _ {x \in \mathcal T \setminus \{0\}} \frac{x^\top A x}{x^\top x}.\]

In particular, the largest Ritz value satisfies

\[\hat \lambda _ 1 = \max _ {x \in \mathrm{range}(Q) \setminus \{0\}} \frac{x^\top A x}{x^\top x}\]

so the Ritz eigenpairs are the best Rayleigh-quotient maximiser within $\mathrm{range}(Q)$.

Bite-sized

For the Rayleigh-Ritz output to be exact (rather than approximate), $\mathrm{range}(Q)$ must be $A$-invariant. In that case the Ritz pairs are true eigenpairs of $A$ — this is what happens in the “happy breakdown” of Arnoldi/Lanczos.

Source: Lecture 13, §13.6 of the lecture notes (post-Algorithm-13.2 discussion); also lecture 11, §11.3 for happy breakdown.

@bite~

Cauchy interlacing for Ritz values from a $k$-dim subspace: $\lambda _ {n - k + i}(A) \le \hat \lambda _ i \le $ $\lambda _ i(A)$. So Ritz values always sandwich between corresponding edges of $A$’s spectrum.

Source: Lecture 13, §13.6 of the lecture notes (Courant-Fischer applied to $Q^\top A Q$).

@bite~

What is the Ritz residual $\|A \hat v _ i - \hat \lambda _ i \hat v _ i\| _ 2$ used for?

As a stopping criterion: when this residual falls below a tolerance, the Ritz pair $(\hat \lambda _ i, \hat v _ i)$ is accepted as a converged approximate eigenpair. For symmetric $A$, the residual is also a backward error for the eigenvalue: there exists a symmetric perturbation $E$ with $\|E\| _ 2 = \|A \hat v - \hat \lambda \hat v\| _ 2$ such that $\hat \lambda$ is an exact eigenvalue of $A + E$. Combined with Weyl’s inequality, $ \vert \hat \lambda - \lambda \vert \le \|E\| _ 2$.

Source: Lecture 13, §13.6 Algorithm 13.2 of the lecture notes (residual mention in stopping criterion).

@bite~