NLA MT25, QR algorithm


When I took this course, this section was non-examinable. My previous notes on the QR algorithm can be found here:

Notes

  • If $A$ is symmetric:
    • The reduction to Hessenberg form also makes $A$ tridiagonal (although in my previous notes, we only analyse the QR algorithm under the assumption $A$ is symmetric)
    • The QR steps for a tridiagonal matrix require only $O(n)$ steps rather than $O(n^2)$ steps.
    • There are efficient algorithms for tridiagonal eigenvalue problems.
    • Another possibility is using the Nakatsukasa-Freund-Higham algorithm
  • Connections to computing the SVD:
    • The SVD decomposition is closely related to the eigenvalue decomposition of symmetric matrices (since the singular values of $A$ are the square roots of the eigenvalues of $A^\top A$)
    • However in the SVD algorithm, $U$ and $V$ are allowed to be different so you can modify the algorithm to save some of the work that was used to make sure the matrices pre- and post- multiplying the eigenvalue matrix were the same
    • One popular algorithm for SVD inspired by the QR algorithm is Golub-Kahan
  • Generalised Notes - NLA MT25, Eigenvalue problemsU
    • These can be solved via a variant of the QR algorithm called the QZ algorithm.

Bite-sized

(All bite cards in this entry are marked **@nonexaminable**~ — the QR algorithm was non-examinable in MT25, but the lecture notes do cover it.)

The QR algorithm for $Ax = \lambda x$ iterates: factor $A _ k = Q _ k R _ k$, then set $A _ {k+1} = $ $R _ k Q _ k$ (QR-then-swap). The iterates are similar to $A$: $A _ {k+1} = Q _ k^\top A _ k Q _ k$, so the spectrum is preserved.

Source: Lecture 9, QR algorithm for eigenproblems slide and §9.1 Algorithm 9.1 of the lecture notes.

@bite~ @nonexaminable~

Why does the QR algorithm drive $A _ k$ to upper-triangular form?

Each QR step is implicitly running both a power-method-like and an inverse-power-method-like step under the hood. The first column of $Q^{(k)} = Q _ 1 \cdots Q _ k$ converges to the dominant eigenvector $v _ 1$, and the last column of $Q^{(k)}$ converges to the smallest eigenvector $v _ n$. So $A _ k(:,1) \to [\lambda _ 1, 0, \ldots, 0]^\top$ and $A _ k(n,:) \to [0, \ldots, 0, \lambda _ n]$. Iterating, all subdiagonal entries are driven to zero, producing the Schur form.

Source: Lecture 9, §9.1 of the lecture notes (“QR algorithm and power method” + “QR algorithm and inverse power method”).

@bite~ @nonexaminable~

The shifted QR algorithm uses shifts $s _ k$ (typically $A _ {nn}$ or the Wilkinson shift): factor $A _ k - s _ k I = Q _ k R _ k$, then $A _ {k+1} = R _ k Q _ k + s _ k I$. With Hessenberg preprocessing, the total cost is $\approx $ $25 n^3$ flops, with empirically $2$–$4$ iterations per eigenvalue.

Source: Lecture 9, QR algorithm with shifts slide and §9.1-9.2 of the lecture notes.

@bite~ @nonexaminable~

For symmetric $A$, Hessenberg reduction makes the matrix tridiagonal, and each QR step costs only $O(n)$ (vs $O(n^2)$ for non-symmetric Hessenberg). Total cost: $\tfrac{4}{3} n^3$ for eigenvalues alone, $\approx 10 n^3$ also including eigenvectors.

Source: Lecture 10, QR algorithm for symmetric $A$ slide and §10.1 of the lecture notes.

@bite~ @nonexaminable~

Why does the QR algorithm need Hessenberg reduction as preprocessing?

Without it, each QR step costs $O(n^3)$ (general QR factorisation of a dense matrix). With one-time Hessenberg reduction ($O(n^3)$), the Hessenberg structure is preserved by QR steps, and each step then costs only $O(n^2)$ via Givens rotations. Over the empirical $O(n)$ iterations needed for convergence, the total drops from $O(n^4)$ to $O(n^3)$.

Source: Lecture 9, QR algorithm preprocessing: Hessenberg reduction slide and §9.2 of the lecture notes.

@bite~ @nonexaminable~