NLA MT25, QR algorithm
When I took this course, this section was non-examinable. My previous notes on the QR algorithm can be found here:
Notes
- If $A$ is symmetric:
- The reduction to Hessenberg form also makes $A$ tridiagonal (although in my previous notes, we only analyse the QR algorithm under the assumption $A$ is symmetric)
- The QR steps for a tridiagonal matrix require only $O(n)$ steps rather than $O(n^2)$ steps.
- There are efficient algorithms for tridiagonal eigenvalue problems.
- Another possibility is using the Nakatsukasa-Freund-Higham algorithm
- Connections to computing the SVD:
- The SVD decomposition is closely related to the eigenvalue decomposition of symmetric matrices (since the singular values of $A$ are the square roots of the eigenvalues of $A^\top A$)
- However in the SVD algorithm, $U$ and $V$ are allowed to be different so you can modify the algorithm to save some of the work that was used to make sure the matrices pre- and post- multiplying the eigenvalue matrix were the same
- One popular algorithm for SVD inspired by the QR algorithm is Golub-Kahan
- Generalised Notes - NLA MT25, Eigenvalue problemsU
- These can be solved via a variant of the QR algorithm called the QZ algorithm.
Bite-sized
(All bite cards in this entry are marked **@nonexaminable**~ — the QR algorithm was non-examinable in MT25, but the lecture notes do cover it.)
The QR algorithm for $Ax = \lambda x$ iterates: factor $A _ k = Q _ k R _ k$, then set $A _ {k+1} = $ $R _ k Q _ k$ (QR-then-swap). The iterates are similar to $A$: $A _ {k+1} = Q _ k^\top A _ k Q _ k$, so the spectrum is preserved.
Why does the QR algorithm drive $A _ k$ to upper-triangular form?
Each QR step is implicitly running both a power-method-like and an inverse-power-method-like step under the hood. The first column of $Q^{(k)} = Q _ 1 \cdots Q _ k$ converges to the dominant eigenvector $v _ 1$, and the last column of $Q^{(k)}$ converges to the smallest eigenvector $v _ n$. So $A _ k(:,1) \to [\lambda _ 1, 0, \ldots, 0]^\top$ and $A _ k(n,:) \to [0, \ldots, 0, \lambda _ n]$. Iterating, all subdiagonal entries are driven to zero, producing the Schur form.
The shifted QR algorithm uses shifts $s _ k$ (typically $A _ {nn}$ or the Wilkinson shift): factor $A _ k - s _ k I = Q _ k R _ k$, then $A _ {k+1} = R _ k Q _ k + s _ k I$. With Hessenberg preprocessing, the total cost is $\approx $ $25 n^3$ flops, with empirically $2$–$4$ iterations per eigenvalue.
For symmetric $A$, Hessenberg reduction makes the matrix tridiagonal, and each QR step costs only $O(n)$ (vs $O(n^2)$ for non-symmetric Hessenberg). Total cost: $\tfrac{4}{3} n^3$ for eigenvalues alone, $\approx 10 n^3$ also including eigenvectors.
Why does the QR algorithm need Hessenberg reduction as preprocessing?
Without it, each QR step costs $O(n^3)$ (general QR factorisation of a dense matrix). With one-time Hessenberg reduction ($O(n^3)$), the Hessenberg structure is preserved by QR steps, and each step then costs only $O(n^2)$ via Givens rotations. Over the empirical $O(n)$ iterations needed for convergence, the total drops from $O(n^4)$ to $O(n^3)$.