Numerical Analysis HT24, QR decomposition
Flashcards
qr-factorisation-definitionWhat is the (full) $QR$ factorisation of a matrix $A \in \mathbb R^{m \times n}$?
\(A = QR\) where $Q \in \mathbb R^{m \times m}$ is orthogonal and $R \in \mathbb R^{m \times n}$ is upper-triangular.
qr-factorisation-types-detailedSuppose we have a matrix $A \in \mathbb R^{m \times n}$. By breaking down the cases $m \le n$ and $m \ge n$, what are all the possible types of QR factorisation, and what are the dimensions of the matrices involved?
“Fat QR”: When $m \le n$, $A = QR$, $Q \in \mathbb R^{m \times m}$ and $R \in \mathbb R^{m \times n}$:

When $m \ge n$,
“Full QR”: $A = QR$, $Q \in \mathbb R^{m \times m}$ and $R \in \mathbb R^{m \times n}$, $Q$ is made to be square (and hence $Q$ is an orthogonal matrix):

“Thin QR”: $A = QR$, $Q \in \mathbb R^{m \times n}$, $R \in \mathbb R^{n \times n}$, $Q$ is made to be the same size as the original matrix (and hence $Q$ has orthogonal columns, but isn’t an orthogonal matrix):

full-vs-thin-qr-relationshipSuppose we have a matrix $A \in \mathbb R^{m \times n}$ where $m > n$ (i.e. it is a tall and skinny matrix). What is the difference between the full QR and the thin QR, and how are they related?
For the full QR, we have a factorisation
\[A = QR = \begin{bmatrix}Q _ 1 & Q _ 2\end{bmatrix} \begin{bmatrix}R _ 1 \\\\ 0\end{bmatrix}\]where $Q$ is an $m \times m$ matrix (i.e. $Q$ is made to be square, so it is orthogonal), and $R$ is an $m \times n$ matrix whose top $n \times n$ block $R _ 1$ is upper-triangular.
In a thin QR, we have
\[A = Q'R'\]where $Q'$ is the same size as $A$, namely $m \times n$, and $R'$ is $n \times n$.
They are related by the fact that
\[A = Q _ 1 R _ 1\]is the thin QR factorisation, i.e. $Q' = Q _ 1$ and $R' = R _ 1$.
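A quick numerical illustration of this relationship, sketched with NumPy (`np.linalg.qr` exposes both factorisations via its `mode` argument; the sizes $m = 5$, $n = 3$ are just for illustration):

```python
import numpy as np

# A tall, skinny matrix (m = 5 > n = 3).
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 3))

# Full QR: Q is 5x5 orthogonal, R is 5x3 with zero rows below row 3.
Q_full, R_full = np.linalg.qr(A, mode="complete")

# Thin QR: Q is 5x3 with orthonormal columns, R is 3x3 upper-triangular.
Q_thin, R_thin = np.linalg.qr(A, mode="reduced")

assert Q_full.shape == (5, 5) and R_full.shape == (5, 3)
assert Q_thin.shape == (5, 3) and R_thin.shape == (3, 3)

# The thin factorisation sits inside the full one: the first n columns of
# the full Q times the top n x n block of the full R reconstruct A.
assert np.allclose(Q_full[:, :3] @ R_full[:3, :], A)
assert np.allclose(Q_thin @ R_thin, A)
```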
q-perp-notationThe full QR of an $m \times n$ matrix $A$ is sometimes written
\[A = \begin{bmatrix}Q & Q _ \perp\end{bmatrix} \begin{bmatrix}R \\\\ 0\end{bmatrix}\]
Can you explain the notation $Q _ \perp$?
$Q _ \perp$ is an “orthogonal extension” of $Q$: its columns are orthonormal and orthogonal to the columns of $Q$, so that $\begin{bmatrix}Q & Q _ \perp\end{bmatrix}$ is a square orthogonal matrix.
why-thin-q-not-orthogonalSuppose we have a thin QR of an $m \times n$ matrix $A$ (so $m > n$):
\[A = QR\]
why is it not accurate to describe $Q$ as an orthogonal matrix?
Because orthogonal matrices are square matrices satisfying the property $Q^\top Q = I$, but $Q$ here is an $m \times n$ matrix with orthonormal columns.
q-transpose-q-equals-identitySuppose $Q$ is an $m \times n$ matrix with orthonormal columns. How can you get the identity from $Q$?
\(Q^\top Q = I\) but \(QQ^\top \ne I\) in general.
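A quick NumPy check of both facts (the sizes $m = 5$, $n = 3$ are just illustrative):

```python
import numpy as np

# Build a 5x3 Q with orthonormal columns via a thin QR of a random matrix.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.standard_normal((5, 3)))

assert np.allclose(Q.T @ Q, np.eye(3))       # Q^T Q = I_n always holds
assert not np.allclose(Q @ Q.T, np.eye(5))   # Q Q^T != I_m since m > n
# Q Q^T is instead the orthogonal projector onto the column space of Q.
assert np.allclose(Q @ Q.T @ Q, Q)
```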
square-q-qq-transpose-equals-identitySuppose that $Q$ is an $m \times m$ matrix (so it is square) with orthonormal columns. Is it true that $QQ^\top = I$?
Yes: since $Q$ is square, $Q^\top Q = I$ implies $Q^\top = Q^{-1}$, so $QQ^\top = I$ too. (Though this is not true if $Q$ is not square.)
qr-factorisation-uniquenessQuickly prove that the QR decomposition of an invertible matrix $A \in \mathbb R^{n \times n}$ is unique.
Suppose $A = Q _ 1 R _ 1$ and $A = Q _ 2 R _ 2$. Then \(Q _ 2^{-1}Q _ 1 = R _ 2 R _ 1^{-1}\) Since $R _ 1^{-1}$ is also upper triangular, $R _ 2 R _ 1^{-1}$ is upper triangular. But $Q _ 2^{-1} Q _ 1$ is also orthogonal, so both sides must be simultaneously upper-triangular and orthogonal. An upper-triangular orthogonal matrix must be diagonal with $\pm 1$ diagonal entries ($\star$); under the convention that the diagonal entries of $R _ 1, R _ 2$ are positive, the diagonal of $R _ 2 R _ 1^{-1}$ is positive, so both sides equal the identity and we are done.
($\star$: This is not immediate and needs proof).
Via Householder
See also [[Notes - Numerical Analysis HT24, Householder reflectors]].
qr-with-householderHow can Householder matrices $H(w)$ be used to find a QR factorisation of a matrix?
We can use a Householder reflector to turn the first column of the matrix into a vector of the form \(\begin{pmatrix}\alpha \\ 0 \\ \vdots \\ 0\end{pmatrix}\) Continuing by induction on the trailing submatrix, and embedding each subsequent reflector in a block matrix of the form \(\begin{pmatrix}1 & 0 \\ 0 & H(w)\end{pmatrix}\) (which leaves the rows and columns already processed alone), we can annihilate all the sub-diagonal entries. The product of these orthogonal matrices applied to $A$ is $R$, and their transposed product gives $Q$.
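A minimal NumPy sketch of this procedure under the sign convention $w = x + \operatorname{sign}(x _ 1)\lVert x \rVert e _ 1$ (the function name `householder_qr` and the details are our own illustrative choices, not from the notes):

```python
import numpy as np

def householder_qr(A):
    """Teaching sketch of QR via Householder reflectors (not LAPACK-grade)."""
    A = A.astype(float).copy()
    m, n = A.shape
    Q = np.eye(m)
    for k in range(n):
        x = A[k:, k]
        # w = x + sign(x_1) * ||x|| * e_1 defines the reflector H(w);
        # the sign choice avoids cancellation when forming w.
        w = x.copy()
        w[0] += np.copysign(np.linalg.norm(x), x[0])
        norm_w = np.linalg.norm(w)
        if norm_w == 0:  # column already zero below the diagonal
            continue
        w /= norm_w
        # Apply H(w) = I - 2 w w^T to the trailing submatrix...
        A[k:, k:] -= 2.0 * np.outer(w, w @ A[k:, k:])
        # ...and accumulate Q by multiplying the embedded reflector in
        # from the right (only columns k onwards are affected).
        Q[:, k:] -= 2.0 * np.outer(Q[:, k:] @ w, w)
    return Q, A  # A has been overwritten with R (zeros up to rounding)

A = np.array([[2.0, 1.0], [1.0, 3.0], [0.0, 1.0]])
Q, R = householder_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(Q.T @ Q, np.eye(3))
assert np.allclose(np.tril(R, -1), 0.0)
```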
Via Gram-Schmidt
qr-with-gram-schmidtHow could you compute the QR factorisation of a matrix $A$ using the Gram-Schmidt process?
Apply the Gram-Schmidt process to the columns of $A$; the resulting orthonormal vectors are the columns of $Q$. Then $R$ can either be obtained by recording the coefficients computed during the process (the projection coefficients off the diagonal, the normalisation constants on it), or via the identity \(R = Q^\top A\)
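A minimal sketch of this in NumPy, recording the Gram-Schmidt coefficients directly into $R$ (the function name is our own; this is the modified variant, which subtracts each projection immediately but produces the same factorisation in exact arithmetic):

```python
import numpy as np

def gram_schmidt_qr(A):
    """Thin QR of a full-rank A via Gram-Schmidt on its columns."""
    A = A.astype(float)
    m, n = A.shape
    Q = np.zeros((m, n))
    R = np.zeros((n, n))
    for j in range(n):
        v = A[:, j].copy()
        for i in range(j):
            R[i, j] = Q[:, i] @ v    # coefficient of q_i in column a_j
            v -= R[i, j] * Q[:, i]   # subtract the projection onto q_i
        R[j, j] = np.linalg.norm(v)  # normalisation constant
        Q[:, j] = v / R[j, j]
    return Q, R

A = np.array([[1.0, 1.0], [1.0, 0.0], [0.0, 1.0]])
Q, R = gram_schmidt_qr(A)
assert np.allclose(Q @ R, A)
assert np.allclose(R, Q.T @ A)  # the identity R = Q^T A
```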
Bite-sized
For full-rank $A$, the (thin) QR factorisation $A = QR$ is unique once we fix the signs of the diagonal entries of $R$ (e.g. requiring them positive). Without that constraint there are $2^n$ valid choices.
Source: Existing card ^qr-factorisation-uniqueness in this entry; also Problem Sheet 2, Q4(c) of NLA MT25.
@bite~
bite-na-householder-vs-gram-schmidtWhy is the Householder QR preferred over Gram-Schmidt for computing a QR factorisation in practice?
Classical Gram-Schmidt loses orthogonality in floating-point: $\|\hat Q^\top \hat Q - I\| _ 2 \le \epsilon \kappa _ 2(A)^2$ (up to constants), so for ill-conditioned $A$ the computed $\hat Q$ can be far from orthogonal. Householder QR is unconditionally backward stable: $\|\hat Q^\top \hat Q - I\| _ 2 = O(\epsilon)$, regardless of $\kappa _ 2(A)$. The cost is essentially the same.
Source: NLA MT25 cards ^householder-qr-backward-stability-statement and §7.7 of the NLA lecture notes.
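The loss of orthogonality can be seen on the classic Läuchli example, where classical Gram-Schmidt fails completely while a Householder-based QR (what LAPACK, and hence `np.linalg.qr`, uses) stays orthogonal to machine precision (the function name `cgs_q` is our own; a sketch, not a benchmark):

```python
import numpy as np

def cgs_q(A):
    """Q factor from classical Gram-Schmidt (numerically fragile)."""
    m, n = A.shape
    Q = np.zeros((m, n))
    for j in range(n):
        # CGS: project a_j against all previously computed q_i at once.
        v = A[:, j] - Q[:, :j] @ (Q[:, :j].T @ A[:, j])
        Q[:, j] = v / np.linalg.norm(v)
    return Q

# Lauchli matrix: eps is small enough that 1 + eps**2 rounds to 1 in
# double precision, so CGS cancels away almost all the information.
eps = 1e-8
A = np.array([[1.0, 1.0, 1.0],
              [eps, 0.0, 0.0],
              [0.0, eps, 0.0],
              [0.0, 0.0, eps]])

Q_cgs = cgs_q(A)
Q_hh, _ = np.linalg.qr(A)  # Householder-based

err_cgs = np.linalg.norm(Q_cgs.T @ Q_cgs - np.eye(3))
err_hh = np.linalg.norm(Q_hh.T @ Q_hh - np.eye(3))
assert err_hh < 1e-12  # Householder: orthogonal to machine precision
assert err_cgs > 0.1   # CGS: orthogonality is lost completely
```

For this matrix CGS produces two computed columns with inner product $1/2$, so the failure is not a small-constant effect.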
@bite~
For $A \in \mathbb R^{m \times n}$ with $m \ge n$, the full QR has $Q \in \mathbb R^{m \times m}$ and $R \in \mathbb R^{m \times n}$ (zeros below row $n$). The thin QR has $Q \in \mathbb R^{m \times n}$ and $R \in \mathbb R^{n \times n}$. They are related by $A = QR = Q _ 1 R _ {\text{thin}}$ where $Q _ 1$ is the first $n$ columns of the full $Q$.
Source: Existing card ^full-vs-thin-qr-relationship in this entry; also Lecture 6, §6.3 of NLA MT25 lecture notes.
@bite~