Notes - Numerical Analysis HT24, Householder reflectors


Flashcards

Can you define the Householder reflector matrix $H(w) \in \mathbb R^{n \times n}$?


\[H(w) = I - \frac{2}{w^\top w} w w^\top\]

Quickly justify that the Householder reflector matrix

\[H(w) = I - \frac{2}{w^\top w} w w^\top\]

is symmetric and orthogonal.


Clearlly symmetric. Then

\[\begin{aligned} H(w)^\top H(w) &= \left(I - \frac{2}{w^\top w} w w^\top\right)\left(I - \frac{2}{w^\top w} w w^\top\right) \\\\ &= I - \frac{4}{w^\top w} w w^\top + \frac{4}{(w^\top w)^2} w(w^\top w) w^\top \\\\ &= I \end{aligned}\]

Quickly prove that given any $u \in \mathbb R^n$, there exists $w \in \mathbb R^n$ such that

\[H(w) u = \begin{pmatrix}\pm \sqrt{u^\top u} \\\\ 0 \\\\ \vdots \\\\ 0\end{pmatrix}\]

where $H(w)$ is the Householder reflector matrix defined as

\[H(w) = I - \frac{2}{w^\top w} w w^\top\]

Let $v$ be the above vector. Take any $w = \gamma(u - v)$ where $\gamma \ne 0$. Then

\[\begin{aligned} H(w) u &= \left(I - \frac{2}{w^\top w} w w^\top\right)u \\\\ &= u - \frac{2w^\top u}{w^\top w} w \end{aligned}\]

Now we aim to show that $w^\top w = 2\gamma w^\top u$. Note that $v^\top v = u^\top u$ (since Householder reflectors are orthogonal, so preserve lengths)

\[\begin{aligned} w^\top w &= \gamma^2 (u - v)^\top (u - v) \\\\ &= \gamma^2(u^\top u - 2u^\top v + v^\top v) \\\\ &= \gamma^2(2u^\top u - 2u^\top v) \\\\ &= 2\gamma u^\top (\gamma(u - v)) \\\\ &= 2\gamma u^\top w \\\\ &= 2\gamma w^\top u \end{aligned}\]

Hence

\[H(w) u = u - \frac 1 \gamma w = u - (u - v) = v\]

Alternative proof (slightly more messy but at least it’s all in one step): Let $w = \gamma(v - u)$ as before. Then

\[\begin{aligned} H(w)u &= \left(I - \frac{2\gamma(v - u)\gamma(v-u)^\top}{\gamma(v- u)^\top (v-u)}\right)u \\\\ &= \left(I - \frac{2(v-u)(v^\top - u^\top)}{(v^\top -u^\top)(v -u)}\right) u \\\\ &= \left(I - \frac{2(v v^\top - vu^\top - uv^\top + uu^\top)}{v^\top v - 2v^\top u + u^\top u}\right)u \\\\ &= u - \frac{2v v^\top u - 2v u^\top u - 2 u v^\top u + 2u u^\top u}{v^\top v - 2v^\top u + u^\top u} \\\\ &= u - \frac{vv^\top u - vu^\top u - u v^\top u + u u^\top u}{u^\top u - v^\top u} \\\\ &= u - \frac{(u - v)v^\top u - (u - v)u^\top u}{v^\top u - u^\top u} \\\\ &= u - (u - v) \frac{v^\top u - u^\top u}{v^\top u - u^\top u} \\\\ &= u - u + v \\\\ &= v \end{aligned}\]

Can you explain why Householder reflectors are called “reflectors”?


A reflection $H$ is a matrix where there exists an orthonormal change of basis matrix such that $Q$

\[\begin{aligned} H &= Q \begin{pmatrix} -1 & 0 & 0 & \cdots & 0 \\\\ 0 & 1 & 0 & \cdots & 0 \\\\ 0 & 0 & \ddots & \ddots & 0 \\\\ \vdots & \vdots & \ddots & 1 & 0 \\\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} Q^\top \end{aligned}\]

but any reflection can be written

\[\begin{aligned} H &= Q \begin{pmatrix} -1 & 0 & 0 & \cdots & 0 \\\\ 0 & 1 & 0 & \cdots & 0 \\\\ 0 & 0 & \ddots & \ddots & 0 \\\\ \vdots & \vdots & \ddots & 1 & 0 \\\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix}Q^\top \\\\ &= Q \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\\\ 0 & 1 & 0 & \cdots & 0 \\\\ 0 & 0 & \ddots & \ddots & 0 \\\\ \vdots & \vdots & \ddots & 1 & 0 \\\\ 0 & 0 & 0 & 0 & 1 \end{pmatrix} Q^\top - 2 Q \begin{pmatrix} 1 & 0 & 0 & \cdots & 0 \\\\ 0 & 0& 0 & \cdots & 0 \\\\ 0 & 0 & \ddots & \ddots & 0 \\\\ \vdots & \vdots & \ddots & 0 & 0 \\\\ 0 & 0 & 0 & 0 & 0 \end{pmatrix} Q^\top \\\\ &= I - 2ww^\top \end{aligned}\]

What is the flop cost of applying an $n \times n$ Householder reflector $H$ to some vector $v$ (including the dominant term)?


\[4n^2 + O(n)\]

Quick justification:

\[Hv = (I - 2ww^\top) v = v - w 2(w^\top v)\]

and $w^\top v$ is roughly $2n$ ($2n-1$ to be precise) flops, then $2(w^\top v)$ is $2n$ flops, then $w2(w^\top v)$ is $3n$ flops, and subtracting is $4n$ flops. Since this has to be done for each component, it is $4n^2$.

What is the flop cost of applying the QR algorithm to an $m \times n$ matrix (including the dominant terms)?


\[2mn^2 - \frac 2 3n^3 + O(\text{lower degree terms})\]

(if the matrix is square, this reduces to $4n^3/3$). Quick justification: if $m \ge n$ (the most likely case), then we are doing

\[\sum^{n-1}_{k =1} 4(m-k)(n-k) = \frac 4 3 mn^2 + O(\cdots)\]

work.




Related posts