NLA MT25, Numerical stability


Flashcards

@Define $\text{fl}(X)$ in the context of numerical stability.


The computed, floating-point version of $X$.

@State bounds on $\text{fl}(x + y)$ and $\text{fl}(xy)$ where $x, y$ are scalars.


  • $\text{fl}(x + y) = x + y + c\epsilon$, where $ \vert c \vert \le \max( \vert x \vert , \vert y \vert )$
  • $\text{fl}(xy) = xy + c\epsilon$, where $ \vert c \vert \le \vert xy \vert $ (a quick numerical check of both bounds follows below)
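
A minimal sketch of that check (assuming NumPy and treating a longer-precision computation as exact, neither of which is part of the notes; the notes' $\epsilon$ absorbs small constant factors):

```python
import numpy as np

# Machine epsilon for double precision; we only check orders of magnitude.
eps = np.finfo(np.float64).eps

rng = np.random.default_rng(0)
x, y = rng.uniform(-1.0, 1.0, size=2)

# Treat longdouble arithmetic as a stand-in for exact arithmetic.
exact_sum = np.longdouble(x) + np.longdouble(y)
exact_prod = np.longdouble(x) * np.longdouble(y)

err_sum = abs((x + y) - exact_sum)    # |fl(x + y) - (x + y)|
err_prod = abs((x * y) - exact_prod)  # |fl(xy) - xy|

print(err_sum <= max(abs(x), abs(y)) * eps)  # bound from the first bullet
print(err_prod <= abs(x * y) * eps)          # bound from the second bullet
```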

Suppose we have a problem where we have $x$ and want to find $y = f(x)$. @Define the absolute conditioning $\kappa$ and the relative conditioning $\kappa _ r$ of the problem.


\[\kappa := \lim _ {\epsilon \to 0^+} \sup _ { \vert \vert \delta x \vert \vert < \epsilon} \frac{ \vert \vert f(x + \delta x) - f(x) \vert \vert }{ \vert \vert \delta x \vert \vert }\] \[\kappa _ r := \lim _ {\epsilon \to 0^+} \sup _ { \vert \vert \delta x \vert \vert < \epsilon} \frac{\frac{ \vert \vert f(x + \delta x) - f(x) \vert \vert }{ \vert \vert f(x) \vert \vert }}{\frac{ \vert \vert \delta x \vert \vert }{ \vert \vert x \vert \vert }}\]
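
A standard worked example (not from the lectures; the $\infty$-norm on the input is a choice made here): take $f(x) = x _ 1 - x _ 2$ on $\mathbb{R}^2$, so $f(x + \delta x) - f(x) = \delta x _ 1 - \delta x _ 2$ and

\[\kappa = \sup _ {\delta x \ne 0} \frac{ \vert \delta x _ 1 - \delta x _ 2 \vert }{ \vert \vert \delta x \vert \vert _ \infty} = 2, \qquad \kappa _ r = \kappa \cdot \frac{ \vert \vert x \vert \vert _ \infty}{ \vert x _ 1 - x _ 2 \vert } = \frac{2 \max( \vert x _ 1 \vert , \vert x _ 2 \vert )}{ \vert x _ 1 - x _ 2 \vert },\]

so subtracting nearly equal numbers is ill-conditioned in the relative sense (cancellation) even though its absolute conditioning is harmless.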

Suppose we have a problem where we have $x$ and want to find $y = f(x)$. @Define what it means for $f$ to output a “backward stable” solution in this case.


The algorithm outputs $\hat y = f(x + \Delta x)$ for some small $\Delta x$ (we hope $ \vert \vert \Delta x \vert \vert / \vert \vert x \vert \vert = O(\epsilon)$, where $\epsilon$ is machine precision).
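
A standard example (not from the notes): floating-point subtraction is backward stable, since under the rounding model

\[\text{fl}(x - y) = (x - y)(1 + \delta) = x(1 + \delta) - y(1 + \delta), \quad \vert \delta \vert \le \epsilon,\]

i.e. the computed result is the exact difference of inputs each perturbed relatively by at most $\epsilon$. Backward stability says nothing about conditioning, so a backward stable algorithm can still return an inaccurate answer on an ill-conditioned problem.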

Suppose we have a problem where we have $x$ and want to find $y = f(x)$. @Define what it means for $f$ to be forward stable.


\[\frac{ \vert \vert y - \hat y \vert \vert }{ \vert \vert y \vert \vert } = O(\kappa(f) \epsilon)\]

@Define the condition number of a matrix $A$.


\[\kappa _ 2(A) = \frac{\sigma _ \max(A)}{\sigma _ \min(A)}\]
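
A minimal sketch (assuming NumPy, which the notes don't mention) of computing $\kappa _ 2$ from the singular values:

```python
import numpy as np

# kappa_2(A) = sigma_max / sigma_min, computed from the SVD and
# cross-checked against NumPy's built-in 2-norm condition number.
rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

sigma = np.linalg.svd(A, compute_uv=False)  # singular values, descending
kappa2 = sigma[0] / sigma[-1]

print(kappa2, np.linalg.cond(A, 2))  # the two values should agree
```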

@State a theorem bounding the relative forward error of a backward stable solution to $Ax = b$ in terms of the condition number of $A$.


Suppose we have a backward stable solution for $Ax = b$ such that:

  • $(A + \Delta A)\hat x = b$
  • $ \vert \vert \Delta A \vert \vert \le \epsilon \vert \vert A \vert \vert $
  • $\kappa _ 2(A) \ll \epsilon^{-1}$

Then $ \vert \vert A^{-1} \Delta A \vert \vert \ll 1$ and we have

\[\frac{ \vert \vert \hat x - x \vert \vert }{ \vert \vert x \vert \vert } \le \epsilon \kappa _ 2(A) + O(\epsilon^2)\]
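
A rough numerical illustration (a sketch assuming NumPy, treating `np.linalg.solve`, i.e. LU with partial pivoting, as backward stable in practice; the constructed matrices and parameters are illustrative, not from the notes): the observed relative forward error should stay within a modest multiple of $\kappa _ 2(A)\epsilon$ and grow with $\kappa _ 2(A)$.

```python
import numpy as np

rng = np.random.default_rng(0)
eps = np.finfo(np.float64).eps
n = 50

for target_kappa in [1e2, 1e6, 1e10]:
    # Build A with a prescribed condition number via its SVD.
    U, _ = np.linalg.qr(rng.standard_normal((n, n)))
    V, _ = np.linalg.qr(rng.standard_normal((n, n)))
    sigma = np.logspace(0.0, -np.log10(target_kappa), n)
    A = U @ np.diag(sigma) @ V.T

    x = rng.standard_normal(n)
    b = A @ x
    x_hat = np.linalg.solve(A, b)

    rel_err = np.linalg.norm(x_hat - x) / np.linalg.norm(x)
    print(f"kappa ~ {target_kappa:.0e}: rel err = {rel_err:.1e}, "
          f"kappa * eps = {target_kappa * eps:.1e}")
```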

@Prove that if we have a backward stable solution for $Ax = b$ such that:

  • $(A + \Delta A)\hat x = b$
  • $ \vert \vert \Delta A \vert \vert \le \epsilon \vert \vert A \vert \vert $
  • $\kappa _ 2(A) \ll \epsilon^{-1}$

Then $ \vert \vert A^{-1} \Delta A \vert \vert \ll 1$ and we have

\[\frac{ \vert \vert \hat x - x \vert \vert }{ \vert \vert x \vert \vert } \le \epsilon \kappa _ 2(A) + O(\epsilon^2)\]

\[\begin{aligned} \vert \vert A^{-1} \Delta A \vert \vert &\le \vert \vert A^{-1} \vert \vert \cdot \vert \vert \Delta A \vert \vert \\ &\le \vert \vert A^{-1} \vert \vert \cdot \epsilon \vert \vert A \vert \vert \\ &= \epsilon \vert \vert A^{-1} \vert \vert \cdot \vert \vert A \vert \vert \\ &= \epsilon \frac{\sigma _ \max(A)}{\sigma _ \min(A)} \\ &\ll \epsilon \epsilon^{-1} \\ &= 1 \end{aligned}\]

Then, by the Neumann series, we obtain

\[\begin{aligned} (A + \Delta A)^{-1} &= (A(I + A^{-1} \Delta A))^{-1} \\ &= (I - A^{-1} \Delta A + O( \vert \vert A^{-1} \Delta A \vert \vert ^2)) A^{-1} \\ \end{aligned}\]

Therefore,

\[\begin{aligned} \hat x &= (A + \Delta A)^{-1}b \\ &= A^{-1}b - A^{-1} \Delta A A^{-1} b + O( \vert \vert A^{-1}\Delta A \vert \vert ^2) \\ &= x - A^{-1}\Delta Ax + O( \vert \vert A^{-1} \Delta A \vert \vert ^2) \end{aligned}\]

Hence

\[\begin{aligned} \vert \vert x - \hat x \vert \vert &\lesssim \vert \vert A^{-1} \Delta A x \vert \vert \\ &\le \vert \vert A^{-1} \vert \vert \, \vert \vert \Delta A \vert \vert \, \vert \vert x \vert \vert \\ &\le \epsilon \vert \vert A \vert \vert \, \vert \vert A^{-1} \vert \vert \, \vert \vert x \vert \vert \\ &= \epsilon \kappa _ 2(A) \vert \vert x \vert \vert \end{aligned}\]

as required.

(This effectively justifies why we define the condition number of a matrix like we do).

@Justify why if $Y = f(X)$ is computed in a backward stable manner, i.e.

\[\hat Y = f(X + \Delta X), \quad \vert \vert \Delta X \vert \vert = \epsilon\]

and we have the absolute conditioning $ \vert \vert f(X) - f(X + \Delta X) \vert \vert \lesssim \kappa \vert \vert \Delta X \vert \vert = \kappa \epsilon$, then

\[\vert \vert \hat Y - Y \vert \vert \lesssim \kappa \epsilon\]

\[\begin{aligned} \vert \vert \hat Y - Y \vert \vert &= \vert \vert f(X + \Delta X) - f(X) \vert \vert \\ &\lesssim \kappa \vert \vert \Delta X \vert \vert \\ &= \kappa \epsilon \end{aligned}\]

Stability of matrix and vector operations

@State bounds on $\text{fl}(y^\top x)$ and $\text{fl}(Ax)$ where $x, y$ are vectors and $A$ is a matrix, and then state a non-bound.


  • $\text{fl}(y^\top x) = (y + \Delta y)^\top x$ for some small $\Delta y$, and as a consequence
  • $\text{fl}(Ax) = (A + \Delta A)x$ for some small $\Delta A$ (illustrated in the sketch below), but
  • $\text{fl}(AB) \ne (A + \Delta A)(B + \Delta B)$ for small $\Delta A$, $\Delta B$ in general (so in particular matrix multiplication is not backward stable)
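
A sketch of the second point (assuming NumPy; the product is computed in single precision and the double-precision product is treated as exact, which is not part of the notes): given the residual $r = \text{fl}(Ax) - Ax$, the rank-one perturbation $\Delta A = r x^\top / (x^\top x)$ satisfies $(A + \Delta A)x = \text{fl}(Ax)$ exactly, and its relative size is a modest multiple of machine epsilon.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
A = rng.standard_normal((n, n)).astype(np.float32)
x = rng.standard_normal(n).astype(np.float32)

Ax_fl = A @ x                                     # fl(Ax) in float32
Ax = A.astype(np.float64) @ x.astype(np.float64)  # reference product
r = Ax_fl.astype(np.float64) - Ax

# Explicit backward error: (A + Delta_A) x = Ax + r = fl(Ax).
Delta_A = np.outer(r, x) / np.dot(x, x)
rel_backward_err = np.linalg.norm(Delta_A, 2) / np.linalg.norm(A, 2)
print(rel_backward_err / np.finfo(np.float32).eps)  # modest multiple of 1
```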

@State a bound on the forward error of matrix multiplication.


Suppose:

  • $A$, $B$ are matrices

Then:

\[\vert \vert \text{fl}(AB) - AB \vert \vert \le \epsilon \vert \vert A \vert \vert \, \vert \vert B \vert \vert\]

and so consequently

\[\frac{ \vert \vert \text{fl}(AB) - AB \vert \vert }{ \vert \vert AB \vert \vert } \le \epsilon \min(\kappa _ 2(A), \kappa _ 2(B))\]
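
A sketch checking the first bound numerically (assuming NumPy; the product is computed in single precision and compared with a double-precision reference, and the notes' $\epsilon$ absorbs dimension-dependent constants):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 300
A = rng.standard_normal((n, n)).astype(np.float32)
B = rng.standard_normal((n, n)).astype(np.float32)

AB_fl = A @ B                                     # fl(AB) in float32
AB = A.astype(np.float64) @ B.astype(np.float64)  # reference product

err = np.linalg.norm(AB_fl - AB, 2)
scale = (np.linalg.norm(A, 2) * np.linalg.norm(B, 2)
         * np.finfo(np.float32).eps)
print(err / scale)  # a modest, dimension-dependent constant
```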

@Prove that if $A$, $B$ are matrices, then we have a bound on the forward error of matrix multiplication:

\[\vert \vert \text{fl}(AB) - AB \vert \vert \le \epsilon \vert \vert A \vert \vert \, \vert \vert B \vert \vert\]

and so consequently

\[\frac{ \vert \vert \text{fl}(AB) - AB \vert \vert }{ \vert \vert AB \vert \vert } \le \epsilon \min(\kappa _ 2(A), \kappa _ 2(B))\]

@todo, once done problem sheet.

Given that if $A$, $B$ are matrices, then we have the following bound on the forward error of matrix multiplication:

\[\vert \vert \text{fl}(AB) - AB \vert \vert \le \epsilon \vert \vert A \vert \vert \, \vert \vert B \vert \vert\]

@Prove that orthogonal multiplication is backward stable.


Suppose $A = Q$ is orthogonal. Then, since $ \vert \vert Q \vert \vert _ 2 = 1$, the forward-error bound gives

\[\vert \vert \text{fl}(QB) - QB \vert \vert \le \epsilon \vert \vert B \vert \vert\]

so it follows that $\text{fl}(QB) = QB + E$ for some $E$ with $ \vert \vert E \vert \vert \le \epsilon \vert \vert B \vert \vert $. Defining $\Delta B = Q^\top E$, we have $ \vert \vert \Delta B \vert \vert = \vert \vert E \vert \vert \le \epsilon \vert \vert B \vert \vert $ (the $2$-norm is invariant under orthogonal multiplication) and $\text{fl}(QB) = Q(B + \Delta B)$, so the computation is backward stable.
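
A sketch of the same argument numerically (assuming NumPy; $Q$ comes from a QR factorisation and is orthogonal only up to working precision, and the double-precision product is treated as exact):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
Q = Q.astype(np.float32)                 # orthogonal up to float32 rounding
B = rng.standard_normal((n, n)).astype(np.float32)

QB_fl = Q @ B                                     # fl(QB) in float32
QB = Q.astype(np.float64) @ B.astype(np.float64)  # reference product
E = QB_fl - QB                                    # forward error

# Backward error: Delta_B = Q^T E, so Q (B + Delta_B) ~= fl(QB)
# (exactly so, if Q were exactly orthogonal) and ||Delta_B|| = ||E||.
Delta_B = Q.astype(np.float64).T @ E
ratio = np.linalg.norm(Delta_B, 2) / (np.finfo(np.float32).eps
                                      * np.linalg.norm(B, 2))
print(ratio)  # modest, dimension-dependent constant
```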

Why are algorithms involving ill-conditioned matrices unstable, and what do several algorithms do to get around this?


General matrix multiplication is not backward stable, and multiplying by an ill-conditioned matrix amplifies relative errors by its condition number. Multiplication by an orthogonal matrix is backward stable and $\kappa _ 2(Q) = 1$, so stable algorithms are built around multiplication by orthogonal matrices.
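
A small illustration of why this helps (a sketch assuming NumPy): orthogonal matrices are perfectly conditioned and preserve 2-norms, so multiplying by them never amplifies errors.

```python
import numpy as np

rng = np.random.default_rng(0)
Q, _ = np.linalg.qr(rng.standard_normal((100, 100)))  # random orthogonal Q
x = rng.standard_normal(100)

print(np.linalg.cond(Q, 2))                       # ~ 1: kappa_2(Q) = 1
print(np.linalg.norm(Q @ x) / np.linalg.norm(x))  # ~ 1: ||Qx|| = ||x||
```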



