Notes - Numerical Analysis HT24, Best approximation in inner product spaces


Most of the results in this section sound as though they apply only to function spaces like $L^2 _ w(a, b)$, but really they apply to all inner product spaces. This means you can use them to derive results about e.g. weighted least squares via the $\langle v - w, u \rangle = 0$ characterisation of best approximations.

Flashcards

Define the vector space $L^2 _ w(a, b)$.


\[\\{f : [a,b] \to \mathbb R \mid \int^b_a w(x) |f(x)|^2 \text d x < \infty\\}\]

where $w : (a, b) \to \mathbb R _ {>0}$ is a fixed weight function.

What inner product $\langle f, g \rangle$ do you use when doing least-squares approximation?


\[\langle f, g \rangle = \int^b_a w(x) f(x) g(x) \text d x\]

When $w(x) = 1$, this is “unweighted” least squares.
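
As a minimal numerical sketch (added here, not part of the original notes; the function names are illustrative): you can approximate this inner product with `scipy.integrate.quad`.

```python
from scipy.integrate import quad

def inner(f, g, w, a, b):
    """Weighted L^2 inner product <f, g> = int_a^b w(x) f(x) g(x) dx."""
    value, _ = quad(lambda x: w(x) * f(x) * g(x), a, b)
    return value

# Unweighted (w = 1) example on [0, 1]: <x, x^2> = int_0^1 x^3 dx = 1/4.
print(inner(lambda x: x, lambda x: x**2, lambda x: 1.0, 0.0, 1.0))
```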

Suppose we have a function $f \in L _ w^2(a, b)$ which we want to least-squares approximate with a polynomial $p _ n \in \Pi _ n$. What result gives you necessary and sufficient conditions for $p _ n$ to be the best approximation?


\[\langle f - p_n, r\rangle = 0 \quad \forall r \in \Pi_n\]

iff

\[||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]
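
For a concrete instance (an example added here, not in the original notes): take $w(x) = 1$ on $[-1, 1]$ and $f(x) = x^2$. Then $p _ 1(x) = \frac 1 3$ satisfies

\[\langle f - p_1, 1 \rangle = \int^1_{-1} \left(x^2 - \frac 1 3\right) \text d x = 0, \qquad \langle f - p_1, x \rangle = \int^1_{-1} \left(x^3 - \frac x 3\right) \text d x = 0\]

so by the characterisation, $p _ 1$ is the best approximation to $f$ in $\Pi _ 1$.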

Suppose we have an element $v$ of an inner product space $(V, \langle \cdot, \cdot \rangle)$, which we want to approximate, in the norm induced by the inner product, by an element $w$ of a subspace $W \le V$. What result gives you necessary and sufficient conditions for $w$ to be the best approximation to $v$ in this subspace?


\[\langle v - w, u\rangle = 0 \quad \forall u \in W\]

iff

\[||v-w|| \le ||v-u|| \quad \forall u \in W\]
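
To make the remark at the top of these notes concrete (a sketch added here, with made-up variable names): in $\mathbb R^m$ with inner product $\langle u, v \rangle = u^\top W v$, where $W = \operatorname{diag}(w _ 1, \cdots, w _ m)$ and each $w _ i > 0$, the characterisation says the best approximation $X \pmb \beta$ to $b$ from the column space of $X$ has residual $W$-orthogonal to every column, i.e. $X^\top W (b - X \pmb \beta) = 0$.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))        # design matrix
b = rng.normal(size=50)             # target vector
w = rng.uniform(0.5, 2.0, size=50)  # positive weights

# Solve the weighted normal equations X^T W X beta = X^T W b.
W = np.diag(w)
beta = np.linalg.solve(X.T @ W @ X, X.T @ W @ b)

# Verify orthogonality: the residual is W-orthogonal to each column of X.
print(X.T @ W @ (b - X @ beta))  # approximately zero
```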

Suppose:

  • $f \in L _ w^2(a, b)$
  • $p _ n \in \Pi _ n$

Quickly prove that

\[\langle f - p_n, r\rangle = 0 \quad \forall r \in \Pi_n\]

implies

\[||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]

Consider

\[\begin{aligned} ||f-p_n||^2_2 &= \langle f - p_n, f - p_n \rangle \\\\ &= \langle f - p_n, f - r\rangle + \langle f - p_n, r - p_n \rangle && (1\star)\\\\ &= \langle f - p_n, f-r\rangle && (2\star) \\\\ &\le ||f - p_n||_2 ||f - r||_2 &&(3\star) \end{aligned}\]

Dividing through by $||f - p _ n|| _ 2$ (if it is zero, the result is immediate) implies the result. $(1\star)$ is just adding and subtracting $r$, $(2\star)$ uses the hypothesis since $r - p _ n \in \Pi _ n$, and $(3\star)$ is the Cauchy-Schwarz inequality.
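
Continuing the example above (again, added here rather than taken from the notes): with $f(x) = x^2$ and $p _ 1(x) = \frac 1 3$ on $[-1, 1]$,

\[||f - p_1||_2^2 = \int^1_{-1} \left(x^2 - \frac 1 3\right)^2 \text d x = \frac{8}{45}\]

which is indeed smaller than e.g. $||f - 0|| _ 2^2 = \int^1 _ {-1} x^4 \text d x = \frac 2 5 = \frac{18}{45}$.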

Suppose:

\[f \in L _ w^2(a, b)\]

and we are trying to approximate it by some polynomial function

\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]

Using the fact that

\[\langle f - p_n, r\rangle = 0 \quad \forall r \in \Pi_n \quad \implies \quad ||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]

Quickly derive the “normal equations”, a linear system expressed using a matrix for $\pmb \alpha$, the vector containing the coefficients.


By the fact above, we want

\[\int^b_a w(x) \left(f(x) - \sum^n_{k = 0} \alpha_k x^k\right) r(x) \text d x = 0\]

for all polynomials $r(x)$. But this is true iff

\[\int^b_a w(x) \left(f(x) - \sum^n_{k = 0} \alpha_k x^k\right) x^i \text d x = 0\]

for all $i = 0, 1, \cdots, n$ (since the monomials $x^0, \cdots, x^n$ form a basis of $\Pi _ n$, and the condition is linear in $r$). Then, rewriting

\[\begin{aligned} &\int^b_a w(x) f(x) x^i \text d x = \int^b_a w(x) \sum^n_{k = 0} \alpha_k x^k x^i \text d x \\\\ \implies & \sum^n_{k = 0} \left(\int^b_a w(x) x^{k+i} \text dx\right)\alpha_k = \int^b_a w(x) f(x) x^i \text d x \\\\ \implies & A\pmb \alpha = \pmb \phi \end{aligned}\]

where

\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\] \[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\] \[\pmb \phi_i = \int^b_a w(x) f(x) x^i \text dx\]
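
As a sketch of using these in practice (added here, not from the notes; takes $w(x) = 1$ on $[0, 1]$, in which case $A _ {ik} = \frac{1}{i + k + 1}$ is the Hilbert matrix, and approximates $f(x) = e^x$ by a cubic):

```python
import numpy as np
from scipy.integrate import quad

n = 3
f = np.exp

# A_{ik} = int_0^1 x^{i+k} dx = 1 / (i + k + 1)  (the Hilbert matrix).
A = np.array([[1.0 / (i + k + 1) for k in range(n + 1)]
              for i in range(n + 1)])

# phi_i = int_0^1 f(x) x^i dx, computed by quadrature.
phi = np.array([quad(lambda x, i=i: f(x) * x**i, 0.0, 1.0)[0]
                for i in range(n + 1)])

alpha = np.linalg.solve(A, phi)
print(alpha)  # coefficients of the best cubic, lowest degree first
```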

Suppose:

\[f \in L _ w^2(a, b)\]

and we are trying to approximate it by some polynomial function

\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]

What are the normal equations?


\[A \pmb \alpha = \pmb \phi\]

where

\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\] \[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\] \[\pmb \phi_i = \int^b_a w(x) f(x) x^i \text dx\]

Suppose:

\[f \in L _ w^2(a, b)\]

and we are trying to approximate it by some polynomial function

\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]

Then we can form the normal equations for the coefficients, i.e.

\[A \pmb \alpha = \pmb \phi\]

where

\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\] \[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\] \[\pmb \phi_i = \int^b_a w(x) f(x) x^i \text dx\]

Quickly prove that this matrix is always invertible and hence that the best approximation is unique.


Key idea: Consider $\pmb \alpha^\top A \pmb \alpha = 0$. Then expanding the definitions gives the result.

Suppose for a contradiction that $A$ is singular. Then we have the following chain of deductions:

\[\begin{aligned} &\exists \pmb \alpha \ne \pmb 0 \text{ s.t. } A\pmb\alpha = \pmb 0 \\\\ \implies& \pmb \alpha^\top A \pmb \alpha = 0 \\\\ \iff& \sum^n_ {i = 0} \alpha_i (A\pmb\alpha)_ i = 0 \\\\ \iff& \sum^n_ {i = 0} \alpha_i \left(\sum^n_ {k = 0} A_ {ik} \alpha_k\right) = 0 \\\\ \iff& \sum^n_ {i =0} \alpha_i \left(\sum^n_ {k = 0}\left( \int^b_ a w(x) x^{i+k} \text d x \right)\alpha_ k\right) = 0 \\\\ \iff& \int^b_ a w(x) \left(\sum^n_ {i = 0} \alpha_ i x^i\right) \left(\sum^n_ {k = 0} \alpha_ k x^k\right) \text d x = 0 \\\\ \iff& \int^b_ a w(x) \left(\sum^n_ {i = 0} \alpha_i x^i\right)^2 \text d x = 0 \end{aligned}\]

But then, since $w(x) > 0$, the polynomial $\sum^n _ {i = 0} \alpha _ i x^i$ must be identically $0$, so $\pmb \alpha = \pmb 0$, a contradiction.
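
A practical caveat (an observation added here, not a claim made in the notes): although $A$ is always invertible, in the monomial basis it is typically very ill-conditioned. For $w(x) = 1$ on $[0, 1]$ it is exactly the Hilbert matrix, whose condition number grows rapidly with $n$; this is one motivation for using orthogonal polynomial bases instead.

```python
import numpy as np

# Condition number of A_{ik} = 1 / (i + k + 1) (w = 1 on [0, 1]) as n grows.
for n in [2, 5, 10]:
    A = np.array([[1.0 / (i + k + 1) for k in range(n + 1)]
                  for i in range(n + 1)])
    print(n, np.linalg.cond(A))
```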

Suppose:

  • $C[-1, 1]$ is the set of all real-valued continuous functions on $[-1, 1]$
  • We want to find the best approximation to $u(x)$ by a polynomial $p _ n \in \Pi _ n$, with the inner product $\langle f, g \rangle = \int^1 _ {-1} f(x) g(x) \text d x$
  • $u$ is an odd function

What’s the quick idea behind deducing that the best approximation will also be an odd function?


Note that with $w(x) = 1$ on $[-1, 1]$, the matrix of the normal equations has a checkerboard-like structure: $A _ {ik} = \int^1 _ {-1} x^{i+k} \text d x = 0$ whenever $i + k$ is odd. Since $u$ is odd, $\pmb \phi _ i = 0$ for even $i$. So you can rearrange the system into two decoupled ones: the even-index coefficients satisfy a homogeneous system with an invertible matrix (hence are all zero), while the odd-index coefficients satisfy the system with the nonzero right-hand side. Hence the best approximation is odd.
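
A small sketch of this structure (illustrative, added here; takes $w(x) = 1$ on $[-1, 1]$ in the monomial basis):

```python
import numpy as np

# A_{ik} = int_{-1}^{1} x^{i+k} dx = 2 / (i + k + 1) if i + k is even, else 0.
n = 4
A = np.array([[2.0 / (i + k + 1) if (i + k) % 2 == 0 else 0.0
               for k in range(n + 1)] for i in range(n + 1)])
print(A)  # nonzero entries form a checkerboard pattern
```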



