Notes - Numerical Analysis HT24, Best approximation in inner product spaces
Most of the results in this section sound like they apply only to function spaces like $L^2 _ w(a, b)$, but really they apply to all inner product spaces. This means you can use them to derive results about e.g. weighted least squares by using the $\langle v - w, u \rangle$ characterisation of best approximations.
Flashcards
Define the vector space $L^2 _ w(a, b)$.
\[L^2 _ w(a, b) := \left\\{ f : (a, b) \to \mathbb R \;\middle|\; \int^b_a w(x) |f(x)|^2 \,\text d x < \infty \right\\}\]
where $w : \mathbb R \to \mathbb R _ {>0}$ is a weight function.
What inner product $\langle f, g \rangle$ do you use when doing least-squares approximation?
\[\langle f, g \rangle := \int^b_a w(x) f(x) g(x) \,\text d x\]
When $w(x) = 1$, this is “unweighted” least squares.
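As a quick illustration, here is a minimal Python sketch (assuming NumPy/SciPy are available; the functions and interval are arbitrary choices) that evaluates this inner product by quadrature and checks an orthogonality relation in the unweighted case:

```python
# Weighted L^2 inner product <f, g> = ∫_a^b w(x) f(x) g(x) dx,
# evaluated by adaptive quadrature.
from scipy.integrate import quad

def inner(f, g, w, a, b):
    val, _ = quad(lambda x: w(x) * f(x) * g(x), a, b)
    return val

one = lambda x: 1.0  # the unweighted case w(x) = 1

# Legendre polynomials P_1(x) = x and P_2(x) = (3x^2 - 1)/2 are
# orthogonal with respect to w(x) = 1 on (-1, 1):
print(inner(lambda x: x, lambda x: (3 * x**2 - 1) / 2, one, -1, 1))  # ≈ 0
print(inner(lambda x: x, lambda x: x, one, -1, 1))                   # = 2/3
```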
Suppose we have a function $f \in L _ w^2(a, b)$ which we want to least-squares approximate with a polynomial $p _ n \in \Pi _ n$. What result gives you necessary and sufficient conditions for $p _ n$ to be the best approximation?
\[\langle f - p_n, r \rangle = 0 \quad \forall r \in \Pi_n\]
iff
\[||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]
Suppose we have an element $v$ of $(V, \langle \cdot, \cdot \rangle)$, an inner product space, which we want to approximate, in the norm induced by the inner product, by an element $w$ of a subspace $W \le V$. What result gives you necessary and sufficient conditions for $w$ to be the best approximation to $v$ in this subspace?
\[\langle v - w, u \rangle = 0 \quad \forall u \in W\]
iff
\[||v-w|| \le ||v-u|| \quad \forall u \in W\]
Suppose:
- $f \in L _ w^2(a, b)$
- $p _ n \in \Pi _ n$
Quickly prove that
\[\langle f - p_n, r\rangle = 0 \quad \forall r \in \Pi_n\]
implies
\[||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]
Consider
\[\begin{aligned} ||f-p_n||^2_2 &= \langle f - p_n, f - p_n \rangle \\\\ &= \langle f - p_n, f - r\rangle + \langle f - p_n, r - p_n \rangle && (1\star)\\\\ &= \langle f - p_n, f-r\rangle && (2\star) \\\\ &\le ||f - p_n||_2 ||f - r||_2 &&(3\star) \end{aligned}\]Dividing through by $||f - p_n||_2$ (the result is immediate if this is zero) gives the result. $(1\star)$ is just adding and subtracting $r$, $(2\star)$ uses the hypothesis, since $r - p_n \in \Pi_n$, and $(3\star)$ is the Cauchy-Schwarz inequality.
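This characterisation is easy to sanity-check numerically. A Python sketch (assuming NumPy/SciPy; $f = \exp$, $w = 1$ on $(-1, 1)$ and $n = 3$ are arbitrary choices): build $p_n$ by orthogonal projection onto Legendre polynomials, so the orthogonality hypothesis holds, then confirm that perturbed competitors in $\Pi_n$ never do better:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre
from scipy.integrate import quad

f = np.exp
n = 3

# Projection coefficients c_k = <f, P_k> / <P_k, P_k> in the Legendre basis.
coeffs = []
for k in range(n + 1):
    Pk = Legendre.basis(k)
    num, _ = quad(lambda x: f(x) * Pk(x), -1, 1)
    den, _ = quad(lambda x: Pk(x) ** 2, -1, 1)
    coeffs.append(num / den)
p = Legendre(coeffs)

def norm2(g):
    val, _ = quad(lambda x: g(x) ** 2, -1, 1)
    return np.sqrt(val)

# The hypothesis <f - p, r> = 0 holds for each basis monomial of Π_n:
for i in range(n + 1):
    print(quad(lambda x: (f(x) - p(x)) * x**i, -1, 1)[0])  # all ≈ 0

# ...and no perturbed competitor r in Π_n beats p:
rng = np.random.default_rng(0)
for _ in range(3):
    r = Legendre(coeffs + rng.normal(scale=0.1, size=n + 1))
    print(norm2(lambda x: f(x) - p(x)) <= norm2(lambda x: f(x) - r(x)))  # True
```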
Suppose:
\[f \in L _ w^2(a, b)\]
and we are trying to approximate it by some polynomial function
\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]
Using the fact that
\[\langle f - p_n, r\rangle = 0 \quad \forall r \in \Pi_n \quad \implies \quad ||f - p_n||_2 \le ||f- r||_2 \quad \forall r \in \Pi_n\]
Quickly derive the “normal equations”, a linear system, expressed using a matrix, for $\pmb \alpha$, the vector containing the coefficients.
By the fact, we want
\[\int^b_a w(x) \left(f(x) - \sum^n_{k = 0} \alpha_k x^k\right) r(x) \text d x = 0\]for all polynomials $r(x)$. But this is true iff
\[\int^b_a w(x) \left(f(x) - \sum^n_{k = 0} \alpha_k x^k\right) x^i \text d x = 0\]for all $i = 0, 1, \cdots, n$ (since the condition is linear in $r$, it suffices to check it on the basis monomials $x^0, \cdots, x^n$ of $\Pi_n$). Then, rewriting
\[\begin{aligned} &\int^b_a w(x) f(x) x^i \text d x = \int^b_a w(x) \sum^n_{k = 0} \alpha_k x^k x^i \text d x \\\\ \implies & \sum^n_{k = 0} \left(\int^b_a w(x) x^{k+i} \text dx\right)\alpha_k = \int^b_a w(x) f(x) x^i \text d x \\\\ \implies & A\pmb \alpha = \pmb \phi \end{aligned}\]
where
\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\]
\[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\]
\[\pmb \phi_i = \int^b_a w(x) f(x) x^i \,\text dx\]
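Concretely, here is a short Python sketch of assembling and solving the normal equations (assuming NumPy/SciPy; $w(x) = 1$ on $(0, 1)$ and $f(x) = e^x$ are just illustrative choices). In this case $A_{ik} = \int^1_0 x^{i+k} \,\text dx = 1/(i+k+1)$:

```python
# Assemble and solve the normal equations A α = φ for the
# unweighted least-squares fit of f(x) = e^x on (0, 1).
import numpy as np
from scipy.integrate import quad

n = 3
A = np.array([[1.0 / (i + k + 1) for k in range(n + 1)]
              for i in range(n + 1)])      # A_ik = ∫_0^1 x^{i+k} dx
phi = np.array([quad(lambda x, i=i: np.exp(x) * x**i, 0, 1)[0]
                for i in range(n + 1)])    # φ_i = ∫_0^1 f(x) x^i dx

alpha = np.linalg.solve(A, phi)            # coefficients of the best fit
p = np.polynomial.Polynomial(alpha)
print(p(0.5), np.exp(0.5))                 # close pointwise as well
```

Incidentally, this particular $A$ is the Hilbert matrix, which is invertible but notoriously ill-conditioned; this is one reason orthogonal polynomial bases are preferred over the monomial basis in practice.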
Suppose:
\[f \in L _ w^2(a, b)\]
and we are trying to approximate it by some polynomial function
\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]
What are the normal equations?
\[A \pmb \alpha = \pmb \phi\]
where
\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\]
\[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\]
\[\pmb \phi_i = \int^b_a w(x) f(x) x^i \,\text dx\]
Suppose:
\[f \in L _ w^2(a, b)\]
and we are trying to approximate it by some polynomial function
\[p_n(x) = \sum^n_{k = 0} \alpha_k x^k\]
Then we can form the normal equations for the coefficients, i.e.
\[A \pmb \alpha = \pmb \phi\]
where
\[(A)_{ik} := \int^b_a w(x) x^{k+i} \text d x\]
\[\pmb \alpha = (\alpha_0, \cdots, \alpha_n)^\top\]
\[\pmb \phi_i = \int^b_a w(x) f(x) x^i \,\text dx\]
Quickly prove that this matrix is always invertible and hence that the best approximation is unique.
Key idea: Consider $\pmb \alpha^\top A \pmb \alpha = 0$. Then expanding the definitions gives the result.
Suppose for a contradiction that $A$ is singular. Then we have the following chain of deductions:
\[\begin{aligned} &\exists \pmb \alpha \ne \pmb 0 \text{ s.t. } A\pmb\alpha = \pmb 0 \\\\ \implies& \pmb \alpha^\top A \pmb \alpha = 0 \\\\ \iff& \sum^n_ {i = 0} \alpha_i (A\pmb\alpha)_ i = 0 \\\\ \iff& \sum^n_ {i = 0} \alpha_i \left(\sum^n_ {k = 0} A_ {ik} \alpha_k\right) = 0 \\\\ \iff& \sum^n_ {i = 0} \alpha_i \left(\sum^n_ {k = 0}\left( \int^b_ a w(x) x^{i+k} \,\text dx \right)\alpha_ k\right) = 0 \\\\ \iff& \int^b_ a w(x) \left(\sum^n_ {i = 0} \alpha_ i x^i\right) \left(\sum^n_ {k = 0} \alpha_ k x^k\right) \text d x = 0 \\\\ \iff& \int^b_ a w(x) \left(\sum^n_ {i = 0} \alpha_i x^i\right)^2 \text d x = 0 \end{aligned}\]But then, since $w > 0$ and the integrand is nonnegative, $\sum^n _ {i = 0} \alpha _ i x^i$ must vanish on all of $(a, b)$; a nonzero polynomial of degree $\le n$ has at most $n$ roots, so $\pmb \alpha = \pmb 0$, a contradiction.
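The key identity in this proof, $\pmb \alpha^\top A \pmb \alpha = \int^b_a w(x) \left(\sum^n_{i=0} \alpha_i x^i\right)^2 \text dx$, can also be checked numerically. A small Python sketch (assuming NumPy/SciPy; again taking $w = 1$ on $(0, 1)$ for illustration, so $A$ is the Hilbert matrix):

```python
# Check α^T A α = ∫_0^1 (Σ_i α_i x^i)^2 dx for a random α ≠ 0,
# with w(x) = 1 on (0, 1) so that A_ik = 1/(i + k + 1).
import numpy as np
from scipy.integrate import quad

n = 4
A = np.array([[1.0 / (i + k + 1) for k in range(n + 1)]
              for i in range(n + 1)])

rng = np.random.default_rng(1)
alpha = rng.normal(size=n + 1)
p = np.polynomial.Polynomial(alpha)

quad_form = alpha @ A @ alpha
integral, _ = quad(lambda x: p(x) ** 2, 0, 1)
print(quad_form, integral)  # equal, and strictly positive
```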
Suppose:
- $C[-1, 1]$ is the set of all real-valued continuous functions on $[-1, 1]$
- We want to find the best approximation to $u(x)$ with the inner product $\langle f, g \rangle = \int^1 _ {-1} f(x) g(x) \,\text dx$
- $u$ is an odd function
What’s the quick idea behind deducing that the best approximation will also be an odd function?
Note that since $\int^1 _ {-1} x^{i+k} \,\text dx = 0$ whenever $i + k$ is odd, $A$ has a checkerboard-like structure of zeros, and since $u$ is odd, $\pmb \phi _ i = \int^1 _ {-1} u(x) x^i \,\text dx = 0$ for every even $i$. Rearranging, the normal equations split into two decoupled systems: the odd-indexed coefficients solve a system with a nonzero right-hand side, while the even-indexed coefficients solve a homogeneous system with an invertible matrix, so they are all zero and the best approximation is odd.
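A quick numerical demo of this (a Python sketch assuming NumPy/SciPy; the odd function $u(x) = \sin(\pi x)$ and $n = 5$ are arbitrary choices): the even-indexed coefficients of the solution come out numerically zero, so the best approximation is odd.

```python
# Best L^2 approximation of the odd function u(x) = sin(πx) on [-1, 1]:
# the normal-equations matrix has a checkerboard of zeros, and the
# even-indexed coefficients of the solution vanish.
import numpy as np
from scipy.integrate import quad

n = 5

def entry(i, k):
    # ∫_{-1}^{1} x^{i+k} dx = 2/(i+k+1) if i+k is even, else 0.
    return 2.0 / (i + k + 1) if (i + k) % 2 == 0 else 0.0

A = np.array([[entry(i, k) for k in range(n + 1)] for i in range(n + 1)])
phi = np.array([quad(lambda x, i=i: np.sin(np.pi * x) * x**i, -1, 1)[0]
                for i in range(n + 1)])

alpha = np.linalg.solve(A, phi)
print(np.round(alpha, 8))  # entries at indices 0, 2, 4 are ≈ 0
```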