Optimisation for Data Science HT25, Optimisation terminology


Flashcards

Minimisers

Consider the optimisation model

\[\min _ {x \in \mathbb R^n} f(x) \quad\text{ subject to }x \in \mathcal F\]

@Define what is meant by:

  • A local minimiser
  • A strict local minimiser
  • A global minimiser

$x^\ast \in \mathcal F$ is:

  • A local minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon (x^\ast)$ for some $\varepsilon > 0$.
  • A strict local minimiser if $f(x^\ast) < f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon(x^\ast)$
  • A global minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F$.

Suppose $f : \mathbb R^n \to \mathbb R \cup \{+\infty\}$ is a proper convex function. Consider the optimisation model

\[\min _ {x \in \mathbb R^n} f(x) \quad\text{ subject to }x \in \mathcal F\]

Recall that $x^\ast \in \mathcal F$ is:

  • A local minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon (x^\ast)$ for some $\varepsilon > 0$.
  • A strict local minimiser if $f(x^\ast) < f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon(x^\ast)$
  • A global minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F$.

@State three results about the minimisation of proper convex functions.


(1): $x^\ast$ is a local minimiser iff it is a global minimiser.


(2): The set of minimisers

\[\text{argmin} _ {x \in \mathbb R^n} f(x) := \\{x^\ast \mid f(x^\ast) = \min _ {x \in \mathbb R^n} f(x)\\}\]

is a convex set.


(3): $x^\ast \in \text{argmin} _ {x \in \mathbb R^n} f(x)$ if and only if $\pmb 0 \in \partial f(x^\ast)$.

Suppose $f : \mathbb R^n \to \mathbb R \cup \{+\infty\}$ is a proper convex function. Consider the optimisation model

\[\min _ {x \in \mathbb R^n} f(x) \quad\text{ subject to }x \in \mathcal F\]

Recall that $x^\ast \in \mathcal F$ is:

  • A local minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon (x^\ast)$ for some $\varepsilon > 0$.
  • A strict local minimiser if $f(x^\ast) < f(x)$ for all $x \in \mathcal F \cap B _ \varepsilon(x^\ast)$
  • A global minimiser if $f(x^\ast) \le f(x)$ for all $x \in \mathcal F$.

@Prove the following three results about the minimisation of proper convex functions:

  1. $x^\ast$ is a local minimiser iff it is a global minimiser.
  2. The set of minimisers $\text{argmin} _ {x \in \mathbb R^n} f(x) := \{x^\ast \mid f(x^\ast) = \min _ {x \in \mathbb R^n} f(x)\}$ is a convex set.
  3. $x^\ast \in \text{argmin} _ {x \in \mathbb R^n} f(x)$ if and only if $\pmb 0 \in \partial f(x^\ast)$.

For (1), note that every global minimiser is in particular a local minimiser.

Now assume there exist $x$ such that $x$ is a local minimiser but not a global minimiser. Then $\exists \varepsilon > 0$ such that $\forall x’ \in B _ \varepsilon(x)$ such that $f(x) \le f(x’)$, but there is some $y$ such that $f(y) < f(x)$.

Let

\[x_\varepsilon := \left(1 - \frac{\varepsilon}{2||y - x||}\right)x + \left( \frac{\varepsilon}{2||y - x||} \right)y\]

and note that $x _ \varepsilon \in B _ \varepsilon (x)$. But then we have

\[f(x_\varepsilon) \le \left(1 - \frac{\varepsilon}{2||y - x||}\right)f(x) + \left( \frac{\varepsilon}{2||y - x||} \right)f(y) < f(x)\]

and so $x$ is not a local minimiser, a contradiction.


(2, 3) are immediate from chasing definitions.

Convergence rates

Suppose:

  • $(\phi _ k) _ {k \in \mathbb N} \in \mathbb R _ +^\mathbb N$
  • $\phi _ k \to 0$ as $k \to \infty$.

@Define what it means for $(\phi _ k) _ {k \in \mathbb N}$ to converge:

  • Sublinearly
  • $R$-linearly
  • $Q$-linearly
  • $R$-superlinearly
  • $Q$-superlinearly
  • $Q$-quadratically

and state the implications between these convergence rates. Which rates guarantee decrease at every iteration?


  • Sublinearly: Slower than $R$-linearly
  • $R$-linearly: $\exists \sigma \in (0, 1)$ and $C > 0$ such that $\forall k \in \mathbb N$, $\phi _ k \le C(1 - \sigma)^k$.
  • $Q$-linearly: $\exists \sigma \in (0, 1)$ such that $\forall k \in \mathbb N$, $\frac{\phi _ {k+1} }{\phi _ k} \le 1 - \sigma$ (guarantees decrease)
  • $Q$-superlinearly: $\lim _ {k \to \infty} \frac{\phi _ {k+1} }{\phi _ k} = 0$
  • $R$-superlinearly: $\exists (\nu _ k) _ {k \in \mathbb N} \in \mathbb R _ +^{\mathbb N}$ such that $\nu _ k \to 0$ $Q$-superlinearly and $\forall k \in \mathbb N$, $\phi _ k \le \nu _ k$
  • $Q$-quadratically: $\exists C > 0$ such that $\forall k \in \mathbb N$, $\frac{\phi _ {k+1} }{\phi _ k^2} \le C$.

All the following implications are strictly one directional:

  • $R$-linearity is implied by
  • $Q$-linearity is implied by
  • $R$-superlinearity is implied by
  • $Q$-superlinearity is implied by
  • $Q$-quadraticity



Related posts