Optimisation for Data Science HT25, Motivation and examples


Flashcards

General setup

@State:

  • The general setup of a data analysis problem,
  • What a loss function typically looks like for a data fitting problem, and
  • Some examples of how you could interpret different data analysis problems in this framework

  • General setup:
    • Data set $D = \{(a _ j, y _ j) \mid j = 1, \ldots, m\} \subseteq V \times W$ where $V$ is a vector space of features and $W$ is a space of observations
    • Parametric model: $\phi(a; x) : V \to W$, a feature-observation relation parameterised by a vector $x \in \mathbb R^n$.
  • Typical loss function:
    • Want to find $x \in \mathbb R^n$ such that $\phi(a _ j; x) \approx y _ j$ for each $j$, so solve $\min _ {x \in \mathbb R^n} f(x)$ where
    • $f(x) = \frac 1 m \sum^m _ {j = 1} \ell(a _ j, y _ j; x)$ and $\ell$ measures the mismatch between $\phi(a _ j; x)$ and $y _ j$
  • Interpretations:
    • Regression: $W = \mathbb R$
    • Classification: $W = \{1, \ldots, M\}$
    • Clustering, dimensionality reduction: $W = \emptyset$ (no observations, only features)
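
The averaged loss $f(x)$ in this setup can be sketched in a few lines of Python. This is only an illustration: the names `empirical_risk`, `linear_model` and `squared_loss` are hypothetical, and splitting $\ell(a _ j, y _ j; x)$ into a model and a pointwise loss is an assumption made here for readability, not something the notes prescribe.

```python
import numpy as np

def empirical_risk(x, A, y, model, loss):
    """Average loss f(x) = (1/m) * sum_j loss(model(a_j, x), y_j)."""
    m = len(y)
    return sum(loss(model(A[j], x), y[j]) for j in range(m)) / m

# Hypothetical choices for illustration: a linear model and a squared loss.
def linear_model(a, x):
    return a @ x

def squared_loss(prediction, observation):
    return 0.5 * (prediction - observation) ** 2

# Toy data set: m = 3 feature vectors in V = R^2, observations in W = R.
A = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

# x = (1, 2) reproduces every observation exactly, so its average loss is zero.
f_exact = empirical_risk(np.array([1.0, 2.0]), A, y, linear_model, squared_loss)
# x = 0 predicts 0 everywhere, so its average loss is 0.5 * (1 + 4 + 9) / 3.
f_zero = empirical_risk(np.zeros(2), A, y, linear_model, squared_loss)
```

Swapping in a different `model` or `loss` gives other instances of the same framework without changing `empirical_risk`.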

Regression

Can you formalise regression with an intercept as a data analysis problem in the standard framework where:

  • General setup:
    • Data set $D = \{(a _ j, y _ j) \mid j = 1, \ldots, m\} \subseteq V \times W$ where $V$ is a vector space of features and $W$ is a space of observations
    • Parametric model: $\phi(a; x) : V \to W$, a feature-observation relation parameterised by a vector $x \in \mathbb R^n$.
  • Typical loss function:
    • Want to find $x \in \mathbb R^n$ such that $\phi(a _ j; x) \approx y _ j$ for each $j$, so solve $\min _ {x \in \mathbb R^n} f(x)$ where
    • $f(x) = \frac 1 m \sum^m _ {j = 1} \ell(a _ j, y _ j; x)$ and $\ell$ measures the mismatch between $\phi(a _ j; x)$ and $y _ j$

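Take $V = \mathbb R^{n}$, $W = \mathbb R$ and $\phi(a; \tilde x) = a^\top x + \beta = \tilde a^\top \tilde x$, with objective

\[\min_{\tilde x \in \mathbb R^{n+1}} \frac{1}{2m} \sum^m_{j = 1} (\tilde a_j^\top \tilde x - y_j)^2 = \min_{\tilde x \in \mathbb R^{n+1}} \frac{1}{2m} \|A\tilde x - y\|^2\]

where $\tilde x = {x \choose \beta}$ stacks the weights $x \in \mathbb R^n$ and the intercept $\beta \in \mathbb R$, $\tilde a _ j = {a _ j \choose 1}$, $A \in \mathbb R^{m \times (n+1)}$ has rows $\tilde a _ j^\top$, and $y = (y _ 1, \ldots, y _ m)^\top$.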
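
As a rough numerical sketch of this formulation, the intercept can be absorbed by appending a column of ones to the feature matrix and solving the resulting least-squares problem. The synthetic data, the variable names and the use of `np.linalg.lstsq` are assumptions chosen for illustration; the notes do not specify a solver.

```python
import numpy as np

# Synthetic, noiseless data (an assumption for illustration): y_j = a_j^T x + beta.
rng = np.random.default_rng(0)
m, n = 50, 3
A_features = rng.normal(size=(m, n))   # rows are the feature vectors a_j in R^n
x_true = np.array([2.0, -1.0, 0.5])
beta_true = 4.0
y = A_features @ x_true + beta_true

# Augment each a_j with a trailing 1 so that a_tilde_j^T x_tilde = a_j^T x + beta.
A_tilde = np.hstack([A_features, np.ones((m, 1))])

# Minimise (1 / 2m) * ||A_tilde x_tilde - y||^2 over x_tilde in R^(n+1).
# The 1 / 2m factor does not change the minimiser, so a plain least-squares solve suffices.
x_tilde, *_ = np.linalg.lstsq(A_tilde, y, rcond=None)
x_hat, beta_hat = x_tilde[:n], x_tilde[n]

# Objective value at the computed minimiser.
objective = np.sum((A_tilde @ x_tilde - y) ** 2) / (2 * m)
```

Because the data here is noiseless and the augmented matrix has full column rank, the solve recovers `x_true` and `beta_true` up to floating-point error; with noisy observations it would return the least-squares fit instead.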

@example~

Dictionary learning

Matrix completion

PCA

Data separation

Multiclass classification



