Uncertainty in Deep Learning MT25, Introduction


Introduction

Modern machine learning systems are now sufficiently capable that they make decisions in high-impact domains, like:

  • Medicine: automated decision making or recommendation systems
  • Automotive: autonomous control of drones and self-driving cars
  • High-frequency trading: ability to affect economic markets at a global scale

but errors in these domains can cause real harm. For this reason, we need models that know when they don’t know: models that generate not only accurate predictions, but also quantified uncertainty.

Sources of uncertainty

There are many different places uncertainty appears in machine learning contexts:

  • Model uncertainty:
    • A model is defined by its parameters
    • There are a large number of possible models that can explain a dataset
    • Uncertain which model parameters to choose to predict with
    • A type of epistemic uncertainty
  • Distributional uncertainty
    • There might be a distribution shift between the training inputs and the test inputs
    • We should know when a model is extrapolating outside of the distribution it was trained on
    • Another type of epistemic uncertainty

Both of these types of uncertainty are reducible with more data. But there is also a third:

  • Data uncertainty
    • We have measurement noise, label noise, or disagreement among annotators
    • A type of aleatoric uncertainty
    • Even with an infinite amount of data, predictions will still be probabilistic

Aleatoric uncertainty is not reducible with more data.
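This irreducibility can be seen in a minimal sketch (the linear function, noise level, and sample size below are made up for illustration): even when we recover the true underlying function exactly, which is the best any model can do with unlimited data, the residual variance equals the label-noise variance rather than zero.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic regression task with irreducible label noise:
# y = 2x + noise, with noise ~ N(0, 0.5^2).
n = 1_000_000  # a stand-in for "infinite" data
x = rng.uniform(-1.0, 1.0, n)
y = 2.0 * x + rng.normal(0.0, 0.5, n)

# Predict with the *true* function f(x) = 2x. Even this
# optimal predictor leaves residual variance close to the
# noise variance (0.25) — aleatoric uncertainty remains.
residuals = y - 2.0 * x
print(residuals.var())  # close to 0.25, not 0
```

Collecting more samples shrinks our uncertainty about the function, but the spread of the residuals stays put: that spread is a property of the data-generating process itself.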

Standard machine learning techniques fail to capture these types of uncertainty: while a classifier can output a softmax vector, this reflects only relative probabilities between classes, not a measure of the network’s epistemic uncertainty. Neural networks can’t say “I don’t know”.
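A minimal sketch of this failure mode (the logits below are hypothetical values a trained network might emit for an input far from its training distribution): nothing in the softmax prevents one logit from dominating, so the output can be near one-hot, i.e. highly "confident", on data the network has never seen anything like.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Hypothetical logits for a far-out-of-distribution input.
ood_logits = np.array([8.0, 0.5, -1.0])
p = softmax(ood_logits)
print(p)        # near one-hot: max probability > 0.99
print(p.max())  # high "confidence", but no epistemic meaning
```

The softmax values are a valid distribution over the classes, but they say nothing about whether the model's parameters are trustworthy for this input; that requires the epistemic machinery developed in this course.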

Course structure

Over the next few weeks, we will learn:

  1. A formal language of uncertainty, in the form of Bayesian probability theory
  2. Modelling tools to express uncertainty in ML, in the form of Bayesian probabilistic modelling
  3. Scalable inference techniques for real-world deep learning systems
  4. Techniques to build deep learning systems which convey uncertainty
