
Lecture - Theories of Deep Learning MT25, VII, Stochastic gradient descent and its extensions

Created: November 14, 2025 | Updated: November 15, 2025


  • Course - Theories of Deep Learning MT25

This lecture and the next (Lecture - Theories of Deep Learning MT25, VIII, Optimisation algorithms for training DNNs) are effectively a mini-speedrun of Course - Optimisation for Data Science HT25. In particular, this lecture covered results on the convergence of stochastic gradient descent and how to decrease the noise floor:

  • Notes - Optimisation for Data Science HT25, Stochastic gradient descent
  • Notes - Optimisation for Data Science HT25, Stochastic variance reduction methods
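
As a rough illustration of what "noise floor" means here, the sketch below runs constant-step SGD on a toy one-dimensional least-squares problem and compares it with SVRG, one standard variance reduction method. None of this is taken from the lecture: the problem, the step sizes, and the names `make_problem`, `sgd_floor` and `svrg` are all assumptions made for the example. The idea is that constant-step SGD stalls at an error level that shrinks with the step size, while a variance-reduced method should keep converging.

```python
import random


def make_problem(n=200, seed=0):
    # Toy data (an assumption for illustration): b_i = 2 * a_i + noise, so the
    # per-sample gradients do not all vanish at the minimiser.
    rng = random.Random(seed)
    a = [rng.gauss(0.0, 1.0) for _ in range(n)]
    b = [2.0 * ai + rng.gauss(0.0, 1.0) for ai in a]
    return a, b


def minimiser(a, b):
    # Closed-form minimiser of f(x) = (1/n) * sum_i 0.5 * (a_i * x - b_i)^2.
    return sum(ai * bi for ai, bi in zip(a, b)) / sum(ai * ai for ai in a)


def full_grad(x, a, b):
    # Gradient of f at x, averaged over all samples.
    return sum(ai * (ai * x - bi) for ai, bi in zip(a, b)) / len(a)


def sgd_floor(a, b, step, iters=20000, seed=1):
    # Constant-step SGD; returns the mean squared distance to the minimiser
    # over the second half of the run, as an estimate of the noise floor.
    rng = random.Random(seed)
    x_star = minimiser(a, b)
    x, tail = 0.0, []
    for k in range(iters):
        i = rng.randrange(len(a))
        x -= step * a[i] * (a[i] * x - b[i])
        if k >= iters // 2:
            tail.append((x - x_star) ** 2)
    return sum(tail) / len(tail)


def svrg(a, b, step, epochs=20, inner=400, seed=1):
    # SVRG: each epoch stores a snapshot and its full gradient, then corrects
    # every stochastic gradient using the same sample evaluated at the snapshot.
    rng = random.Random(seed)
    x = 0.0
    for _ in range(epochs):
        snapshot = x
        mu = full_grad(snapshot, a, b)
        for _ in range(inner):
            i = rng.randrange(len(a))
            g_x = a[i] * (a[i] * x - b[i])
            g_snap = a[i] * (a[i] * snapshot - b[i])
            x -= step * (g_x - g_snap + mu)
    return x


if __name__ == "__main__":
    a, b = make_problem()
    x_star = minimiser(a, b)
    print("SGD floor, step 0.1: ", sgd_floor(a, b, step=0.1))
    print("SGD floor, step 0.01:", sgd_floor(a, b, step=0.01))
    print("SVRG squared error:  ", (svrg(a, b, step=0.05) - x_star) ** 2)
```

Shrinking the step lowers the SGD floor but also slows the initial progress, which is the trade-off that variance reduction is meant to avoid. The corrected gradient `g_x - g_snap + mu` is still unbiased, and its variance shrinks as both the iterate and the snapshot approach the minimiser.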



© Copyright 2026 Olly Britton. Last updated: May 19, 2026.