# Lecture - Theories of Deep Learning MT25, VII, Stochastic gradient descent and its extensions

> Source: https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/lectures/extensions/ · Updated: 2025-11-15 · Tags: uni, lecture

- [Course - Theories of Deep Learning MT25](https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/)

This lecture and the next ([Lecture - Theories of Deep Learning MT25, VIII, Optimisation algorithms for training DNNs](https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/lectures/dnns/)) are effectively a mini-speedrun of [Course - Optimisation for Data Science HT25](https://ollybritton.com/notes/uni/part-b/ht25/optimisation-for-data-science/). In particular, this lecture covered results on the convergence of stochastic gradient descent and how to decrease its noise floor (the residual error that SGD with a fixed step size stalls at rather than converging all the way):

- [Notes - Optimisation for Data Science HT25, Stochastic gradient descent](https://ollybritton.com/notes/uni/part-b/ht25/optimisation-for-data-science/notes/stochastic-gradient-descent/)
- [Notes - Optimisation for Data Science HT25, Stochastic variance reduction methods](https://ollybritton.com/notes/uni/part-b/ht25/optimisation-for-data-science/notes/stochastic-variance-reduction-methods/)
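
As a rough illustration of what "noise floor" means, here is a minimal sketch on a made-up one-dimensional problem, $f(x) = \frac{1}{n} \sum_i \frac{1}{2} a_i (x - c_i)^2$. It is not taken from the lecture. The problem, the step sizes and the function names (`sgd_floor`, `svrg`) are all assumptions chosen for illustration, and SVRG is used as one representative variance reduction method. Constant-step SGD should stall at an error that shrinks with the step size, while the variance-reduced update should keep converging at the larger step size.

```python
import random

random.seed(0)

# Hypothetical toy problem: f(x) = (1/n) * sum_i 0.5 * a[i] * (x - c[i])**2
n = 50
a = [random.uniform(0.5, 1.5) for _ in range(n)]
c = [random.gauss(0.0, 1.0) for _ in range(n)]
x_star = sum(ai * ci for ai, ci in zip(a, c)) / sum(a)  # exact minimiser


def grad_i(x, i):
    """Gradient of the i-th component function at x."""
    return a[i] * (x - c[i])


def full_grad(x):
    """Gradient of the full objective f at x."""
    return sum(grad_i(x, i) for i in range(n)) / n


def sgd_floor(alpha, steps=20000):
    """Run constant-step SGD; return the mean squared error over the second half."""
    x, total = 0.0, 0.0
    for k in range(steps):
        x -= alpha * grad_i(x, random.randrange(n))
        if k >= steps // 2:
            total += (x - x_star) ** 2
    return total / (steps - steps // 2)


def svrg(alpha, epochs=30, inner=100):
    """Run SVRG, recomputing a full gradient at a snapshot once per epoch."""
    x = 0.0
    for _ in range(epochs):
        snapshot, mu = x, full_grad(x)
        for _ in range(inner):
            i = random.randrange(n)
            # Unbiased estimate whose variance shrinks as x and the
            # snapshot both approach x_star.
            x -= alpha * (grad_i(x, i) - grad_i(snapshot, i) + mu)
    return (x - x_star) ** 2


floor_large = sgd_floor(0.1)   # noise floor with the larger step size
floor_small = sgd_floor(0.01)  # smaller step size, lower floor
svrg_err = svrg(0.1)           # variance reduction at the larger step size

print(f"SGD  alpha=0.1  mean squared error: {floor_large:.2e}")
print(f"SGD  alpha=0.01 mean squared error: {floor_small:.2e}")
print(f"SVRG alpha=0.1  final squared error: {svrg_err:.2e}")
```

The trade-off this is meant to show: shrinking the step size lowers the floor but slows convergence, whereas the SVRG-style update pays for an occasional full gradient pass so that it can keep the larger step size.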

---
Olly Britton — https://ollybritton.com. Machine-readable index: https://ollybritton.com/llms.txt