# Lecture - Theories of Deep Learning MT25, VI, Controlling the variance of the Jacobian's spectrum

> Source: https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/lectures/spectrum/ · Updated: 2025-11-20 · Tags: uni, lecture

[Course - Theories of Deep Learning MT25](https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/)

- This lecture turns to the spectrum of the network's Jacobian at initialisation. The motivation is empirical: results show that the spectrum of the Jacobian strongly affects how easy the network is to train.
- Results from random matrix theory can be used to calculate the distribution of this spectrum.
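
To make the object of study concrete, here is a minimal sketch (not taken from the lecture) that builds the input-output Jacobian of a random fully connected network at initialisation and returns its singular values. It assumes a `tanh` activation, square layers, and Gaussian weights with variance `sigma_w**2 / width`; the function name `jacobian_singular_values` and all of its parameters are hypothetical choices made for illustration.

```python
import numpy as np

def jacobian_singular_values(depth=20, width=200, sigma_w=1.0, sigma_b=0.0, seed=0):
    """Singular values of the input-output Jacobian of a random tanh network.

    Assumed setup (illustrative only): every layer is width x width, weights
    are N(0, sigma_w^2 / width), biases are N(0, sigma_b^2).
    """
    rng = np.random.default_rng(seed)
    h = rng.standard_normal(width)   # random input vector
    J = np.eye(width)                # Jacobian of the identity map
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * sigma_w / np.sqrt(width)
        b = rng.standard_normal(width) * sigma_b
        h = np.tanh(W @ h + b)       # forward pass through one layer
        D = 1.0 - h ** 2             # tanh'(z) evaluated at the pre-activations
        J = (D[:, None] * W) @ J     # chain rule: J <- diag(D) W J
    return np.linalg.svd(J, compute_uv=False)

if __name__ == "__main__":
    for sigma_w in (0.5, 1.0, 2.0):
        s = jacobian_singular_values(sigma_w=sigma_w)
        print(f"sigma_w={sigma_w}: mean squared singular value = {np.mean(s ** 2):.3e}")
```

Plotting a histogram of the returned values for different `sigma_w` and `depth` gives an empirical picture of the spectrum whose distribution the random matrix theory results describe.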

### Papers mentioned
- [Paper - Exponential expressivity in deep neural networks through transient chaos (2016)](https://ollybritton.com/notes/uni/part-c/mt25/theories-of-deep-learning/reading/paper-exponential-expressivity-in-deep-neural-networks-through-transient-chaos-2016/)
- [redacted](https://ollybritton.com/404)
- [redacted](https://ollybritton.com/404)

### Further associated reading
- Identifying natural depth scales of information propagation: https://arxiv.org/pdf/1611.01232.pdf
- Further details on the role of activation functions: https://arxiv.org/pdf/1902.06853.pdf
- Principles for selecting activation functions: https://arxiv.org/pdf/2105.07741.pdf
- Early results on correlation of inputs (Chapter 2 in particular): https://www.cs.toronto.edu/~radford/ftp/thesis.pdf
- Rigorous treatment of the Gaussian Process perspective, infinite width: https://arxiv.org/pdf/1711.00165.pdf
- Rigorous treatment of the Gaussian Process perspective, finite width: https://arxiv.org/pdf/1804.11271.pdf
- Higher order terms and width proportional to depth scaling: https://arxiv.org/pdf/2106.10165.pdf
- Specifics for random ReLU nets:
	- https://arxiv.org/pdf/1801.03744.pdf
	- https://arxiv.org/pdf/1803.01719.pdf

---
Olly Britton — https://ollybritton.com. Machine-readable index: https://ollybritton.com/llms.txt
