
Reading

Created: November 18, 2025 | Updated: November 18, 2025 | About these notes


  • [[Article - Deep, deep trouble, Elad]]
  • [[Paper - ADADELTA, An Adaptive Learning Rate Method]]
  • [[Paper - Attention Is All You Need (2017)]]
  • [[Paper - Error bounds for approximations with deep ReLU networks, Yarotsky (2016)]]
  • [[Paper - Explaining and harnessing adversarial examples (2015)]]
  • [[Paper - Exponential expressivity in deep neural networks through transient chaos (2016)]]
  • [[Paper - Gradient-based learning applied to document recognition, LeCun]]
  • [[Paper - Optimal nonlinear approximation, DeVore (1989)]]
  • [[Paper - Representation Benefits of Deep Feedforward Networks, Telgarsky (2015)]]
  • [[Paper - Why and when can deep networks avoid the curse of dimensionality, Poggio (2016)]]
© Copyright 2026 Olly Britton. Last updated: May 16, 2026.