Lecture - Theories of Deep Learning MT25, XV, A few things we missed and a summary


  • [[Course - Theories of Deep Learning MT25]]

  • Dropout (a minimal sketch follows this list)
  • Skip connections (see the residual-block sketch below)
  • Tokenisation (toy example below)
    • The same text is probably tokenised differently in a chat context vs a coding context
  • How sparse can you make your networks before losing a substantial amount of accuracy? (A magnitude-pruning probe is sketched below.)
  • Major omissions
    • How well does depth improve generalisation error? (A small empirical probe is sketched below.)
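
A minimal sketch of dropout at training time, written in plain PyTorch (assumed available). The keep probability and the inverted-scaling convention follow the standard formulation, not anything specific to the lecture.

```python
import torch

def dropout(x: torch.Tensor, p: float = 0.5, training: bool = True) -> torch.Tensor:
    """Inverted dropout: zero each activation with probability p and rescale
    the survivors by 1/(1-p) so the expected activation is unchanged."""
    if not training or p == 0.0:
        return x
    mask = (torch.rand_like(x) > p).float()  # keep each unit with prob 1-p
    return x * mask / (1.0 - p)

# In practice you would just use the built-in layer: torch.nn.Dropout(p=0.5)
x = torch.randn(4, 8)
print(dropout(x, p=0.5))
```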
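
A residual (skip-connection) block sketched in PyTorch. The layer width and the two-layer MLP body are illustrative assumptions; the `x + f(x)` pattern is the standard one.

```python
import torch
import torch.nn as nn

class ResidualBlock(nn.Module):
    """y = x + f(x): the identity path lets gradients flow past the block."""
    def __init__(self, dim: int = 64):
        super().__init__()
        self.body = nn.Sequential(
            nn.Linear(dim, dim),
            nn.ReLU(),
            nn.Linear(dim, dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.body(x)  # skip connection around the block

x = torch.randn(2, 64)
print(ResidualBlock()(x).shape)  # torch.Size([2, 64])
```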
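
A toy illustration of the chat-vs-code point: a greedy longest-match tokeniser run against two hypothetical vocabularies, one biased towards conversational fragments and one towards code fragments. The vocabularies and the greedy matching are assumptions for illustration, not how any particular production tokeniser works.

```python
def tokenise(text: str, vocab: set[str]) -> list[str]:
    """Greedy longest-match tokenisation with single-character fallback."""
    tokens, i = [], 0
    while i < len(text):
        match = None
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in vocab:
                match = text[i:j]
                break
        match = match or text[i]  # unknown characters become single tokens
        tokens.append(match)
        i += len(match)
    return tokens

# Hypothetical vocabularies biased towards different training corpora.
chat_vocab = {"def", "ine", " the", " result", "(", ")", ":"}
code_vocab = {"define", "def ", "result", "():", " "}

src = "define the result():"
print(tokenise(src, chat_vocab))  # splits along word-like fragments
print(tokenise(src, code_vocab))  # keeps code-like fragments together
```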
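
A rough way to probe the sparsity question: magnitude-prune a weight matrix at increasing sparsity levels and see how much the layer's output changes. The random weights and the output-distortion proxy are stand-ins for a trained network and a real accuracy measurement.

```python
import torch

torch.manual_seed(0)
W = torch.randn(256, 256)   # stand-in for a trained weight matrix
x = torch.randn(128, 256)   # stand-in for a batch of activations
dense_out = x @ W.T

for sparsity in (0.5, 0.8, 0.9, 0.95, 0.99):
    # Magnitude pruning: zero out the smallest-|w| fraction of the weights.
    k = int(sparsity * W.numel())
    threshold = W.abs().flatten().kthvalue(k).values
    W_sparse = torch.where(W.abs() > threshold, W, torch.zeros_like(W))
    pruned_out = x @ W_sparse.T
    rel_err = (pruned_out - dense_out).norm() / dense_out.norm()
    print(f"sparsity={sparsity:.2f}  relative output error={rel_err:.3f}")
```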
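
The depth question is open in general, but a small empirical probe is easy to sketch: train MLPs of increasing depth on the same data and compare the train/test gap. Everything here (the make_moons dataset, the layer width, scikit-learn's MLPClassifier) is an illustrative assumption, not a claim about what the theory says.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

X, y = make_moons(n_samples=2000, noise=0.3, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=0)

for depth in (1, 2, 4, 8):
    clf = MLPClassifier(hidden_layer_sizes=(32,) * depth,
                        max_iter=2000, random_state=0)
    clf.fit(X_tr, y_tr)
    gap = clf.score(X_tr, y_tr) - clf.score(X_te, y_te)
    print(f"depth={depth}  train-test accuracy gap={gap:.3f}")
```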


