Lecture - Theories of Deep Learning MT25, XV, A few things we missed and a summary


  • Course - Theories of Deep Learning MT25

  • Dropout

  • Skip connections

  • Tokenisation

    • Tokens are probably different in a chat context vs a coding context

  • How sparse can you make your nets before losing loads of accuracy?

  • Major omissions

    • How much does depth reduce generalisation error?
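
As a reminder of what the first two topics in the list above look like in practice, here is a minimal sketch in plain Python. It is not taken from the lecture: the function names `dropout` and `residual` are hypothetical, and the "inverted dropout" scaling (dividing the kept entries by 1 - p) is assumed as the variant, since the notes do not say which one was covered.

```python
import random


def dropout(x, p=0.5, training=True, rng=None):
    """Inverted dropout (assumed variant): zero each entry with probability p
    and scale the survivors by 1 / (1 - p) so the expected value is unchanged."""
    if not 0.0 <= p < 1.0:
        raise ValueError("p must be in [0, 1)")
    if not training or p == 0.0:
        # At test time dropout is the identity.
        return list(x)
    rng = rng or random.Random()
    keep = 1.0 - p
    return [xi / keep if rng.random() < keep else 0.0 for xi in x]


def residual(f, x):
    """Skip connection: return x + f(x), where f maps a list to a list
    of the same length."""
    return [xi + fi for xi, fi in zip(x, f(x))]
```

For example, `residual(lambda v: dropout(v, p=0.1), x)` applies dropout on the branch only, so the input `x` always passes through the skip path unchanged.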