Paper - Dynamics of Transient Structure in In-Context Linear Regression Transformers


Summary

  • in-context linear regression
    • given a task vector $t \in \mathbb R^D$, sample a fixed-length sequence of pairs $(x _ i, y _ i)$, where each $x _ i$ is normally distributed and $y _ i \sim \mathcal N(t^\top x _ i, \sigma^2)$
    • vary the “task diversity” by picking different
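
The sampling step for a single task can be sketched as follows. This is only an illustrative sketch, not the paper's code: it assumes each $x _ i$ is drawn from a standard normal $\mathcal N(0, I _ D)$ (the notes only say "normally distributed"), and the names `sample_prompt`, `num_pairs` and `noise_std`, as well as the way the example task vector is drawn, are hypothetical choices made for the example.

```python
import random


def sample_prompt(task, num_pairs, noise_std, rng):
    """Sample a fixed-length sequence of (x, y) pairs for one task vector t."""
    dim = len(task)
    pairs = []
    for _ in range(num_pairs):
        # Assumption: x_i ~ N(0, I_D); the notes only say "normally distributed".
        x = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        # Mean of y_i is the inner product t^T x_i.
        mean = sum(t_j * x_j for t_j, x_j in zip(task, x))
        # y_i ~ N(t^T x_i, sigma^2), with sigma = noise_std.
        y = rng.gauss(mean, noise_std)
        pairs.append((x, y))
    return pairs


rng = random.Random(0)
D, K, sigma = 4, 8, 0.5
# Assumption: the example task vector is itself drawn from a standard normal.
task = [rng.gauss(0.0, 1.0) for _ in range(D)]
prompt = sample_prompt(task, K, sigma, rng)
```

Setting `noise_std` to zero makes every $y _ i$ equal $t^\top x _ i$ exactly, which is a convenient sanity check on the sampler.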

Flashcards




Related posts