Article - Do reasoning models use their scratchpad like we do? (2025)
- Full title: Do reasoning models use their scratchpad like we do? Evidence from distilling paraphrases
- Author(s): Fabien Roger
- Year: 2025
- Link: https://alignment.anthropic.com/2025/distill-paraphrases/
- Relevant for:
Summary
- The worry is that reasoning models might gain performance through encoded reasoning (chain-of-thought steganography), i.e. information hidden in the exact wording of the scratchpad rather than in its visible content
- Distilling paraphrased reasoning traces into a non-reasoning model recovers roughly the same pass@1 accuracy as the original reasoning model, indicating that the model does not use encoded reasoning to squeeze extra performance out of its scratchpad (see the sketch below)
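- For concreteness, a minimal sketch of the recipe: paraphrase each reasoning trace (destroying any token-level encoding while keeping the visible reasoning), fine-tune a non-reasoning base model on the paraphrased traces, then compare pass@1. `call_model` and `finetune` below are hypothetical stand-ins for whatever inference/training stack is used, not the paper's actual code; the answer check is also a crude placeholder.

```python
from dataclasses import dataclass

@dataclass
class Example:
    question: str
    scratchpad: str  # reasoning trace (chain of thought)
    answer: str

def call_model(model: str, prompt: str) -> str:
    """Hypothetical inference call; replace with a real client."""
    raise NotImplementedError

def finetune(base_model: str, dataset: list[dict]) -> str:
    """Hypothetical fine-tuning call; returns the new model's id."""
    raise NotImplementedError

def paraphrase_traces(examples: list[Example], paraphraser: str) -> list[Example]:
    """Rewrite each scratchpad in different words, preserving the visible
    reasoning but destroying any encoding carried by exact wording."""
    return [
        Example(
            ex.question,
            call_model(
                paraphraser,
                "Paraphrase the following reasoning, keeping its meaning "
                f"but changing the wording:\n\n{ex.scratchpad}",
            ),
            ex.answer,
        )
        for ex in examples
    ]

def distill(base_model: str, examples: list[Example]) -> str:
    """Fine-tune a non-reasoning model to produce the (paraphrased)
    scratchpad followed by the final answer."""
    dataset = [
        {"prompt": ex.question,
         "completion": f"{ex.scratchpad}\n\nAnswer: {ex.answer}"}
        for ex in examples
    ]
    return finetune(base_model, dataset)

def pass_at_1(model: str, problems: list[Example]) -> float:
    """Fraction of problems whose single sample is correct
    (crude string-match grading for illustration only)."""
    correct = sum(
        call_model(model, p.question).strip().endswith(p.answer)
        for p in problems
    )
    return correct / len(problems)

# If pass_at_1(distilled_on_paraphrases) ~= pass_at_1(reasoning_model),
# little information was carried by the exact wording of the traces.
```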