Score: 1

Rethinking Reflection in Pre-Training

Published: April 5, 2025 | arXiv ID: 2504.04022v1

By: Essential AI, Darsh J Shah, and more

Potential Business Impact:

Computers learn to fix their own mistakes.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

A language model's ability to reflect on its own reasoning provides a key advantage for solving complex problems. While most recent research has focused on how this ability develops during reinforcement learning, we show that it actually begins to emerge much earlier, during the model's pre-training. To study this, we introduce deliberate errors into chains-of-thought and test whether the model can still arrive at the correct answer by recognizing and correcting these mistakes. By tracking performance across different stages of pre-training, we observe that this self-correcting ability appears early and improves steadily over time. For instance, an OLMo2-7B model pre-trained on 4 trillion tokens displays self-correction on our six self-reflection tasks.
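The probe described above can be sketched in a few lines: plant an error in a chain-of-thought, hand the flawed reasoning back to the model, and check whether its continuation still reaches the correct answer. The sketch below is an illustrative assumption about how such a check might look, not the authors' released code; the example question, the planted arithmetic mistake, the "Wait," trigger, and the `generate_fn` interface are all hypothetical.

```python
# Minimal sketch of an adversarial chain-of-thought probe, assuming a
# text-in/text-out generate function. Names and prompts are illustrative.
from typing import Callable


def build_adversarial_prompt(question: str, flawed_cot: str, trigger: str = "Wait,") -> str:
    """Concatenate the question, a chain-of-thought containing a planted
    mistake, and a short trigger that invites the model to keep reasoning."""
    return f"{question}\n{flawed_cot}\n{trigger}"


def shows_self_correction(
    generate_fn: Callable[[str], str],
    question: str,
    flawed_cot: str,
    correct_answer: str,
) -> bool:
    """Return True if the model's continuation reaches the correct answer
    despite the deliberate error in the provided chain-of-thought."""
    prompt = build_adversarial_prompt(question, flawed_cot)
    continuation = generate_fn(prompt)
    return correct_answer in continuation


if __name__ == "__main__":
    # Stand-in for a real language-model call (e.g. an OLMo2-7B checkpoint);
    # a canned string is returned here so the sketch runs without a model.
    def dummy_generate(prompt: str) -> str:
        return " the earlier step added 17 + 25 incorrectly; 17 + 25 = 42, so the answer is 42."

    question = "Q: A shelf holds 17 red books and 25 blue books. How many books in total?"
    flawed_cot = "A: 17 + 25 = 32, so there are 32 books."  # deliberate arithmetic error

    print(shows_self_correction(dummy_generate, question, flawed_cot, correct_answer="42"))
```

Aggregating this binary check over many such items, at successive pre-training checkpoints, yields the kind of self-correction curve the abstract describes.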


Page Count
38 pages

Category
Computer Science:
Computation and Language