
Certainty-Guided Reasoning in Large Language Models: A Dynamic Thinking Budget Approach

Published: September 9, 2025 | arXiv ID: 2509.07820v1

By: João Paulo Nogueira, Wentao Sun, Alonso Silva and more

Potential Business Impact:

Makes smart computer thinking more accurate and faster.

Business Areas:

Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The rise of large reasoning language models (LRLMs) has unlocked new potential for solving complex tasks. These models operate with a thinking budget, that is, a predefined number of reasoning tokens used to arrive at a solution. We propose a novel approach, inspired by the generator/discriminator framework in generative adversarial networks, in which a critic model periodically probes its own reasoning to assess whether it has reached a confident conclusion. If not, reasoning continues until a target certainty threshold is met. This mechanism adaptively balances efficiency and reliability by allowing early termination when confidence is high, while encouraging further reasoning when uncertainty persists. Through experiments on the AIME2024 and AIME2025 datasets, we show that Certainty-Guided Reasoning (CGR) improves baseline accuracy while reducing token usage. Importantly, extended multi-seed evaluations over 64 runs demonstrate that CGR is stable, reducing variance across seeds and improving exam-like performance under penalty-based grading. Additionally, our token savings analysis shows that CGR can eliminate millions of tokens in aggregate, with tunable trade-offs between certainty thresholds and efficiency. Together, these findings highlight certainty as a powerful signal for reasoning sufficiency. By integrating confidence into the reasoning process, CGR makes large reasoning language models more adaptive, trustworthy, and resource efficient, paving the way for practical deployment in domains where both accuracy and computational cost matter.
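The probe-and-stop loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`certainty_guided_reasoning`, `generate_chunk`, `probe_answer`), the parameters (`probe_every`, `threshold`), and the way certainty is scored are all assumptions made for illustration. The budget here is counted in reasoning chunks rather than tokens to keep the example small.

```python
from typing import Callable, List, Tuple


def certainty_guided_reasoning(
    generate_chunk: Callable[[List[str]], str],
    probe_answer: Callable[[List[str]], Tuple[str, float]],
    max_budget: int,
    probe_every: int,
    threshold: float,
) -> Tuple[str, int]:
    """Reason chunk by chunk and stop early once certainty is high enough.

    All names are hypothetical. `generate_chunk` stands in for the model
    producing the next piece of reasoning; `probe_answer` stands in for
    the periodic probe that returns a candidate answer and a certainty
    score in [0, 1]. Returns the answer and the number of chunks used.
    """
    trace: List[str] = []
    answer = ""
    for used in range(1, max_budget + 1):
        trace.append(generate_chunk(trace))
        # Probe periodically, and always once the budget is exhausted.
        if used % probe_every == 0 or used == max_budget:
            answer, certainty = probe_answer(trace)
            if certainty >= threshold:
                return answer, used  # early termination: confident enough
    return answer, max_budget  # budget spent without reaching the threshold


# Toy stand-ins (assumptions): certainty grows with the length of the trace.
def toy_generate(trace: List[str]) -> str:
    return f"step {len(trace) + 1}"


def toy_probe(trace: List[str]) -> Tuple[str, float]:
    return "42", min(1.0, len(trace) / 5)


result = certainty_guided_reasoning(
    toy_generate, toy_probe, max_budget=10, probe_every=2, threshold=0.75
)
print(result)
```

Raising `threshold` in this sketch makes the loop run longer before stopping, and lowering it saves more of the budget, which mirrors the tunable trade-off between certainty and efficiency that the abstract mentions.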

Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression

Computation and Language

Stops smart computers from thinking too much.

7 Aug 2025 2

90%

Don't Think Twice! Over-Reasoning Impairs Confidence Calibration

Artificial Intelligence

Makes AI more honest about what it knows.

20 Aug 2025 1

89%

Learning to Trust the Crowd: A Multi-Model Consensus Reasoning Engine for Large Language Models

Artificial Intelligence

Makes AI answers more truthful and correct.

12 Jan 2026 0


Country of Origin

🇫🇷 France

Page Count

13 pages
