
Efficient Reasoning for Large Reasoning Language Models via Certainty-Guided Reflection Suppression

Published: August 7, 2025 | arXiv ID: 2508.05337v1

By: Jiameng Huang, Baijiong Lin, Guhao Feng and more

BigTech Affiliations: Huawei

Potential Business Impact:

Stops smart computers from thinking too much.

Recent Large Reasoning Language Models (LRLMs) employ long chain-of-thought reasoning with complex reflection behaviors, typically signaled by specific trigger words (e.g., "Wait" and "Alternatively"), to enhance performance. However, these reflection behaviors can lead to the overthinking problem, in which the model generates redundant reasoning steps that unnecessarily increase token usage, raise inference costs, and reduce practical utility. In this paper, we propose Certainty-Guided Reflection Suppression (CGRS), a novel method that mitigates overthinking in LRLMs while maintaining reasoning accuracy. CGRS operates by dynamically suppressing the model's generation of reflection triggers when it exhibits high confidence in its current response, thereby preventing redundant reflection cycles without compromising output quality. Our approach is model-agnostic, requires no retraining or architectural modifications, and can be integrated seamlessly with existing autoregressive generation pipelines. Extensive experiments across four reasoning benchmarks (i.e., AIME24, AMC23, MATH500, and GPQA-D) demonstrate CGRS's effectiveness: it reduces token usage by an average of 18.5% to 41.9% while preserving accuracy. It also achieves the best balance between length reduction and performance among the state-of-the-art baselines compared. These results hold consistently across model architectures (e.g., DeepSeek-R1-Distill series, QwQ-32B, and Qwen3 family) and scales (4B to 32B parameters), highlighting CGRS's practical value for efficient reasoning.
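The suppression step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how certainty is estimated or how strongly triggers are suppressed, so the `certainty` input, the `threshold` value, the hard masking to negative infinity, and the toy string-keyed vocabulary are all assumptions made for illustration.

```python
import math

# Trigger words named in the abstract. A real pipeline would map these to
# tokenizer ids; plain strings are used here as a toy vocabulary (assumption).
REFLECTION_TRIGGERS = {"Wait", "Alternatively"}


def suppress_reflection(logits, certainty, threshold=0.9,
                        triggers=REFLECTION_TRIGGERS):
    """Return a copy of `logits` with trigger tokens masked when certainty is high.

    logits:    dict mapping token string -> raw logit (hypothetical toy vocabulary).
    certainty: float in [0, 1]; how it is estimated is left to the caller
               and is not specified by the abstract.
    threshold: hypothetical cutoff, not a value taken from the paper.
    """
    if certainty < threshold:
        # Low certainty: leave the distribution untouched so the model can reflect.
        return dict(logits)
    # High certainty: make reflection triggers impossible to select.
    return {tok: (-math.inf if tok in triggers else val)
            for tok, val in logits.items()}


def greedy_pick(logits):
    """Pick the highest-logit token (greedy decoding)."""
    return max(logits, key=logits.get)


toy_logits = {"Wait": 2.0, "Therefore": 1.5, "Alternatively": 1.0}
low = greedy_pick(suppress_reflection(toy_logits, certainty=0.4))
high = greedy_pick(suppress_reflection(toy_logits, certainty=0.95))
print(low, high)
```

With low certainty the trigger "Wait" is still chosen, while with high certainty decoding falls through to the next non-trigger token. Because the function only edits next-token logits, a step like this could sit inside an existing decoding loop without retraining, which is consistent with the model-agnostic claim in the abstract.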

Certainty-Guided Reasoning in Large Language Models: A Dynamic Thinking Budget Approach

Artificial Intelligence

Makes smart computer thinking more accurate and faster.

9 Sep 2025

90%

Multi-chain Graph Refinement and Selection for Reliable Reasoning in Large Language Models

Computation and Language

Helps computers solve harder problems faster.

28 Nov 2025

90%

Reflective Confidence: Correcting Reasoning Flaws via Online Self-Correction

Artificial Intelligence

Helps AI fix its own thinking mistakes.

21 Dec 2025


Country of Origin

🇨🇳 China

Page Count

9 pages
