Less is More: Improving LLM Reasoning with Minimal Test-Time Intervention
By: Zhen Yang, Mingyang Zhang, Feng Chen, and more
Potential Business Impact:
Makes AI smarter and faster at solving problems.
Recent progress in large language models (LLMs) has focused on test-time scaling, improving reasoning by increasing inference computation, but often at the cost of efficiency. We revisit test-time behavior and uncover a simple yet underexplored phenomenon: reasoning uncertainty is highly localized, with only a small subset of high-entropy tokens dominantly affecting output correctness. Motivated by this, we propose Minimal Test-Time Intervention (MTI), a training-free framework that enhances reasoning accuracy and stability with minimal overhead. MTI includes: (i) selective CFG intervention, which applies classifier-free guidance only at uncertain positions; and (ii) lightweight negative-prompt guidance, which reuses the main model's KV cache to approximate unconditional decoding efficiently. MTI yields consistent gains across general, coding, and STEM tasks (e.g., a +1.35% average improvement across eight benchmarks for Qwen3-8B-Base and +5% on AIME2024 with Qwen3-32B-Reasoning) while remaining highly efficient.
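The selective intervention described above can be sketched as an entropy-gated guidance step at decoding time. The snippet below is a minimal illustration, not the paper's implementation: the threshold `tau` and guidance scale `gamma` are hypothetical parameters, and the standard classifier-free guidance combination is assumed for the guided branch.

```python
import numpy as np

def softmax(logits: np.ndarray) -> np.ndarray:
    """Numerically stable softmax over a 1-D logit vector."""
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def entropy(probs: np.ndarray) -> float:
    """Shannon entropy (nats) of a probability vector."""
    p = probs[probs > 0]
    return float(-(p * np.log(p)).sum())

def selective_cfg_logits(cond_logits: np.ndarray,
                         uncond_logits: np.ndarray,
                         gamma: float = 1.5,
                         tau: float = 1.0) -> np.ndarray:
    """Apply classifier-free guidance only at uncertain positions.

    If the conditional next-token distribution is confident
    (entropy <= tau), the conditional logits pass through untouched;
    otherwise the standard CFG combination is applied.
    `gamma` and `tau` are illustrative values, not from the paper.
    """
    h = entropy(softmax(cond_logits))
    if h <= tau:
        return cond_logits
    # Standard CFG: uncond + gamma * (cond - uncond)
    return uncond_logits + gamma * (cond_logits - uncond_logits)
```

In this sketch, a sharply peaked distribution (low entropy) skips guidance entirely, so the extra cost is paid only at the small subset of high-entropy positions the abstract identifies as dominating correctness.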
Similar Papers
Test-time Prompt Intervention
Artificial Intelligence
Makes AI think faster and smarter with fewer mistakes.
Reasoning-Intensive Regression
Computation and Language
Helps computers find hidden numbers in text.
Less Is More for Multi-Step Logical Reasoning of LLM Generalisation Under Rule Removal, Paraphrasing, and Compression
Artificial Intelligence
Computers struggle with missing or wrong logic.