BudgetThinker: Empowering Budget-aware LLM Reasoning with Control Tokens
By: Hao Wen, Xinrui Wu, Yi Sun, and more
Potential Business Impact:
Lets smart computer programs think faster and cheaper.
Recent advances in Large Language Models (LLMs) leverage increased test-time computation to enhance reasoning capabilities. While effective, this strategy incurs significant latency and resource costs, limiting its applicability in time-constrained or cost-sensitive real-world scenarios. This paper introduces BudgetThinker, a novel framework that empowers LLMs with budget-aware reasoning, enabling precise control over the length of their thought processes. We propose a methodology that periodically inserts special control tokens during inference to continuously inform the model of its remaining token budget. This approach is coupled with a two-stage training pipeline: Supervised Fine-Tuning (SFT) first familiarizes the model with budget constraints, and a subsequent curriculum-based Reinforcement Learning (RL) phase uses a length-aware reward function to optimize for both accuracy and budget adherence. We demonstrate that BudgetThinker significantly outperforms strong baselines at maintaining accuracy across a range of reasoning budgets on challenging mathematical benchmarks. Our method provides a scalable and effective path to efficient, controllable LLM reasoning, making advanced models more practical for deployment in resource-constrained and real-time environments.
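The decoding-time mechanism described above is easy to prototype. Below is a minimal sketch of budget-aware decoding under stated assumptions: the control-token format `<|budget:N|>`, the insertion interval, and the `step_fn` decoding interface are hypothetical stand-ins for illustration, since the abstract does not specify the paper's actual token format or API.

```python
# Sketch of budget-aware decoding with periodic control tokens.
# Assumptions (not from the paper's released code): the "<|budget:N|>"
# token format, the insertion interval, and the step_fn interface.

from typing import Callable, List


def budget_aware_decode(
    step_fn: Callable[[List[str]], str],  # maps current tokens -> next token
    prompt_tokens: List[str],
    budget: int,
    interval: int = 64,
    eos_token: str = "<|endofthink|>",
) -> List[str]:
    """Decode up to `budget` tokens, periodically inserting a control
    token that tells the model how many tokens of budget remain."""
    tokens = list(prompt_tokens)
    # Announce the total budget up front so the model can plan its reasoning.
    tokens.append(f"<|budget:{budget}|>")
    generated = 0
    while generated < budget:
        # Every `interval` steps, remind the model of the remaining budget.
        if generated > 0 and generated % interval == 0:
            tokens.append(f"<|budget:{budget - generated}|>")
        next_token = step_fn(tokens)
        tokens.append(next_token)
        generated += 1
        if next_token == eos_token:
            break  # the model finished reasoning within its budget
    return tokens


if __name__ == "__main__":
    # Stub model: emits filler tokens, then stops once the sequence is long.
    def stub_step(tokens: List[str]) -> str:
        return "step" if len(tokens) < 20 else "<|endofthink|>"

    out = budget_aware_decode(stub_step, ["Solve:", "2+2"], budget=32, interval=8)
    print(" ".join(out))
```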
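The RL phase is described only as using a length-aware reward that balances accuracy against budget adherence. A minimal sketch of one such reward follows, assuming a simple additive form with an illustrative penalty weight `alpha`; the paper's actual reward function may differ.

```python
# Sketch of a length-aware reward: accuracy minus a penalty for
# deviating from the token budget. The additive form and the weight
# `alpha` are illustrative assumptions, not the paper's published reward.


def length_aware_reward(
    correct: bool,
    used_tokens: int,
    budget: int,
    alpha: float = 0.5,
) -> float:
    """Reward correct answers, but penalize missing the token budget."""
    accuracy_reward = 1.0 if correct else 0.0
    # Normalized deviation from the budget; 0 when exactly on budget.
    deviation = abs(used_tokens - budget) / budget
    return accuracy_reward - alpha * min(deviation, 1.0)


if __name__ == "__main__":
    print(length_aware_reward(correct=True, used_tokens=900, budget=1000))   # near budget: high reward
    print(length_aware_reward(correct=True, used_tokens=2000, budget=1000))  # overshoot: penalized
```

In a curriculum-based setup, training would start with loose budgets and progressively tighten them, so the model learns adherence without sacrificing accuracy.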
Similar Papers
Boosting Accuracy and Efficiency of Budget Forcing in LLMs via Reinforcement Learning for Mathematical Reasoning
Artificial Intelligence
Makes AI better at math while using fewer words.
Steering LLM Thinking with Budget Guidance
Computation and Language
Makes smart computers think faster and cheaper.
SelfBudgeter: Adaptive Token Allocation for Efficient LLM Reasoning
Artificial Intelligence
Makes smart computers think less on easy problems.