ROI-Reasoning: Rational Optimization for Inference via Pre-Computation Meta-Cognition
By: Muyang Zhao, Qi Qi, Hao Sun
Potential Business Impact:
Helps AI guess how hard a problem is.
Large language models (LLMs) can achieve strong reasoning performance with sufficient computation, but they do not inherently know how much computation a task requires. We study budgeted inference-time reasoning for multiple tasks under a strict global token constraint and formalize it as an Ordered Stochastic Multiple-Choice Knapsack Problem (OS-MCKP). This perspective highlights a meta-cognitive requirement -- anticipating task difficulty, estimating return on investment (ROI), and allocating computation strategically. We propose ROI-Reasoning, a two-stage framework that endows LLMs with intrinsic, budget-aware rationality. In the first stage, Meta-Cognitive Fine-Tuning teaches models to predict reasoning cost and expected utility before generation, enabling explicit solve-or-skip decisions. In the second stage, Rationality-Aware Reinforcement Learning optimizes sequential decision making under a hard token budget, allowing models to learn long-horizon allocation strategies. Across budgeted mathematical reasoning benchmarks, ROI-Reasoning consistently improves overall score while substantially reducing regret under tight computation budgets.
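To make the solve-or-skip idea concrete, here is a minimal sketch of a fixed-threshold allocation rule over an ordered list of tasks under a global token budget. It is not the method from the paper, which learns its policy through fine-tuning and reinforcement learning. The names `Task`, `solve_or_skip`, and `min_roi`, and all the numbers, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    predicted_cost: int        # assumed: predicted reasoning tokens for this task
    predicted_utility: float   # assumed: expected score if the task is attempted


def solve_or_skip(tasks, budget, min_roi):
    """Visit tasks in order; attempt a task only if its predicted cost fits
    the remaining budget and its predicted ROI (utility per token) meets
    the threshold. Returns the decisions and the unspent budget."""
    remaining = budget
    decisions = []
    for task in tasks:
        if task.predicted_cost > 0:
            roi = task.predicted_utility / task.predicted_cost
        else:
            roi = float("inf")
        if task.predicted_cost <= remaining and roi >= min_roi:
            decisions.append((task.name, "solve"))
            remaining -= task.predicted_cost
        else:
            decisions.append((task.name, "skip"))
    return decisions, remaining


# Hypothetical predictions for three tasks and a 1000-token global budget.
tasks = [
    Task("easy", predicted_cost=200, predicted_utility=0.9),
    Task("hard", predicted_cost=1500, predicted_utility=0.3),
    Task("medium", predicted_cost=600, predicted_utility=0.6),
]
decisions, remaining = solve_or_skip(tasks, budget=1000, min_roi=0.0005)
print(decisions, remaining)
```

In this toy run the "hard" task is skipped because its predicted cost exceeds what is left of the budget, which leaves enough tokens for the "medium" task. A fixed threshold like this is myopic; the long-horizon allocation described in the abstract is what the reinforcement learning stage is meant to learn.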
Similar Papers
Meta-Reasoner: Dynamic Guidance for Optimized Inference-time Reasoning in Large Language Models
Artificial Intelligence
Helps computers solve problems smarter and faster.
Meta-R1: Empowering Large Reasoning Models with Metacognition
Artificial Intelligence
Makes AI think smarter and more carefully.
Correct, Concise and Complete: Multi-stage Training For Adaptive Reasoning
Computation and Language
Makes AI think less to solve problems faster.