Optimizing Reasoning Efficiency through Prompt Difficulty Prediction
By: Bo Zhao, Berkcan Kapusuzoglu, Kartik Balasubramaniam, and more
Potential Business Impact:
Smarter AI uses less power to solve hard problems.
Reasoning language models perform well on complex tasks but are costly to deploy due to their size and long reasoning traces. We propose a routing approach that assigns each problem to the smallest model likely to solve it, reducing compute without sacrificing accuracy. Using intermediate representations from s1.1-32B, we train lightweight predictors of problem difficulty or model correctness to guide routing across a pool of reasoning models. On diverse math benchmarks, routing improves efficiency over random assignment and matches s1.1-32B's performance while using significantly less compute. Our results demonstrate that difficulty-aware routing is effective for cost-efficient deployment of reasoning models.
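To make the routing idea concrete, here is a minimal sketch, not the authors' code, of difficulty-aware routing: train a lightweight correctness predictor per model on prompt features, then send each prompt to the cheapest model predicted to solve it. The random feature vectors stand in for the paper's intermediate representations from s1.1-32B, the model names and relative costs are illustrative assumptions, the correctness labels are synthetic, and scikit-learn logistic regression stands in for whatever lightweight predictor the paper trains.

```python
# Hypothetical sketch of difficulty-aware routing across a pool of
# reasoning models. Features, labels, model names, and costs below are
# all placeholders, not the paper's actual data or setup.

import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy stand-in for hidden-state features of 1,000 prompts (512-dim).
X = rng.normal(size=(1000, 512))

# Assumed model pool, ordered from cheapest to most expensive,
# with made-up relative inference costs.
MODEL_POOL = ["small-1.5B", "medium-7B", "s1.1-32B"]
RELATIVE_COST = {"small-1.5B": 1.0, "medium-7B": 4.0, "s1.1-32B": 16.0}

# Synthetic correctness labels: larger models solve a larger share
# of problems, determined by a latent difficulty score.
difficulty = X @ rng.normal(size=512)
y = {m: (difficulty < np.quantile(difficulty, q)).astype(int)
     for m, q in zip(MODEL_POOL, [0.4, 0.7, 0.95])}

# Train one lightweight correctness predictor per model on the
# first 800 prompts; hold out the rest for routing.
predictors = {m: LogisticRegression(max_iter=1000).fit(X[:800], y[m][:800])
              for m in MODEL_POOL}

def route(features, threshold=0.5):
    """Return the cheapest model whose predicted solve probability
    clears `threshold`; fall back to the largest model otherwise."""
    for model in MODEL_POOL:
        p_solve = predictors[model].predict_proba(features.reshape(1, -1))[0, 1]
        if p_solve >= threshold:
            return model
    return MODEL_POOL[-1]

# Route held-out prompts and report the average relative compute cost.
choices = [route(x) for x in X[800:]]
avg_cost = np.mean([RELATIVE_COST[m] for m in choices])
print(f"avg relative cost: {avg_cost:.2f} (always using the 32B model would be 16.00)")
```

Scanning the pool from cheapest to most expensive and stopping at the first model whose predicted solve probability clears a threshold is just one simple routing policy; the paper's actual predictors, features, and model pool may differ.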
Similar Papers
Light-IF: Endowing LLMs with Generalizable Reasoning via Preview and Self-Checking for Complex Instruction Following
Computation and Language
Teaches computers to follow tricky instructions better.
Training Language Models to Reason Efficiently
Machine Learning (CS)
Makes smart computer programs think faster, cheaper.
Confidence-Guided Stepwise Model Routing for Cost-Efficient Reasoning
Computation and Language
Saves computer power by smartly choosing which AI to use.