Collaborative LLM Inference via Planning for Efficient Reasoning
By: Byeongchan Lee, Jonghoon Lee, Dongyoung Kim, and more
Potential Business Impact:
Lets free AI models solve hard problems together.
Large language models (LLMs) excel at complex reasoning tasks, but the most capable models (e.g., those with more than 100B parameters) are often accessible only through paid APIs, making them too costly for frequent use. In contrast, smaller open-source LLMs (e.g., those with fewer than 3B parameters) are freely available and easy to deploy locally (e.g., on a single GPU with 8 GB of VRAM), but lack sufficient reasoning ability. This trade-off raises a natural question: can small (free) and large (costly) models collaborate at test time to combine their strengths? We propose a test-time collaboration framework in which a planner model first generates a plan, defined as a distilled, high-level abstraction of the problem. This plan serves as a lightweight intermediate that guides a reasoner model, which produces a complete solution. Small and large models take turns acting as planner and reasoner, exchanging plans in a multi-round cascade to collaboratively solve complex tasks. Our method achieves accuracy comparable to strong proprietary models alone while significantly reducing reliance on paid inference. These results highlight planning as an effective prior for orchestrating cost-aware, cross-model inference under real-world deployment constraints.
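The planner/reasoner cascade can be sketched roughly as below. This is a minimal illustration under assumptions: the role schedule, prompts, round count, and function names (call_small_model, call_large_model, solve) are placeholders, not details taken from the paper.

```python
# Minimal sketch of a multi-round planner/reasoner cascade between a small
# (free, local) model and a large (costly, API-based) model. All names and
# prompts here are illustrative assumptions.

def call_small_model(prompt: str) -> str:
    """Placeholder for a locally deployed open-source LLM (<3B parameters)."""
    raise NotImplementedError  # e.g., a local inference call

def call_large_model(prompt: str) -> str:
    """Placeholder for a proprietary LLM (>100B parameters) behind a paid API."""
    raise NotImplementedError  # e.g., a paid API call

def solve(problem: str, rounds: int = 2) -> str:
    """Small and large models take turns as planner and reasoner.

    Each round: the planner distills the problem into a high-level plan,
    and the reasoner expands that plan into a complete solution. Only the
    plan is exchanged between models.
    """
    plan = ""
    solution = ""
    for r in range(rounds):
        # Assumed role schedule: the small model plans on even rounds and the
        # large model reasons, then roles swap on odd rounds.
        planner = call_small_model if r % 2 == 0 else call_large_model
        reasoner = call_large_model if r % 2 == 0 else call_small_model

        plan = planner(
            f"Problem:\n{problem}\n\n"
            f"Previous plan (may be empty):\n{plan}\n\n"
            "Write a concise, high-level plan for solving the problem."
        )
        solution = reasoner(
            f"Problem:\n{problem}\n\nPlan:\n{plan}\n\n"
            "Follow the plan and produce a complete solution."
        )
    return solution
```

In practice, the appeal of this pattern is that the expensive model is invoked only on the compact plan or the final reasoning step, so most tokens are generated by the free local model.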
Similar Papers
Idea2Plan: Exploring AI-Powered Research Planning
Computation and Language
Helps computers plan science experiments from ideas.
LightPlanner: Unleashing the Reasoning Capabilities of Lightweight Large Language Models in Task Planning
Robotics
Helps robots plan complex tasks better.
Adaptive Reasoning Executor: A Collaborative Agent System for Efficient Reasoning
Artificial Intelligence
Smarter AI answers questions faster, cheaper.