
ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration

Published: November 26, 2025 | arXiv ID: 2511.21689v1

By: Hongjin Su, Shizhe Diao, Ximing Lu and more

Potential Business Impact:

Smart computer programs solve hard problems cheaper.

Business Areas:

Artificial Intelligence, Data and Analytics, Science and Engineering, Software

Large language models are powerful generalists, yet solving deep and complex problems such as those of the Humanity's Last Exam (HLE) remains both conceptually challenging and computationally expensive. We show that small orchestrators managing other models and a variety of tools can both push the upper bound of intelligence and improve efficiency in solving difficult agentic tasks. We introduce ToolOrchestra, a method for training small orchestrators that coordinate intelligent tools. ToolOrchestra explicitly uses reinforcement learning with outcome-, efficiency-, and user-preference-aware rewards. Using ToolOrchestra, we produce Orchestrator, an 8B model that achieves higher accuracy at lower cost than previous tool-use agents while aligning with user preferences on which tools are to be used for a given query. On HLE, Orchestrator achieves a score of 37.1%, outperforming GPT-5 (35.1%) while being 2.5x more efficient. On tau2-Bench and FRAMES, Orchestrator surpasses GPT-5 by a wide margin while using only about 30% of the cost. Extensive analysis shows that Orchestrator achieves the best trade-off between performance and cost under multiple metrics, and generalizes robustly to unseen tools. These results demonstrate that composing diverse tools with a lightweight orchestration model is both more efficient and more effective than existing methods, paving the way for practical and scalable tool-augmented reasoning systems.
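The abstract names three reward signals (outcome, efficiency, and user preference) but does not give their form. As a rough illustration only, the sketch below combines them into one scalar for a single rollout. Everything in it is an assumption made for this example rather than the paper's formulation: the `Rollout` fields, the `combined_reward` function, the linear weighting, the weight values, and the tool names.

```python
from dataclasses import dataclass, field


@dataclass
class Rollout:
    """Hypothetical record of one orchestrator episode (not from the paper)."""
    correct: bool                 # outcome signal: did the final answer match?
    cost_usd: float               # efficiency signal: total spend on tool/model calls
    latency_s: float              # efficiency signal: wall-clock time
    tools_used: list = field(default_factory=list)


def combined_reward(rollout, preferred_tools,
                    w_outcome=1.0, w_cost=0.1, w_latency=0.01, w_pref=0.2):
    """Illustrative linear mix of the three reward terms; weights are made up."""
    outcome = 1.0 if rollout.correct else 0.0

    # Efficiency enters as a penalty that grows with cost and latency.
    efficiency_penalty = w_cost * rollout.cost_usd + w_latency * rollout.latency_s

    # Preference term: fraction of tool calls that hit the user's preferred tools.
    if rollout.tools_used:
        hits = sum(1 for t in rollout.tools_used if t in preferred_tools)
        preference = hits / len(rollout.tools_used)
    else:
        preference = 0.0

    return w_outcome * outcome - efficiency_penalty + w_pref * preference


# Two correct rollouts: one cheap and preference-aligned, one expensive and not.
preferred = {"search", "small_llm"}
cheap = Rollout(True, cost_usd=0.5, latency_s=10.0, tools_used=["search", "small_llm"])
pricey = Rollout(True, cost_usd=5.0, latency_s=60.0, tools_used=["big_llm"])

r_cheap = combined_reward(cheap, preferred)
r_pricey = combined_reward(pricey, preferred)
print(r_cheap > r_pricey)
```

Under a mix like this, two rollouts that both answer correctly are separated by cost, latency, and tool choice, which is the kind of pressure that could push an orchestrator toward cheaper tools when they suffice.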

Beyond Monoliths: Expert Orchestration for More Capable, Democratic, and Safe Large Language Models

Computers and Society

Lets many small AI models work together better.

28 May 2025 0

88%

Beyond Single LLMs: Enhanced Code Generation via Multi-Stage Performance-Guided LLM Orchestration

Software Engineering

Makes AI write better computer code, faster.

1 Oct 2025 2

88%

AgentOrchestra: A Hierarchical Multi-Agent Framework for General-Purpose Task Solving

Artificial Intelligence

Lets many computer helpers work together to solve problems.

14 Jun 2025 0


Repos / Data Links

huggingface.co

Page Count

21 pages
