O-Researcher: An Open-Ended Deep Research Model via Multi-Agent Distillation and Agentic RL
By: Yi Yao, He Zhu, Piaohong Wang, and more
Potential Business Impact:
Helps free, open-source AI close the performance gap with paid, proprietary AI.
The performance gap between closed-source and open-source large language models (LLMs) is largely attributed to disparities in access to high-quality training data. To bridge this gap, we introduce a novel framework for the automated synthesis of sophisticated, research-grade instructional data. Our approach centers on a multi-agent workflow in which collaborative AI agents simulate complex tool-integrated reasoning to generate diverse, high-fidelity data end-to-end. Leveraging this synthesized data, we develop a two-stage training strategy that integrates supervised fine-tuning with a novel reinforcement learning method, designed to maximize model alignment and capability. Extensive experiments demonstrate that our framework empowers open-source models across multiple scales, enabling them to achieve new state-of-the-art performance on major deep research benchmarks. This work provides a scalable and effective pathway for advancing open-source LLMs without relying on proprietary data or models.
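The abstract describes a pipeline of three phases: multi-agent data synthesis, supervised fine-tuning (SFT), and reinforcement learning (RL). The toy sketch below illustrates only that staged flow; all names (`ToyModel`, `synthesize_data`, the scalar "skill" update) are hypothetical stand-ins, since the abstract does not specify the actual data format, training objectives, or RL algorithm.

```python
# Toy sketch of the two-stage training pipeline over synthesized data.
# Everything here is illustrative: a scalar "skill" stands in for model
# capability, and the stage functions only record that each phase ran.
from dataclasses import dataclass, field

@dataclass
class ToyModel:
    skill: float = 0.0                      # stand-in for model capability
    log: list = field(default_factory=list) # records which stages ran

def synthesize_data(num_agents: int, tasks_per_agent: int) -> list[str]:
    """Phase 0: multi-agent workflow producing research-grade instruction
    data (stubbed here as simple identifier strings)."""
    return [f"agent{a}-task{t}"
            for a in range(num_agents)
            for t in range(tasks_per_agent)]

def supervised_fine_tune(model: ToyModel, data: list[str]) -> ToyModel:
    """Stage 1: SFT on the synthesized trajectories."""
    model.skill += 0.1 * len(data)
    model.log.append("sft")
    return model

def reinforcement_learn(model: ToyModel, episodes: int) -> ToyModel:
    """Stage 2: RL to further align and sharpen capability."""
    model.skill += 0.05 * episodes
    model.log.append("rl")
    return model

data = synthesize_data(num_agents=3, tasks_per_agent=4)  # 12 toy examples
model = reinforcement_learn(
    supervised_fine_tune(ToyModel(), data), episodes=20)
print(model.log)  # stages run in order: ['sft', 'rl']
```

The key structural point, mirrored from the abstract, is that RL runs on top of the SFT-initialized model rather than replacing it.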
Similar Papers
Deep Research: A Survey of Autonomous Research Agents
Information Retrieval
Helps AI find and use information from the internet.
Deep Research: A Systematic Survey
Computation and Language
Lets computers research and answer hard questions.
An Open and Reproducible Deep Research Agent for Long-Form Question Answering
Computation and Language
Helps computers answer hard questions by searching and thinking.