Score: 1

O-Researcher: An Open Ended Deep Research Model via Multi-Agent Distillation and Agentic RL

Published: January 7, 2026 | arXiv ID: 2601.03743v1

By: Yi Yao, He Zhu, Piaohong Wang, and more

Potential Business Impact:

Gives open-source (free) models a path to match or beat proprietary (paid) models on deep-research tasks, without relying on proprietary data.

Business Areas:
Natural Language Processing, Artificial Intelligence, Data and Analytics, Software

The performance gap between closed-source and open-source large language models (LLMs) is largely attributed to disparities in access to high-quality training data. To bridge this gap, we introduce a framework for automatically synthesizing sophisticated, research-grade instructional data. Our approach centers on a multi-agent workflow in which collaborative AI agents simulate complex tool-integrated reasoning to generate diverse, high-fidelity data end-to-end. Leveraging this synthesized data, we develop a two-stage training strategy that combines supervised fine-tuning with a novel reinforcement learning method designed to maximize model alignment and capability. Extensive experiments demonstrate that our framework empowers open-source models across multiple scales, enabling them to achieve new state-of-the-art performance on the major deep research benchmark. This work provides a scalable and effective pathway for advancing open-source LLMs without relying on proprietary data or models.
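To make the described pipeline concrete, the sketch below shows a minimal, hypothetical version of the first stage: cooperating agents (a planner, a tool-using researcher, and a verifier) synthesize tool-integrated reasoning traces that survive a quality gate and become supervised fine-tuning examples, ahead of the RL stage. All function names, agent roles, and filtering rules here are illustrative assumptions, not the paper's released code.

```python
"""Hypothetical sketch of multi-agent data synthesis feeding an SFT stage.

Every name and rule below is an assumption for illustration; the paper does
not publish this implementation.
"""

from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Trace:
    """One synthesized training example: a question plus a reasoning trace."""
    question: str
    steps: list[str] = field(default_factory=list)
    answer: str = ""


def planner(topic: str) -> str:
    # Hypothetical planner agent: turn a seed topic into a research-grade question.
    return f"What are the key open problems in {topic}, and how do current tools address them?"


def researcher(question: str, tools: dict[str, Callable[[str], str]]) -> Trace:
    # Hypothetical researcher agent: interleave tool calls and reasoning steps end-to-end.
    trace = Trace(question=question)
    for name, tool in tools.items():
        observation = tool(question)
        trace.steps.append(f"[call:{name}] {observation}")
    trace.answer = "Synthesized answer grounded in the tool observations above."
    return trace


def verifier(trace: Trace, min_steps: int = 2) -> bool:
    # Hypothetical quality gate: keep only traces with enough grounded steps and an answer.
    return len(trace.steps) >= min_steps and bool(trace.answer)


def synthesize_sft_dataset(topics: list[str],
                           tools: dict[str, Callable[[str], str]]) -> list[Trace]:
    """Stage-1 data synthesis: plan -> research -> verify, keeping what passes."""
    kept = []
    for topic in topics:
        trace = researcher(planner(topic), tools)
        if verifier(trace):
            kept.append(trace)
    return kept


if __name__ == "__main__":
    # Stand-in tools; a real system would call web search, code execution, etc.
    toy_tools = {
        "search": lambda q: f"top result snippet for: {q[:40]}...",
        "calculator": lambda q: "no numeric sub-task detected",
    }
    dataset = synthesize_sft_dataset(["agentic reinforcement learning"], toy_tools)
    print(f"kept {len(dataset)} SFT example(s); first question: {dataset[0].question}")
    # Stage 2 (not shown): supervised fine-tuning on `dataset`, then the RL phase on top.
```

The two-stage structure matters for the design: the SFT stage teaches the base model the format of tool-integrated research traces from the synthesized data, and the RL stage then optimizes behavior beyond what imitation alone provides.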

Page Count
22 pages

Category
Computer Science:
Computation and Language