Score: 2

Beyond the limitation of a single query: Train your LLM for query expansion with Reinforcement Learning

Published: October 11, 2025 | arXiv ID: 2510.10009v1

By: Shu Zhao, Tan Yu, Anbang Xu

BigTech Affiliations: NVIDIA

Potential Business Impact:

Helps computers answer harder questions by searching better.

Business Areas:

Semantic Search, Internet Services

Reasoning-augmented search agents, such as Search-R1, are trained to reason, search, and generate the final answer iteratively. Nevertheless, due to their limited capabilities in reasoning and search, their performance on multi-hop QA benchmarks remains far from satisfactory. To handle complex or compound queries, we train an LLM-based search agent with the native capability of query expansion through reinforcement learning. In each turn, our search agent proposes several query variants, which are searched simultaneously to cover more relevant information. Meanwhile, given limited post-training data and computing resources, it is very challenging for a search agent to master multiple tasks, including query generation, retrieved information understanding, and answer generation. Therefore, we propose incorporating a pre-trained squeezer model that helps the search agent understand the retrieved documents, allowing the search agent to focus on query generation for high retrieval recall. With the assistance of the squeezer model, we discover that even a small-scale 3B LLM can demonstrate a strong capability of query expansion and achieve state-of-the-art accuracy on the multi-hop QA benchmarks. To be specific, our experiments across seven question-answering benchmarks demonstrate that our method, named ExpandSearch, achieves an average improvement of 4.4% compared to state-of-the-art baselines, with strong gains on multi-hop reasoning tasks requiring diverse evidence aggregation.
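The per-turn loop described in the abstract (propose several query variants, search them simultaneously, condense what comes back) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and not the authors' implementation: the toy corpus, the word-overlap retriever `search`, the hard-coded `propose_queries` stub standing in for the RL-trained agent, and the `squeeze` placeholder standing in for the pre-trained squeezer model.

```python
from concurrent.futures import ThreadPoolExecutor

# Toy corpus standing in for a real search index (hypothetical data).
CORPUS = [
    "Marie Curie was born in Warsaw.",
    "Warsaw is the capital of Poland.",
    "Pierre Curie was born in Paris.",
    "Paris is the capital of France.",
]

QUESTION = "In which country was Marie Curie born?"


def tokenize(text):
    """Lowercase the text and split it into a set of punctuation-free words."""
    return {w.strip(".,?!").lower() for w in text.split()}


def search(query, k=2):
    """Stand-in retriever: rank documents by word overlap with the query."""
    q = tokenize(query)
    ranked = sorted(CORPUS, key=lambda d: len(q & tokenize(d)), reverse=True)
    # Keep the top-k documents that share at least one word with the query.
    return [d for d in ranked[:k] if q & tokenize(d)]


def propose_queries(question):
    """Hard-coded stub for the agent's query variants (assumption).

    In the paper these variants are generated by the trained LLM.
    """
    return [
        question,
        "Marie Curie birthplace city",
        "Warsaw capital of which country",
    ]


def expanded_search(queries, k=2):
    """Search all query variants concurrently and merge the results."""
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(lambda q: search(q, k), queries))
    merged = []
    for docs in result_lists:
        for doc in docs:
            if doc not in merged:  # drop duplicates, keep first-seen order
                merged.append(doc)
    return merged


def squeeze(docs):
    """Placeholder for the squeezer: here it only joins the documents.

    The paper's squeezer is a pre-trained model that digests retrieved text.
    """
    return " ".join(docs)


if __name__ == "__main__":
    print("single query:", search(QUESTION))
    print("expanded    :", expanded_search(propose_queries(QUESTION)))
    print("context     :", squeeze(expanded_search(propose_queries(QUESTION))))
```

In this toy setup the single query never retrieves the sentence linking Warsaw to Poland, while the merged results of the three variants do, which is the recall benefit the abstract attributes to query expansion.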

Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning

Computation and Language

Helps computers find better answers online.

12 Mar 2025 2

91%

R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning

Artificial Intelligence

Lets computers find answers on the internet.

7 Mar 2025 2

90%

Agent-R1: Training Powerful LLM Agents with End-to-End Reinforcement Learning

Computation and Language

Teaches AI to learn and solve problems better.

18 Nov 2025 1


Country of Origin

🇺🇸 United States

Page Count

19 pages
