Score: 2

Rank4Gen: RAG-Preference-Aligned Document Set Selection and Ranking

Published: January 16, 2026 | arXiv ID: 2601.11273v1

By: Yongqi Fan, Yuxiang Chu, Zhentao Xia and more

BigTech Affiliations: Tencent

Potential Business Impact:

Helps AI choose the best facts for better answers.

Business Areas:
Semantic Search, Internet Services

In the RAG paradigm, the information retrieval module provides context for generators by retrieving and ranking multiple documents to support the aggregation of evidence. However, existing ranking models are primarily optimized for query-document relevance, which often misaligns with generators' preferences for evidence selection and citation, limiting their impact on response quality. Moreover, most approaches do not account for preference differences across generators, resulting in unstable cross-generator performance. We propose Rank4Gen, a generator-aware ranker for RAG that targets the goal of Ranking for Generators. Rank4Gen introduces two key preference modeling strategies: (1) From Ranking Relevance to Response Quality, which optimizes ranking with respect to downstream response quality rather than query-document relevance; and (2) Generator-Specific Preference Modeling, which conditions a single ranker on different generators to capture their distinct ranking preferences. To enable such modeling, we construct PRISM, a dataset built from multiple open-source corpora and diverse downstream generators. Experiments on five challenging and recent RAG benchmarks demonstrate that Rank4Gen achieves strong and competitive performance for complex evidence composition in RAG.
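
The difference between relevance-based ranking and generator-conditioned ranking can be illustrated with a toy sketch. This is not the paper's method or code: the document IDs, generator names, score tables, and function names below are all hypothetical, and the numbers are made up purely to show how one ranker interface might return different orderings for different generators.

```python
from typing import Dict, List

# Hypothetical query-document relevance scores (illustrative values only).
RELEVANCE: Dict[str, float] = {"d1": 0.9, "d2": 0.7, "d3": 0.4}

# Hypothetical per-generator utility: how much each document is assumed to
# help a given generator produce a good response. Not taken from the paper.
UTILITY: Dict[str, Dict[str, float]] = {
    "gen_a": {"d1": 0.2, "d2": 0.8, "d3": 0.6},
    "gen_b": {"d1": 0.7, "d2": 0.1, "d3": 0.9},
}


def rank_by_relevance(doc_ids: List[str]) -> List[str]:
    """Conventional ranking: order documents by query-document relevance."""
    return sorted(doc_ids, key=lambda d: RELEVANCE[d], reverse=True)


def rank_for_generator(doc_ids: List[str], generator: str) -> List[str]:
    """Generator-conditioned ranking: order by the assumed utility of each
    document for the named generator instead of by relevance."""
    scores = UTILITY[generator]
    return sorted(doc_ids, key=lambda d: scores[d], reverse=True)


docs = ["d1", "d2", "d3"]
relevance_order = rank_by_relevance(docs)
order_a = rank_for_generator(docs, "gen_a")
order_b = rank_for_generator(docs, "gen_b")
```

In this toy setup the same candidate set yields a different ordering for each generator, and neither matches the relevance ordering, which is the kind of mismatch the abstract describes. A learned ranker would replace the lookup tables with a model that takes the generator identity as an input.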

Country of Origin
🇨🇳 China

Repos / Data Links

Page Count
25 pages

Category
Computer Science: Information Retrieval