LANCER: LLM Reranking for Nugget Coverage
By: Jia-Huei Ju, François G. Landry, Eugene Yang, and more
Potential Business Impact:
Helps computers write longer, more complete reports.
Unlike short-form retrieval-augmented generation (RAG), such as factoid question answering, long-form RAG requires retrieval to provide documents covering a wide range of relevant information. Automated report generation exemplifies this setting: it requires not only relevant information but also a more elaborate response that covers that information comprehensively. Yet, existing retrieval methods are primarily optimized for relevance ranking rather than information coverage. To address this limitation, we propose LANCER, an LLM-based reranking method for nugget coverage. LANCER predicts what sub-questions should be answered to satisfy an information need, predicts which documents answer these sub-questions, and reranks documents to provide a ranked list covering as many information nuggets as possible. Our empirical results show that LANCER improves retrieval quality as measured by nugget coverage metrics, achieving higher $\alpha$-nDCG and information coverage than other LLM-based reranking methods. Our oracle analysis further reveals that sub-question generation plays an essential role.
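The abstract describes a three-step pipeline: generate sub-questions for the information need, predict which documents answer which sub-questions, and rerank for coverage. The sketch below is a minimal illustration of that pipeline under stated assumptions: the function names `generate_subquestions` and `answers_subquestion` are hypothetical placeholders for LLM calls, and the greedy coverage-maximizing loop is an assumed simplification, not the paper's exact reranking algorithm.

```python
# Illustrative sketch of a coverage-oriented reranker in the spirit of LANCER.
# The LLM calls are represented by placeholder callables; the greedy loop
# that maximizes sub-question coverage is an assumption for illustration.

from typing import Callable, Dict, List, Set


def coverage_rerank(
    query: str,
    docs: List[str],
    generate_subquestions: Callable[[str], List[str]],   # hypothetical LLM call
    answers_subquestion: Callable[[str, str], bool],      # hypothetical LLM judgment
) -> List[str]:
    """Rerank docs so early ranks cover as many sub-questions as possible."""
    # Step 1: decompose the information need into sub-questions (nuggets).
    subqs = generate_subquestions(query)

    # Step 2: for each document, predict which sub-questions it answers.
    doc_coverage: Dict[str, Set[int]] = {
        doc: {i for i, sq in enumerate(subqs) if answers_subquestion(doc, sq)}
        for doc in docs
    }

    # Step 3: greedy selection -- at each step pick the document that adds
    # the most still-uncovered sub-questions (ties broken by original order).
    covered: Set[int] = set()
    remaining = list(docs)
    ranked: List[str] = []
    while remaining:
        best = max(remaining, key=lambda d: len(doc_coverage[d] - covered))
        ranked.append(best)
        covered |= doc_coverage[best]
        remaining.remove(best)
    return ranked
```

A greedy set-cover style loop like this rewards diversity at the top of the ranking, which is what coverage metrics such as $\alpha$-nDCG measure; how LANCER actually combines the sub-question predictions into a ranking is specified in the paper itself.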
Similar Papers
LURE-RAG: Lightweight Utility-driven Reranking for Efficient RAG
Information Retrieval
Helps AI answer questions better by ordering information.
SciRerankBench: Benchmarking Rerankers Towards Scientific Retrieval-Augmented Generated LLMs
Computation and Language
Helps computers find correct science answers faster.
Cooperative Retrieval-Augmented Generation for Question Answering: Mutual Information Exchange and Ranking by Contrasting Layers
Computation and Language
Helps computers answer questions more truthfully.