Towards Global Retrieval Augmented Generation: A Benchmark for Corpus-Level Reasoning
By: Qi Luo, Xiaonan Li, Tingshuo Fan and more
Potential Business Impact:
Helps computers find answers across many documents.
Retrieval-augmented generation (RAG) has emerged as a leading approach to reducing hallucinations in large language models (LLMs). Current RAG evaluation benchmarks primarily focus on what we call local RAG: retrieving relevant chunks from a small subset of documents to answer queries that require only localized understanding within specific text chunks. However, many real-world applications require a fundamentally different capability, global RAG, which involves aggregating and analyzing information across entire document collections to derive corpus-level insights (for example, "What are the top 10 most cited papers in 2023?"). In this paper, we introduce GlobalQA, the first benchmark specifically designed to evaluate global RAG capabilities, covering four core task types: counting, extremum queries, sorting, and top-k extraction. Through systematic evaluation across different models and baselines, we find that existing RAG methods perform poorly on global tasks, with the strongest baseline achieving an F1 score of only 1.51. To address these challenges, we propose GlobalRAG, a multi-tool collaborative framework that preserves structural coherence through chunk-level retrieval, incorporates LLM-driven intelligent filters to eliminate noisy documents, and integrates aggregation modules for precise symbolic computation. On the Qwen2.5-14B model, GlobalRAG achieves an F1 score of 6.63, compared with 1.51 for the strongest baseline, validating the effectiveness of our method.
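To make the four task types concrete, the following is a minimal sketch of symbolic aggregation over a filtered document set. It is not the paper's implementation: the record fields (`title`, `year`, `citations`), the helper names, and the sample data are hypothetical, and a plain Python predicate stands in for the LLM-driven filter the abstract mentions.

```python
# Illustrative only: each dict stands for facts already extracted from one document.
# Field names and values are made up for this sketch.

def keep(records, is_relevant):
    """Filter step: drop records the predicate rejects (stand-in for an LLM filter)."""
    return [r for r in records if is_relevant(r)]

def count(records):
    """Counting: how many documents survive the filter."""
    return len(records)

def extremum(records, key, largest=True):
    """Extremum query: the record with the largest (or smallest) value of `key`."""
    pick = max if largest else min
    return pick(records, key=lambda r: r[key])

def sort_by(records, key, descending=True):
    """Sorting: order all records by `key`."""
    return sorted(records, key=lambda r: r[key], reverse=descending)

def top_k(records, key, k):
    """Top-k extraction: the k records with the largest values of `key`."""
    return sort_by(records, key)[:k]

papers = [
    {"title": "A", "year": 2023, "citations": 120},
    {"title": "B", "year": 2022, "citations": 300},
    {"title": "C", "year": 2023, "citations": 45},
    {"title": "D", "year": 2023, "citations": 210},
]

# Corpus-level question in the style of the abstract's example:
# "What are the top 2 most cited papers in 2023?"
from_2023 = keep(papers, lambda r: r["year"] == 2023)
most_cited = [r["title"] for r in top_k(from_2023, "citations", 2)]
```

The point of the sketch is that once relevant documents are isolated, the answer comes from exact computation over the whole filtered set rather than from a language model reading a handful of retrieved chunks.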
Similar Papers
Structured RAG for Answering Aggregative Questions
Computation and Language
Helps computers answer questions using many documents.
GlobalRAG: Enhancing Global Reasoning in Multi-hop Question Answering via Reinforcement Learning
Computation and Language
Helps computers answer hard questions by planning steps.
KERAG: Knowledge-Enhanced Retrieval-Augmented Generation for Advanced Question Answering
Computation and Language
Helps AI answer questions more accurately using more facts.