GINGER: Grounded Information Nugget-Based Generation of Responses
By: Weronika Łajewska, Krisztian Balog
Potential Business Impact:
Makes AI answers more truthful and shows where they came from.
Retrieval-augmented generation (RAG) faces challenges related to factual correctness, source attribution, and response completeness. To address them, we propose a modular pipeline for grounded response generation that operates on information nuggets: minimal, atomic units of relevant information extracted from retrieved documents. The multistage pipeline encompasses nugget detection, clustering, ranking, top cluster summarization, and fluency enhancement. It guarantees grounding in specific facts, facilitates source attribution, and ensures maximum information inclusion within length constraints. Extensive experiments on the TREC RAG'24 dataset evaluated with the AutoNuggetizer framework demonstrate that GINGER achieves state-of-the-art performance on this benchmark.
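To make the stage order concrete, the pipeline described above can be sketched as a toy in Python. This is only an illustration under loud assumptions: every name here (Nugget, detect_nuggets, cluster_nuggets, rank_clusters, summarize_cluster, generate_response) is hypothetical, and simple word-overlap heuristics stand in for whatever models the authors actually use at each stage. It is not the GINGER implementation.

```python
import re
from dataclasses import dataclass


@dataclass
class Nugget:
    text: str
    doc_id: str  # source document, kept so the response can cite it


def tokens(text):
    """Lowercased word set used by the toy similarity measures below."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Word-overlap similarity; an assumed stand-in for a learned measure."""
    ta, tb = tokens(a), tokens(b)
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0


def detect_nuggets(query, docs):
    """Toy detection: keep sentences that share at least one query term."""
    q = tokens(query)
    nuggets = []
    for doc_id, text in docs.items():
        for sent in re.split(r"(?<=[.!?])\s+", text.strip()):
            if sent and tokens(sent) & q:
                nuggets.append(Nugget(sent, doc_id))
    return nuggets


def cluster_nuggets(nuggets, threshold=0.5):
    """Greedy clustering: join the first cluster whose seed is similar enough."""
    clusters = []
    for n in nuggets:
        for c in clusters:
            if jaccard(n.text, c[0].text) >= threshold:
                c.append(n)
                break
        else:
            clusters.append([n])
    return clusters


def rank_clusters(query, clusters):
    """Rank by number of supporting nuggets, then by seed similarity to the query."""
    return sorted(
        clusters,
        key=lambda c: (len(c), jaccard(query, c[0].text)),
        reverse=True,
    )


def summarize_cluster(cluster):
    """Stand-in for summarization: shortest nugget, plus all contributing sources."""
    text = min(cluster, key=lambda n: len(n.text)).text
    sources = sorted({n.doc_id for n in cluster})
    return text, sources


def generate_response(query, docs, max_sentences=2):
    """Run all stages; max_sentences plays the role of the length constraint."""
    nuggets = detect_nuggets(query, docs)
    ranked = rank_clusters(query, cluster_nuggets(nuggets))
    parts = []
    for cluster in ranked[:max_sentences]:
        text, sources = summarize_cluster(cluster)
        parts.append(f"{text} [{', '.join(sources)}]")
    # Fluency enhancement would rewrite the joined text; here it is a no-op.
    return " ".join(parts)
```

Because each output sentence is produced from exactly one cluster, the document ids of that cluster's nuggets can be attached directly as citations, which is the property the abstract refers to as facilitating source attribution. Taking only the top-ranked clusters is one simple way to favor the most supported information under a length limit.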
Similar Papers
Test-time Corpus Feedback: From Retrieval to RAG
Information Retrieval
Lets computers ask better questions to find answers.
A Survey on Knowledge-Oriented Retrieval-Augmented Generation
Computation and Language
Lets computers use outside facts to answer questions.