SKILL-RAG: Self-Knowledge Induced Learning and Filtering for Retrieval-Augmented Generation
By: Tomoaki Isoda
Potential Business Impact:
Helps computers answer questions better by knowing what's useful.
Retrieval-Augmented Generation (RAG) has significantly improved the performance of large language models (LLMs) on knowledge-intensive tasks in recent years. However, retrieval systems may return irrelevant content, and incorporating such content into the model's input often leads to hallucinations. Identifying and filtering out unhelpful retrieved content is therefore a key challenge for improving RAG performance. To better integrate the model's internal knowledge with external knowledge from retrieval, it is essential to understand what the model "knows" and "does not know" (also called "self-knowledge"). Based on this insight, we propose SKILL-RAG (Self-Knowledge Induced Learning and Filtering for RAG), a novel method that leverages the model's self-knowledge to determine which retrieved documents are beneficial for answering a given query. We design a reinforcement learning-based training framework to explicitly elicit self-knowledge from the model, and we filter at sentence-level granularity to remove irrelevant content while preserving useful knowledge. We evaluate SKILL-RAG using Llama2-7B and Qwen3-8B on several question answering benchmarks. Experimental results demonstrate that SKILL-RAG not only improves generation quality but also significantly reduces the number of input documents, validating the importance of self-knowledge in guiding the selection of high-quality retrievals.
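As a rough illustration of what sentence-level filtering of retrieved content can look like, the sketch below splits each retrieved document into sentences and keeps only those that a scoring function rates at or above a threshold. This is not the SKILL-RAG implementation: the function names, the threshold, and the word-overlap scorer are all hypothetical, and the scorer is only a toy stand-in for the learned self-knowledge signal the abstract describes.

```python
import re
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter (assumption): break on whitespace that follows ., ! or ?
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def overlap_score(query: str, sentence: str) -> float:
    # Toy stand-in for a learned usefulness score (hypothetical):
    # the fraction of query words that also appear in the sentence.
    q = set(re.findall(r"\w+", query.lower()))
    s = set(re.findall(r"\w+", sentence.lower()))
    return len(q & s) / len(q) if q else 0.0


def filter_sentences(
    query: str,
    documents: List[str],
    score: Callable[[str, str], float] = overlap_score,
    threshold: float = 0.5,  # hypothetical cutoff, not taken from the paper
) -> List[str]:
    # Keep only the sentences whose score reaches the threshold,
    # preserving their original order across documents.
    kept = []
    for doc in documents:
        for sent in split_sentences(doc):
            if score(query, sent) >= threshold:
                kept.append(sent)
    return kept


if __name__ == "__main__":
    docs = [
        "Paris is the capital of France. Bananas are yellow.",
        "The Eiffel Tower is in Paris.",
    ]
    print(filter_sentences("capital of France", docs))
```

In a real pipeline the `score` argument would be replaced by a model-based judgment of whether a sentence helps answer the query; the surrounding loop would stay the same.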
Similar Papers
Careful Queries, Credible Results: Teaching RAG Models Advanced Web Search Tools with Reinforcement Learning
Information Retrieval
Filters bad web info for smarter AI answers.
Self-Routing RAG: Binding Selective Retrieval with Knowledge Verbalization
Computation and Language
Helps AI answer questions better, faster, and smarter.
Human Cognition Inspired RAG with Knowledge Graph for Complex Problem Solving
Machine Learning (CS)
Helps computers solve hard problems by thinking step-by-step.