DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation

Published: June 2, 2025 | arXiv ID: 2506.01954v1

By: Jennifer Chen, Aidar Myrzakhan, Yaxin Luo and more

Potential Business Impact:

Makes small AI smarter and more truthful.

Business Areas:

DRM Content and Publishing, Media and Entertainment, Privacy and Security

Retrieval-Augmented Generation (RAG) methods have proven highly effective for tasks requiring factual consistency and robust knowledge retrieval. However, large-scale RAG systems consume significant computational resources and are prone to generating hallucinated content. In this work, we introduce $\texttt{DRAG}$, a novel framework for distilling RAG knowledge from large language models (LLMs) into small language models (SLMs). Our approach leverages evidence- and knowledge graph-based distillation, ensuring that the distilled model retains critical factual knowledge while significantly reducing model size and computational cost. By aligning the smaller model's predictions with a structured knowledge graph and ranked evidence, $\texttt{DRAG}$ effectively mitigates hallucinations and improves factual accuracy. We further present a case demonstrating how our framework mitigates user privacy risks and introduce a corresponding benchmark. Experimental evaluations on multiple benchmarks demonstrate that our method outperforms prior competitive RAG methods for SLMs, such as MiniRAG, by up to 27.7% using the same models, while preserving a high level of efficiency and reliability. With $\texttt{DRAG}$, we provide a practical and resource-efficient roadmap for deploying enhanced retrieval and generation capabilities in small language models.
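The abstract does not spell out the pipeline, so the following is only a minimal sketch of what "ranked evidence plus a knowledge graph" could look like as context for a small model. Everything in it is an assumption made for illustration: the `Evidence` class, the relevance scores, the triple format, and the prompt layout are hypothetical and are not taken from the paper or its code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One candidate evidence snippet (hypothetical structure)."""
    text: str
    score: float  # assumed relevance score, e.g. assigned by a larger model


# A knowledge-graph edge as (head, relation, tail); this format is assumed.
Triple = tuple[str, str, str]


def top_k_evidence(candidates: list[Evidence], k: int) -> list[Evidence]:
    """Keep the k highest-scoring snippets, best first."""
    return sorted(candidates, key=lambda e: e.score, reverse=True)[:k]


def build_slm_prompt(question: str,
                     evidence: list[Evidence],
                     triples: list[Triple]) -> str:
    """Assemble ranked evidence and graph triples into one prompt string."""
    lines = ["Answer using only the context below.", "", "Evidence:"]
    lines += [f"{i}. {e.text}" for i, e in enumerate(evidence, start=1)]
    lines += ["", "Knowledge graph:"]
    lines += [f"({h}, {r}, {t})" for h, r, t in triples]
    lines += ["", f"Question: {question}", "Answer:"]
    return "\n".join(lines)
```

In a setup like this, the larger model would do the expensive work of producing and scoring the evidence and triples, and the small model would only see the compact prompt returned by `build_slm_prompt`.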

Distributed Retrieval-Augmented Generation

Distributed, Parallel, and Cluster Computing

Shares private health info safely for better AI.

1 May 2025 1

92%

Hyper-RAG: Combating LLM Hallucinations using Hypergraph-Driven Retrieval-Augmented Generation

Information Retrieval

Makes AI doctors more truthful and accurate.

30 Mar 2025 0

92%

Improving Multilingual Retrieval-Augmented Language Models through Dialectic Reasoning Argumentations

Computation and Language

Helps computers understand different facts better.

7 Apr 2025 1

Repos / Data Links

github.com

Page Count

21 pages
