RAG LLMs are Not Safer: A Safety Analysis of Retrieval-Augmented Generation for Large Language Models
By: Bang An, Shiyue Zhang, Mark Dredze
Potential Business Impact:
Shows that AI systems that pull in outside documents (RAG) can be less safe than expected.
Efforts to ensure the safety of large language models (LLMs) include safety fine-tuning, evaluation, and red teaming. However, despite the widespread use of the Retrieval-Augmented Generation (RAG) framework, most AI safety work focuses on standard (non-RAG) LLMs, so we know little about how RAG use cases change a model's safety profile. We conduct a detailed comparative analysis of RAG and non-RAG frameworks with eleven LLMs. We find that RAG can make models less safe and change their safety profile. We explore the causes of this change and find that even combinations of safe models with safe documents can produce unsafe generations. In addition, we evaluate several existing red-teaming methods in the RAG setting and show that they are less effective than in the non-RAG setting. Our work highlights the need for safety research and red-teaming methods specifically tailored for RAG LLMs.
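The abstract describes a comparative setup: the same query is sent to a model with and without retrieved documents, and the two generations are judged for safety. The sketch below is a minimal illustration of that kind of harness, not the paper's evaluation code; `call_llm` and `judge_safety` are hypothetical placeholders that would be replaced by a real model API and a real safety classifier.

```python
# Minimal sketch of a RAG vs. non-RAG safety comparison harness.
# NOTE: call_llm and judge_safety are hypothetical placeholders, not the
# paper's implementation; swap in a real model API and safety judge.

from dataclasses import dataclass


@dataclass
class Example:
    query: str            # potentially unsafe user query
    documents: list[str]  # retrieved passages (may all be benign)


def build_prompt(example: Example, use_rag: bool) -> str:
    """Format the query alone (non-RAG) or with retrieved context (RAG)."""
    if not use_rag:
        return example.query
    context = "\n\n".join(example.documents)
    return (
        "Answer using the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {example.query}"
    )


def call_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (an API or a local model)."""
    return "<model response>"


def judge_safety(response: str) -> bool:
    """Placeholder safety judge; a real setup would use a trained classifier."""
    return True


def compare(examples: list[Example]) -> dict[str, float]:
    """Return the fraction of safe responses with and without retrieval."""
    safe_counts = {"non_rag": 0, "rag": 0}
    for ex in examples:
        for mode, use_rag in (("non_rag", False), ("rag", True)):
            response = call_llm(build_prompt(ex, use_rag))
            safe_counts[mode] += judge_safety(response)
    n = len(examples)
    return {mode: count / n for mode, count in safe_counts.items()}
```

Comparing the two fractions per model is one simple way to see whether adding retrieved context shifts a model's safety behavior, which is the kind of difference the paper measures across eleven LLMs.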
Similar Papers
Retrieval Augmented Generation Evaluation in the Era of Large Language Models: A Comprehensive Survey
Computation and Language
Tests how AI uses outside facts to answer questions.
Retrieval Augmented Generation Evaluation for Health Documents
Information Retrieval
Helps doctors find important health info faster.
Adapting Large Language Models to Emerging Cybersecurity using Retrieval Augmented Generation
Cryptography and Security
Helps computers spot new cyber threats faster.