Benchmarking Poisoning Attacks against Retrieval-Augmented Generation
By: Baolei Zhang , Haoran Xin , Jiatong Li and more
Potential Business Impact:
Protects smart AI from being tricked.
Retrieval-Augmented Generation (RAG) has proven effective in mitigating hallucinations in large language models by incorporating external knowledge during inference. However, this integration introduces new security vulnerabilities, particularly to poisoning attacks. Although prior work has explored various poisoning strategies, a thorough assessment of their practical threat to RAG systems remains missing. To address this gap, we propose the first comprehensive benchmark framework for evaluating poisoning attacks on RAG. Our benchmark covers 5 standard question answering (QA) datasets and 10 expanded variants, along with 13 poisoning attack methods and 7 defense mechanisms, representing a broad spectrum of existing techniques. Using this benchmark, we conduct a comprehensive evaluation of all included attacks and defenses across the full dataset spectrum. Our findings show that while existing attacks perform well on standard QA datasets, their effectiveness drops significantly on the expanded versions. Moreover, our results demonstrate that various advanced RAG architectures, such as sequential, branching, conditional, and loop RAG, as well as multi-turn conversational RAG, multimodal RAG systems, and RAG-based LLM agent systems, remain susceptible to poisoning attacks. Notably, current defense techniques fail to provide robust protection, underscoring the pressing need for more resilient and generalizable defense strategies.
Similar Papers
RAG Safety: Exploring Knowledge Poisoning Attacks to Retrieval-Augmented Generation
Cryptography and Security
Makes smart computers tricked by bad information.
Practical Poisoning Attacks against Retrieval-Augmented Generation
Cryptography and Security
Makes AI smarter and harder to trick.
CPA-RAG:Covert Poisoning Attacks on Retrieval-Augmented Generation in Large Language Models
Cryptography and Security
Makes smart AI answer questions wrongly on purpose.