One Pic is All it Takes: Poisoning Visual Document Retrieval Augmented Generation with a Single Image
By: Ezzeldin Shereen, Dan Ristea, Shae McFadden and more
Potential Business Impact:
Makes AI lie by tricking its memory.
Multi-modal retrieval augmented generation (M-RAG) is instrumental for inhibiting hallucinations in large multi-modal models (LMMs) through the use of a factual knowledge base (KB). However, M-RAG introduces new attack vectors for adversaries that aim to disrupt the system by injecting malicious entries into the KB. In this paper, we present the first poisoning attack against M-RAG targeting visual document retrieval applications, where the KB contains images of document pages. We propose two attacks, each of which requires injecting only a single adversarial image into the KB. First, we propose a universal attack that, for any potential user query, influences the response to cause a denial-of-service (DoS) in the M-RAG system. Second, we present a targeted attack against one or a group of user queries, with the goal of spreading targeted misinformation. For both attacks, we use a multi-objective gradient-based adversarial approach to craft the injected image while optimizing for both retrieval and generation. We evaluate our attacks on several visual document retrieval datasets and a diverse set of state-of-the-art retrievers (embedding models) and generators (LMMs), demonstrating the attack effectiveness in both the universal and targeted settings. We additionally present results including commonly used defenses, various attack hyper-parameter settings, ablations, and attack transferability.
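To make the idea of a multi-objective gradient-based approach concrete, the following is a minimal toy sketch, not the authors' implementation. It assumes tiny linear maps as stand-ins for the retriever and the generator, a weighted sum of a retrieval loss and a generation loss, and a sign-gradient update projected onto a small perturbation budget. Every name and number here (`craft_poison_image`, `W`, `G`, `eps`, `w_ret`, `w_gen`) is hypothetical and chosen only for illustration; the abstract does not specify the losses, weights, or optimizer details.

```python
# Toy sketch only: linear maps stand in for the retriever and the generator.
# All names, shapes, and numbers are hypothetical, not taken from the paper.

def matvec(m, v):
    """Matrix-vector product m @ v for lists of lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def transpose_matvec(m, v):
    """Computes m^T @ v without building the transpose."""
    return [sum(m[i][j] * v[i] for i in range(len(m))) for j in range(len(m[0]))]

def retrieval_loss_and_grad(x, W, queries):
    """Negative dot product between the image embedding W @ x and the mean query."""
    dim = len(queries[0])
    q_mean = [sum(q[k] for q in queries) / len(queries) for k in range(dim)]
    loss = -sum(a * b for a, b in zip(q_mean, matvec(W, x)))
    grad = [-g for g in transpose_matvec(W, q_mean)]
    return loss, grad

def generation_loss_and_grad(x, G, target):
    """Squared error between the toy generator output G @ x and the target."""
    residual = [o - t for o, t in zip(matvec(G, x), target)]
    loss = sum(r * r for r in residual)
    grad = [2.0 * g for g in transpose_matvec(G, residual)]
    return loss, grad

def total_loss_and_grad(x, W, queries, G, target, w_ret=1.0, w_gen=1.0):
    """Weighted sum of the two objectives and of their gradients."""
    l_ret, g_ret = retrieval_loss_and_grad(x, W, queries)
    l_gen, g_gen = generation_loss_and_grad(x, G, target)
    loss = w_ret * l_ret + w_gen * l_gen
    grad = [w_ret * a + w_gen * b for a, b in zip(g_ret, g_gen)]
    return loss, grad

def sign(v):
    return (v > 0) - (v < 0)

def craft_poison_image(x0, W, queries, G, target, eps=0.1, step=0.02, n_steps=50):
    """Projected sign-gradient descent on the combined loss.

    The result stays within eps of the starting image x0 (per pixel)
    and within the valid pixel range [0, 1].
    """
    x = list(x0)
    for _ in range(n_steps):
        _, grad = total_loss_and_grad(x, W, queries, G, target)
        x = [xi - step * sign(gi) for xi, gi in zip(x, grad)]
        x = [min(max(xi, max(0.0, ci - eps)), min(1.0, ci + eps))
             for xi, ci in zip(x, x0)]
    return x

if __name__ == "__main__":
    W = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]   # toy "retriever"
    G = [[0.5, 0.5, 0.0, 0.0], [0.0, 0.0, 0.5, 0.5]]     # toy "generator"
    queries = [[1.0, 0.0], [0.6, 0.8]]                   # toy query embeddings
    target = [1.0, 0.0]                                  # toy target output
    x0 = [0.5, 0.5, 0.5, 0.5]                            # starting "image"
    x_adv = craft_poison_image(x0, W, queries, G, target)
    print("loss before:", total_loss_and_grad(x0, W, queries, G, target)[0])
    print("loss after: ", total_loss_and_grad(x_adv, W, queries, G, target)[0])
```

In a real system the two terms would come from an embedding model and an LMM, with gradients obtained by automatic differentiation. The sketch only illustrates how a single input can be pushed toward both objectives at once under a bounded perturbation.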
Similar Papers
Poisoned-MRAG: Knowledge Poisoning Attacks to Multimodal Retrieval Augmented Generation
Cryptography and Security
Makes AI show wrong answers by tricking its memory.
Practical Poisoning Attacks against Retrieval-Augmented Generation
Cryptography and Security
Makes AI smarter and harder to trick.
HV-Attack: Hierarchical Visual Attack for Multimodal Retrieval Augmented Generation
CV and Pattern Recognition
Tricks AI into giving wrong answers with hidden image changes.